HK40100824A - Codon-optimized RPGRorf15 genes and uses thereof

Info

Publication number: HK40100824A
Application number: HK62024088859.9A
Authority: HK
Inventors: David H. KIRN; Melissa A. KOTTERMAN; David Schaffer; Peter Francis
Original assignee: 4D Molecular Therapeutics Inc.
Priority date: 2020-09-02
Filing date: 2021-08-30
Publication date: 2024-05-03

Description

Codon-Optimized RPGRorf15 Genes and Uses Thereof

Cross-Reference to Related Applications

This application claims the benefit of U.S. Provisional Patent Application Serial No. 63/073,843, filed September 2, 2020, the entire disclosure of which is incorporated herein by reference.

Sequence Listing Submitted via EFS-Web

A computer-readable text file entitled “090400-5012-WO-Sequence-Listing”, created on or about August 11, 2021, with a file size of about 37 KB, contains the sequence listing for this application and is hereby incorporated by reference in its entirety.

Background of the Invention

X-linked retinitis pigmentosa (XLRP) is a relatively severe and genetically heterogeneous inherited retinal degeneration. Approximately 70% of XLRP cases are caused by mutations in the retinitis pigmentosa GTPase regulator (RPGR) gene. The RPGR gene encodes several distinct alternatively spliced transcripts that are widely expressed. The function of the encoded protein is not well understood, but studies suggest that it plays an important role in the cell structures called cilia.

One RPGR isoform contains a unique 3' region called ORF15, a 567-amino-acid carboxyl-terminal domain rich in Gly and Glu. This version of the RPGR protein, which contains exons 1-13 and the ORF15 region of the RPGR gene, is expressed primarily in the photoreceptors of the retina. Mutations in the ORF15 region of RPGR account for approximately 60% of all XLRP cases.

Several preclinical studies support the use of wild-type RPGRorf15 cDNA to rescue the XLRP disease phenotype. However, the poor sequence stability of the wild-type sequence makes it difficult to maintain sequence integrity during vector production, and the suboptimal expression level of the wild-type sequence in human photoreceptors is an obstacle to gene therapy approaches for treating XLRP.

Summary of the Invention

Disclosed are codon-optimized nucleic acid molecules encoding the human retinitis pigmentosa GTPase regulator (RPGR) protein. In one aspect, the disclosure provides a nucleic acid comprising the nucleotide sequence of SEQ ID NO:1, or a nucleic acid comprising a nucleotide sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to the nucleotide sequence of SEQ ID NO:1, wherein the nucleic acid encodes a human RPGR polypeptide having the amino acid sequence of SEQ ID NO:2. In some embodiments, a nucleic acid comprising or consisting of the nucleotide sequence of SEQ ID NO:1 is provided. In related embodiments, the nucleic acid is expressed at a higher level than a wild-type RPGR nucleic acid sequence (e.g., SEQ ID NO:3) is expressed in an otherwise identical cell.
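The requirement that a recoded nucleic acid still encodes the same polypeptide can be illustrated with a minimal Python sketch. It assumes the standard genetic code and uses two short hypothetical sequences; neither is taken from SEQ ID NO:1 or SEQ ID NO:3, and the function names are illustrative only.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an in-frame DNA coding sequence; '*' marks a stop codon."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def encodes_same_polypeptide(seq_a: str, seq_b: str) -> bool:
    """True when both coding sequences translate to the same amino acid string."""
    return translate(seq_a) == translate(seq_b)


# Hypothetical toy sequences: same protein, different synonymous codons.
reference_like = "ATGGAAGGAGAATAA"
recoded = "ATGGAGGGCGAGTGA"
print(translate(reference_like), encodes_same_polypeptide(reference_like, recoded))
```

A check of this kind only confirms that synonymous substitutions preserved the protein; it says nothing about expression level, which the disclosure assesses experimentally.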

In some aspects, a codon-optimized nucleic acid molecule as described herein has a human codon adaptation index that is increased relative to that of the wild-type RPGR cDNA (GenBank Accession No. NM_001034853; SEQ ID NO:3). In some embodiments, the codon-optimized nucleic acid molecule has a human codon adaptation index of at least about 0.85, at least about 0.88, or at least about 0.89.
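As a rough illustration of how a codon adaptation index might be computed, the following Python sketch follows the general Sharp and Li approach: each codon receives a weight equal to its frequency divided by the frequency of the most used synonymous codon, and the index is the geometric mean of those weights. The codon frequencies below are hypothetical placeholders, not a real human codon usage table, and the function names are illustrative assumptions.

```python
import math


def relative_adaptiveness(codon_freq, synonymous):
    """Weight each codon by its frequency relative to the most frequent synonym."""
    weights = {}
    for codons in synonymous.values():
        top = max(codon_freq[c] for c in codons)
        for c in codons:
            weights[c] = codon_freq[c] / top
    return weights


def cai(cds, weights):
    """Geometric mean of codon weights; codons without a weight are skipped."""
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    scored = [weights[c] for c in codons if c in weights]
    if not scored:
        raise ValueError("no scorable codons in sequence")
    # Summing logs avoids underflow on long sequences; weights must be > 0.
    return math.exp(sum(math.log(w) for w in scored) / len(scored))


# Hypothetical frequencies for Glu and Gly codons only (illustrative values).
synonymous = {"E": ["GAA", "GAG"], "G": ["GGA", "GGC", "GGG", "GGT"]}
codon_freq = {"GAA": 0.4, "GAG": 0.6, "GGA": 0.25, "GGC": 0.35, "GGG": 0.25, "GGT": 0.15}

weights = relative_adaptiveness(codon_freq, synonymous)
print(cai("GAGGGC", weights), cai("GAAGGT", weights))
```

Skipping unweighted codons mirrors the common practice of leaving out single-codon amino acids and stop codons; a real calculation would need a complete reference table and a policy for zero-frequency codons.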

In certain embodiments, the nucleic acid contains a higher percentage of G/C nucleotides than SEQ ID NO:3. In other embodiments, the nucleic acid contains a G/C nucleotide percentage of at most about 59%, at most about 58%, or at most about 57%. In some aspects, the average G/C content of the nucleic acid is from about 55% to about 59%, or from about 56% to about 58%. In some preferred embodiments, the average G/C content is about 57%.
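The G/C percentages above can be checked with a simple count. The sketch below is a minimal Python illustration; the function names and the synthetic test sequence are assumptions made for the example, and the default window of 55% to 59% is taken from the range stated above.

```python
def gc_percent(seq: str) -> float:
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def within_gc_window(seq: str, low: float = 55.0, high: float = 59.0) -> bool:
    """Check whether overall G/C content falls inside an inclusive window."""
    return low <= gc_percent(seq) <= high


# Synthetic sequence built to be 57% G/C (not derived from SEQ ID NO:1).
synthetic = "GC" * 57 + "AT" * 43
print(gc_percent(synthetic), within_gc_window(synthetic))
```

This computes a single whole-sequence value; an "average" G/C content could also be assessed over sliding windows, which this section does not specify.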

In other embodiments, the nucleic acid comprises one or more parameters optimized relative to SEQ ID NO:3, selected from removal of negative cis-acting sites (including, but not limited to, TATA boxes and splice sites) and an increase in the frequency of optimal codons.

In another embodiment, the nucleic acid is operably linked to at least one transcription control sequence, preferably a transcription control sequence that is heterologous to the nucleic acid. In some aspects, the transcription control sequence is a cell- or tissue-specific promoter that results in cell-specific expression of the nucleic acid, for example in photoreceptor cells, such as the human rod photoreceptor-specific human G protein-coupled receptor rhodopsin kinase 1 (hGRK) promoter or the human interphotoreceptor retinoid-binding protein (IRBP) promoter. In a preferred embodiment, the transcription control sequence comprises the human rod photoreceptor-specific human G protein-coupled receptor rhodopsin kinase 1 (hGRK) promoter. In other aspects, the transcription control sequence is a constitutive promoter that results in similar levels of expression of the nucleic acid in many cell types (e.g., a CAG, CBA, CMV, or PGK promoter). In a preferred embodiment, the transcription control sequence comprises the human G protein-coupled receptor kinase (hGRK, also known as rhodopsin kinase) promoter described in Young et al., Investigative Ophthalmology and Visual Science, 44(9):4076-4085 (2003). In a particularly preferred embodiment, the hGRK promoter comprises the sequence of SEQ ID NO:4 or a sequence at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto:

GGGCCCCAGAAGCCTGGTGGTTGTTTGTCCTTCTCAGGGGAAAAGTGAGGCGGCCCCT

TGGAGGAAGGGGCCGGGCAGAATGATCTAATCGGATTCCAAGCAGCTCAGGGGATTGTC

TTTTTCTAGCACCTTCTTGCCACTCCTAAGCGTCCTCCGTGACCCCGGCTGGGATTTAGCCTGGTGCTGTGTCAGCCCCGGG (SEQ ID NO:4).

In a related embodiment, provided herein is an expression cassette comprising a nucleic acid operably linked to an expression control sequence, the nucleic acid comprising the nucleotide sequence of SEQ ID NO:1 or a nucleotide sequence at least 90% identical thereto.

In a related embodiment, provided herein is a vector comprising a nucleic acid that comprises the nucleotide sequence of SEQ ID NO:1 or a nucleotide sequence at least 90% identical thereto. In a preferred embodiment, the vector is a recombinant adeno-associated virus (rAAV) expression vector. In some embodiments, the rAAV vector comprises a native capsid (e.g., a capsid of AAV serotype 2, AAV serotype 5, or AAV serotype 8). In other embodiments, the rAAV vector comprises a capsid that is modified relative to a native AAV capsid, for example one comprising one or more peptide insertions, one or more amino acid substitutions (e.g., tyrosine to phenylalanine), and/or amino acid insertions or deletions (e.g., a capsid comprising one or more modifications relative to an AAV capsid of serotype 2, 5, or 8).

In another embodiment, provided herein is a host cell comprising a nucleic acid that comprises the nucleotide sequence of SEQ ID NO:1 or a nucleotide sequence at least 90% identical thereto. In some aspects, the host cell is a mammalian cell, including, but not limited to, a CHO cell, an HEK293 cell, a HeLa cell, a BHK21 cell, a Vero cell, or a V27 cell. In a related aspect, the host cell is selected from a CHO cell, an HEK293 cell, an HEK293T cell, a HeLa cell, a BHK21 cell, and a Vero cell. In other aspects, the host cell is a photoreceptor cell (e.g., a rod or a cone), a retinal ganglion cell (RGC), a glial cell (e.g., a Müller glial cell or a microglial cell), a bipolar cell, an amacrine cell, a horizontal cell, or a retinal pigment epithelium (RPE) cell. In a related embodiment, the disclosure provides a method of increasing expression of the polypeptide of SEQ ID NO:2, comprising culturing the host cell under conditions in which the nucleic acid molecule expresses the polypeptide of SEQ ID NO:2, wherein expression of the polypeptide is increased relative to a host cell that is cultured under the same conditions and comprises a reference nucleic acid containing the nucleotide sequence of SEQ ID NO:3 (the comparator sequence).

In another embodiment, the disclosure provides a method of increasing expression of the polypeptide of SEQ ID NO:2 in a human subject, comprising administering to the subject an isolated nucleic acid molecule comprising a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to the nucleotide sequence of SEQ ID NO:1 and that encodes a polypeptide having the amino acid sequence of SEQ ID NO:2, or a vector comprising such a nucleotide sequence, wherein expression of the polypeptide is increased relative to a reference nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:3 (the comparator sequence) or a vector comprising the reference nucleic acid molecule.
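The percent identity thresholds recited above depend on how identity is calculated, which this section does not specify. As one possible reading, the Python sketch below scores two sequences that have already been aligned (gaps written as "-") and divides matching columns by the number of aligned columns. The function name, the gap convention, and the choice of denominator are all assumptions for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical bases (gaps count as mismatches)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns where both sequences carry a gap; they hold no information.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True when identity is at or above the given percentage."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy alignment, not taken from the sequence listing.
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))
```

In practice the alignment itself would come from a dedicated alignment tool, and different tools report identity over different denominators, so values near a threshold should be interpreted with that in mind.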

In some embodiments, the disclosure provides a method of treating an ocular disorder associated with insufficient RPGR ORF15 activity in a human subject, comprising administering to the subject a nucleic acid molecule or vector disclosed herein. In some embodiments, the retinal disorder is X-linked retinitis pigmentosa.

Brief Description of the Drawings

Figure 1 shows gel electrophoresis of restriction digests of pAAV-GRK-cohRPGRorf15-SV40. Maxiprep DNA was digested with various enzymes and analyzed by agarose gel electrophoresis: lane 1 = 2-log ladder; lane 2 = BsrGI-H + BglII; lane 3 = Pml + Sph-HF; lane 4 = HindIII-HF + Sph-HF; lane 5 = Pst. The resulting restriction fragments matched the predicted fragments in all digests (lane 2: fragments of 3.9, 2.5, and 0.6 kb; lane 3: fragments of 3.7, 2.1, and 1.3 kb; lane 4: fragments of 3.9, 1.7, and 1.5 kb; lane 5: fragments of 4.6, 1.4, and 1.2 kb). The sizes of the prominent 2-log ladder bands, in kilobase pairs, are shown to the left of the gel.
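One quick consistency check on a restriction map of a circular plasmid is that the fragments from each complete digest should add up to roughly the same total length. The Python sketch below applies that check to the fragment sizes listed for Figure 1, which sum to about 7.0 to 7.2 kb per lane. The plasmid length itself is not stated in this section, so the sketch only compares the lanes with one another; the variable names are illustrative.

```python
# Predicted fragment sizes in kb, as listed in the Figure 1 description.
predicted_kb = {
    "lane 2": [3.9, 2.5, 0.6],
    "lane 3": [3.7, 2.1, 1.3],
    "lane 4": [3.9, 1.7, 1.5],
    "lane 5": [4.6, 1.4, 1.2],
}

# Sum each digest; rounding absorbs floating-point noise in the 0.1 kb figures.
totals = {lane: round(sum(sizes), 1) for lane, sizes in predicted_kb.items()}

# Spread between the largest and smallest total across digests.
spread = round(max(totals.values()) - min(totals.values()), 1)

for lane, total in totals.items():
    print(f"{lane}: {total} kb")
print(f"spread across digests: {spread} kb")
```

A spread of a few hundred base pairs is within what rounding each fragment to 0.1 kb would produce, so the listed sizes are mutually consistent.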

Figure 2 is a Western blot of cell lysates from HEK293T cells transfected with pAAV-GRK-cohRPGRorf15-SV40. Expression of human RPGRorf15 protein in HEK293 cells was assessed with the indicated primary antibodies (Sigma; CT-15; Polyglut GT335). For each antibody, lane 1 = untransfected control; lane 2 = pAAV-GRK-cohRPGRorf15-SV40; lane 3 = pAAV-PGK-cohRPGRorf15-SV40. Arrows indicate the hRPGRorf15 protein. Molecular weight markers (in kilodaltons) are shown on the left.

Figure 3 shows that transduction with recombinant AAV (rAAV) virions comprising the codon-optimized RPGRorf15 of SEQ ID NO:1 under the control of the hGRK1 promoter resulted in a robust increase in cohRPGRorf15 (SEQ ID NO:1) transcript levels in XLRP-iPSC-derived photoreceptor cells. Thirty days after transduction at an MOI of 50,000 with rAAV comprising pAAV-GRK-cohRPGRorf15-SV40 and a capsid of SEQ ID NO:9, droplet digital PCR was performed on RNA extracted from the XLRP-iPSC-derived photoreceptor cultures. hRPGR1-19 (internal control) and cohRPGRorf15 transcript levels were determined, quantified as copies/mL above a set threshold, and plotted on a logarithmic scale. After transduction, codon-optimized hRPGRorf15 (SEQ ID NO:1) transcript levels were statistically greater than hRPGR1-19 levels. NT = non-transduced; MOI = multiplicity of infection; hRPGR1-19 = human retinitis pigmentosa GTPase regulator exons 1-19, the constitutive isoform; cohRPGRorf15 = codon-optimized human retinitis pigmentosa GTPase regulator open reading frame 15, the retina-specific isoform, of SEQ ID NO:1. *p ≤ 0.05 compared with hRPGR1-19 at MOI 50,000 and p ≤ 0.05 compared with NT cohRPGRorf15. Error bars: ± standard deviation. n = 3 per patient. The Y axis is on a logarithmic scale.

Figure 4 shows that transduction with rAAV comprising the codon-optimized RPGRorf15 of SEQ ID NO:1 under the control of the hGRK1 promoter increased hRPGRorf15 protein levels in XLRP photoreceptor cultures. XLRP-iPSC-derived photoreceptor cultures were transduced at an MOI of 50,000, and protein lysates were harvested 30 days after transduction. SDS-PAGE and Western blotting showed an increase in hRPGRorf15 at 127 kDa (normalized to the loading control α-tubulin) compared with non-transduced (NT) cells for both patients. Band intensities were quantified and averaged across patients. Transduction with rAAV produced a significant increase in hRPGRorf15 protein. *p ≤ 0.05 compared with NT. Error bars: ± standard deviation. n = 3 per patient.

Figure 5 shows glutamylation of hRPGRorf15 in XLRP photoreceptor cultures after transduction with rAAV comprising the codon-optimized RPGRorf15 of SEQ ID NO:1 under the control of the hGRK promoter. XLRP-iPSC-derived photoreceptor cultures were transduced at an MOI of 50,000, and protein lysates were harvested 30 days after transduction. SDS-PAGE and Western blot analysis showed increased glutamylation of the 127 kDa hRPGRorf15 protein (normalized to the loading control α-tubulin) compared with the non-transduced (NT) control for both patients; band intensities were quantified and averaged across patients. Transduction with rAAV produced a significant increase in glutamylation of the hRPGRorf15 protein. GT335 = anti-glutamylation antibody; NT = non-transduced; MOI = multiplicity of infection; hRPGRorf15 = human retinitis pigmentosa GTPase regulator open reading frame 15, the retina-specific isoform. *p ≤ 0.05 compared with NT. Error bars: ± standard deviation. n = 3 per patient.

Figure 6 shows that a constitutive promoter drives an increase in hRPGRorf15 protein and glutamylation in XLRP photoreceptor cultures. XLRP-iPSC-derived photoreceptor cultures were transduced at MOIs of 5,000, 10,000, and 20,000 with rAAV comprising the codon-optimized RPGRorf15 of SEQ ID NO:1 under the control of the PGK promoter. Protein lysates were harvested 30 days after transduction. SDS-PAGE and Western blotting showed an increase in hRPGRorf15 and in glutamylation at 127 kDa (normalized to the loading control α-tubulin) compared with the non-transduced (NT) control for patient 78. Band intensities were quantified. Transduction produced a significant increase in hRPGRorf15 protein. NT = non-transduced; MOI = multiplicity of infection; hRPGRorf15 = human retinitis pigmentosa GTPase regulator open reading frame 15, the retina-specific isoform; GT335 = anti-glutamylation antibody. *p ≤ 0.05 compared with NT. Error bars: ± standard deviation. n = 3.

Figure 7 shows the codon-optimized sequence of SEQ ID NO:1 and the encoded amino acid sequence.

Figure 8 is a schematic of the transgene cassette contained in the rAAV described in the Examples below. The transgene cassette comprises a 5' AAV2 ITR, the human rhodopsin kinase (also called hGRK) promoter, the codon-optimized human RPGRorf15 cDNA of SEQ ID NO:1, a late SV40 polyadenylation signal, and a 3' AAV2 ITR, and has the nucleotide sequence of SEQ ID NO:5.

Figure 9 shows the safety of 4D-125 (comprising the transgene cassette shown in Figure 8 and the capsid protein of SEQ ID NO:9) through quantification of ocular inflammation, as assessed by aqueous flare, aqueous cells, and vitreous cells. Fundus signs of transient, mild ocular inflammation were observed at the high dose. These changes responded to an increase in systemic steroid treatment. There were no adverse findings considered related to 4D-125. IOP values were within normal limits for all animals at the different examination intervals. ERG values and OCT images, including macular morphology, were also within normal limits.

Figure 10 shows vector genome biodistribution in selected retinal, ocular, and non-ocular tissues, measured by qPCR at 3 necropsy time points in NHPs administered 4D-125 intravitreally. LOD = lower limit of detection; for visualization purposes, all “BLOD” samples are plotted at the LOD value.

Figure 11 shows RPGR transgene mRNA expression in selected retinal, ocular, and non-ocular tissues, measured by RT-qPCR at 3 necropsy time points in NHPs administered 4D-125 intravitreally. LOD = lower limit of detection; for visualization purposes, all “BLOD” samples are plotted at the LOD value.

Detailed Description of the Invention

Definitions

As used herein, “codon adaptation index” refers to a measure of codon usage bias. The codon adaptation index (CAI) measures the deviation of a given protein-coding gene sequence from a reference set of genes (Sharp P M and Li W H, Nucleic Acids Res. 15(3):1281-95 (1987)). The CAI is calculated by determining the geometric mean of the weights associated with each codon over the length of the gene sequence (measured in codons):
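In the standard Sharp and Li formulation, which the definition above appears to follow, the geometric mean can be written as shown here, where L is the number of codons in the gene sequence and w_i is the weight of the i-th codon (the symbols L and w_i are notation assumed for this sketch):

```latex
\mathrm{CAI} = \left( \prod_{i=1}^{L} w_i \right)^{1/L}
```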

For each amino acid, the weight of each of its codons, in the CAI, is calculated as the ratio between the observed frequency of the codon (fi) and the frequency of the synonymous codon (fj) for that amino acid:
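In the standard Sharp and Li formulation, the synonymous codon used in the denominator is the most frequent one for that amino acid. Assuming that is what is intended here, the weight of codon i can be written as follows, with j ranging over all codons that encode the same amino acid as codon i:

```latex
w_i = \frac{f_i}{\max_{j} f_j}
```

Under this form the most frequently used codon for each amino acid has a weight of 1, and the index approaches 1 as a sequence uses such codons throughout.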

The term “isolated” refers to a biological material (a cell, nucleic acid, or protein) that has been removed from its original environment (the environment in which it is naturally present). For example, a polynucleotide present in its natural state in a plant or an animal is not isolated, but the same polynucleotide separated from the adjacent nucleic acids with which it naturally occurs is considered “isolated”.

The term “4D-125” refers to a recombinant AAV particle comprising (i) a capsid protein comprising the amino acid sequence of SEQ ID NO:9 and (ii) a heterologous nucleic acid comprising the nucleotide sequence of SEQ ID NO:5.

The term “R100” refers to a variant AAV capsid protein comprising the amino acid sequence of SEQ ID NO:9.

As used herein, a “coding region” or “coding sequence” is a portion of a polynucleotide that consists of codons translatable into amino acids. Although a “stop codon” (TAG, TGA, or TAA) is typically not translated into an amino acid, it can be considered part of a coding region, whereas any flanking sequences, for example promoters, ribosome binding sites, transcription terminators, introns, and the like, are not part of a coding region. The boundaries of a coding region are typically determined by a start codon at the 5' terminus, encoding the amino terminus of the resulting polypeptide, and a translation stop codon at the 3' terminus, encoding the carboxyl terminus of the resulting polypeptide. Two or more coding regions can be present in a single polynucleotide construct, e.g., on a single vector, or in separate polynucleotide constructs, e.g., on separate (different) vectors. It follows that a single vector can contain just a single coding region or can comprise two or more coding regions.
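The boundary rules in this definition can be turned into a simple structural check. The Python sketch below reports whether a candidate sequence is in frame, begins with a start codon, ends with one of the three stop codons named above, and contains any internal stop codons. Treating ATG as the start codon is an assumption (the canonical case), and the function name and toy sequences are hypothetical.

```python
STOP_CODONS = {"TAG", "TGA", "TAA"}


def coding_region_report(seq: str) -> dict:
    """Summarize whether a DNA sequence looks like a complete coding region."""
    seq = seq.upper()
    in_frame = len(seq) % 3 == 0
    # Split into codons, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    return {
        "in_frame": in_frame,
        "starts_with_atg": seq.startswith("ATG"),  # assumes the canonical start codon
        "ends_with_stop": in_frame and bool(codons) and codons[-1] in STOP_CODONS,
        "internal_stops": sum(1 for c in codons[:-1] if c in STOP_CODONS),
    }


# Hypothetical toy sequences, not taken from the sequence listing.
print(coding_region_report("ATGGAAGGATAA"))
print(coding_region_report("ATGTAAGGATGA"))
```

A check like this only looks at the reading frame read from the first base; it does not detect flanking regulatory sequence or alternative frames.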

As used herein, the term “regulatory region” refers to a nucleotide sequence located upstream of (5' non-coding sequence), within, or downstream of (3' non-coding sequence) a coding region, and which influences the transcription, RNA processing, stability, or translation of the associated coding region. Regulatory regions can include promoters, translation leader sequences, introns, polyadenylation recognition sequences, RNA processing sites, effector binding sites, and stem-loop structures. If a coding region is intended for expression in a eukaryotic cell, a polyadenylation signal and a transcription termination sequence will usually be located 3' to the coding sequence.

As used herein, the term “nucleic acid” is interchangeable with “polynucleotide” or “nucleic acid molecule” and is intended to refer to a polymer of nucleotides.

A polynucleotide that encodes a gene product, e.g., a polypeptide, can include a promoter and/or other transcription or translation control elements operably associated with one or more coding regions. In an operable association, a coding region for a gene product, e.g., a polypeptide, is associated with one or more regulatory regions in such a way as to place expression of the gene product under the influence or control of the regulatory region(s). For example, a coding region and a promoter are “operably associated” if induction of promoter function results in the transcription of mRNA encoding the gene product encoded by the coding region, and if the nature of the linkage between the promoter and the coding region does not interfere with the ability of the promoter to direct expression of the gene product or with the ability of the DNA template to be transcribed. Other transcription control elements besides a promoter, for example enhancers, operators, repressors, and transcription termination signals, can also be operably associated with a coding region to direct gene product expression.

“Transcription control sequences” refer to DNA regulatory sequences, such as promoters, enhancers, terminators, and the like, that provide for the expression of a coding sequence in a host cell. A variety of transcription control regions are known to those skilled in the art. These include, without limitation, transcription control regions that function in vertebrate cells, such as, but not limited to, promoter and enhancer segments from cytomegaloviruses (the immediate early promoter, in conjunction with intron-A), simian virus 40 (the early promoter), and retroviruses (such as Rous sarcoma virus). Other transcription control regions include those derived from vertebrate genes such as actin, heat shock protein, bovine growth hormone, and rabbit β-globin, as well as other sequences capable of controlling gene expression in eukaryotic cells. Additional suitable transcription control regions include tissue-specific promoters and enhancers as well as lymphokine-inducible promoters (e.g., promoters inducible by interferons or interleukins).

Similarly, a variety of translation control elements are known to those of ordinary skill in the art. These include, but are not limited to, ribosome binding sites, translation initiation and termination codons, and elements derived from picornaviruses (particularly an internal ribosome entry site, or IRES, also referred to as a CITE sequence).

As used herein, the term “expression” refers to a process by which a polynucleotide produces a gene product, for example an RNA or a polypeptide. It includes, without limitation, transcription of the polynucleotide into messenger RNA (mRNA), transfer RNA (tRNA), small hairpin RNA (shRNA), small interfering RNA (siRNA), or any other RNA product, and the translation of an mRNA into a polypeptide. Expression produces a “gene product”. As used herein, a gene product can be either a nucleic acid, e.g., a messenger RNA produced by transcription of a gene, or a polypeptide that is translated from a transcript. Gene products described herein further include nucleic acids with post-transcriptional modifications, e.g., polyadenylation or splicing, or polypeptides with post-translational modifications, e.g., methylation, glycosylation, the addition of lipids, association with other protein subunits, or proteolytic cleavage.

A “vector” refers to any vehicle for the cloning of and/or transfer of a nucleic acid into a host cell. A vector can be a replicon to which another nucleic acid segment can be attached so as to bring about the replication of the attached segment. The term “vector” includes both viral and non-viral vehicles for introducing a nucleic acid into a cell in vitro, ex vivo, or in vivo. A large number of vectors are known and used in the art, including, for example, plasmids, modified eukaryotic viruses, and modified bacterial viruses. Insertion of a polynucleotide into a suitable vector can be accomplished by ligating the appropriate polynucleotide fragments into a chosen vector that has complementary cohesive termini.

Vectors can be engineered to encode selectable markers or reporters that provide for the selection or identification of cells that have incorporated the vector. Expression of selectable markers or reporters allows identification and/or selection of host cells that incorporate and express other coding regions contained on the vector. Examples of selectable marker genes known and used in the art include genes providing resistance to ampicillin, streptomycin, gentamycin, kanamycin, hygromycin, bialaphos herbicide, sulfonamides, and the like; and genes that are used as phenotypic markers, i.e., anthocyanin regulatory genes, the isopentanyl transferase gene, and the like. Examples of reporters known and used in the art include luciferase (Luc), green fluorescent protein (GFP), chloramphenicol acetyltransferase (CAT), β-galactosidase (LacZ), β-glucuronidase (Gus), and the like. Selectable markers can also be considered reporters.

Eukaryotic viral vectors that can be used include, but are not limited to, adenovirus vectors, retrovirus vectors, adeno-associated virus vectors, poxvirus vectors (e.g., vaccinia virus vectors), baculovirus vectors, and herpesvirus vectors. Non-viral vectors include plasmids, liposomes, electrically charged lipids (cytofectins), DNA-protein complexes, and biopolymers.

"Promoter" and "promoter sequence" are used interchangeably and refer to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. In general, a coding sequence is located 3' to a promoter sequence. Promoters may be derived in their entirety from a native gene, may be composed of different elements derived from different promoters found in nature, or may even comprise synthetic DNA segments. Those skilled in the art will understand that different promoters may direct the expression of a gene in different tissues or cell types, at different stages of development, or in response to different environmental or physiological conditions. Promoters that cause a gene to be expressed in most cell types at most times are commonly referred to as "constitutive promoters." Promoters that cause a gene to be expressed in a specific cell type are commonly referred to as "cell-specific promoters" or "tissue-specific promoters." Promoters that cause a gene to be expressed at a specific stage of development or cell differentiation are commonly referred to as "developmentally-specific promoters" or "cell differentiation-specific promoters." Promoters that are induced and cause a gene to be expressed following exposure or treatment of the cell with an agent, biological molecule, chemical, ligand, light, or the like that induces the promoter are commonly referred to as "inducible promoters" or "regulatable promoters." It is further recognized that, since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths may have identical promoter activity.

The term "plasmid" refers to an extrachromosomal element that often carries a gene that is not part of the central metabolism of the cell and is usually in the form of a circular double-stranded DNA molecule. Such elements may be autonomously replicating sequences, genome integrating sequences, phage sequences, or nucleotide sequences, whether linear, circular, or supercoiled, of single- or double-stranded DNA or RNA derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construct that is capable of introducing a promoter fragment and the DNA sequence for a selected gene product, along with appropriate 3' untranslated sequence, into a cell.

A polynucleotide or polypeptide has a certain percent "sequence identity" to another polynucleotide or polypeptide, meaning that, when the two sequences are aligned and compared, that percentage of bases or amino acids are the same. Sequence similarity can be determined in a number of different ways. To determine sequence identity, sequences can be aligned using methods and computer programs including BLAST, available over the World Wide Web at ncbi.nlm.nih.gov/BLAST/. Another alignment algorithm is FASTA, available in the Genetics Computing Group (GCG) package from Madison, Wis., USA. Other techniques for alignment are described in Methods in Enzymology, Vol. 266: Computer Methods for Macromolecular Sequence Analysis (1996), ed. Doolittle, Academic Press, Inc. Of particular interest are alignment programs that permit gaps in the sequence. Smith-Waterman is one type of algorithm that permits gaps in sequence alignments. See Meth. Mol. Biol. 70:173-187 (1997). In addition, the GAP program, which uses the Needleman and Wunsch alignment method, can be used to align sequences. See J. Mol. Biol. 48:443-453 (1970).
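The idea of percent identity over a gapped global alignment can be illustrated with a minimal sketch in the style of Needleman and Wunsch. This is not the GAP, BLAST, or FASTA implementation: the scoring values (match +1, mismatch -1, linear gap -1) and the choice of the alignment length as the denominator are assumptions made for illustration, and established programs use different scoring schemes and may define the denominator differently.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch style global alignment with a linear gap penalty.

    The scoring parameters are illustrative assumptions only.
    Returns the two aligned strings, with '-' marking gaps.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right cell to recover one optimal alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions divided by alignment length, times 100."""
    al_a, al_b = global_align(a, b)
    if not al_a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(al_a, al_b))
    return 100.0 * matches / len(al_a)
```

For sequences of the size discussed here (several thousand nucleotides) the quadratic table is still manageable, but a dedicated alignment program is the practical choice.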

In one embodiment, the present invention provides a modified nucleic acid molecule comprising a nucleotide sequence encoding the polypeptide of SEQ ID NO:2 (human RPGR ORF15), wherein the nucleic acid sequence has been codon-optimized. In another embodiment, the starting nucleic acid sequence that encodes the polypeptide of SEQ ID NO:2 and is subjected to codon optimization has the nucleotide sequence set forth in SEQ ID NO:3. In a preferred embodiment, the sequence encoding the polypeptide of SEQ ID NO:2 is codon-optimized for human expression. SEQ ID NO:1 is a codon-optimized version of SEQ ID NO:3, optimized for human expression:

ATGAGAGAACCCGAGGAACTGATGCCCGACTCTGGCGCCGTGTTTACCTTCGGCAAGAGCAAGTTCGCCGAGAACAACCCCGGCAAGTTCTGGTTCAAGAACGACGTGCCAGTGCACCTGAGCTGCGGAGATGAACACTCTGCCGTGGTCACCGGCAACAACAAGCTGTACATGTTCGGCAGCAACAACTGGGGCCAGCTCGGCCTGGGATCTAAGTCTGCCATCAGCAAGCCTACCTGCGTGAAGGCCCTGAAGCCTGAGAAAGTGAAACTGGCCGCCTGCGGCAGAAATCACACCCTGGTTTCTACCGAAGGCGGCAATGTGTATGCCACCGGCGGAAACAATGAGGGACAGCTTGGACTGGGCGACACCGAGGAAAGAAACACCTTCCACGTGATCAGCTTTTTCACCAGCGAGCACAAGATCAAGCAGCTGAGCGCCGGCTCTAATACCTCTGCCGCTCTGACAGAGGACGGCAGACTGTTTATGTGGGGCGACAATTCTGAGGGCCAGATCGGACTGAAGAACGTGTCCAATGTGTGCGTGCCCCAGCAAGTGACAATCGGCAAGCCTGTGTCTTGGATCAGCTGCGGCTACTACCACAGCGCCTTTGTGACAACCGATGGCGAGCTGTATGTGTTCGGCGAGCCAGAGAATGGCAAGCTGGGACTGCCTAACCAGCTGCTGGGCAATCACAGAACCCCTCAGCTGGTGTCTGAGATCCCCGAAAAAGTGATCCAGGTGGCCTGTGGCGGAGAGCACACAGTGGTGCTGACAGAGAATGCCGTGTACACCTTTGGCCTGGGCCAGTTTGGACAACTCGGACTGGGAACCTTCCTGTTCGAGACAAGCGAGCCCAAAGTGATCGAGAACATCCGGGACCAGACCATCAGCTACATCAGCTGTGGCGAGAACCACACAGCCCTGATCACAGACATCGGCCTGATGTACACATTCGGCGACGGAAGGCATGGAAAGCTCGGACTTGGCCTGGAAAACTTCACCAACCACTTCATCCCTACGCTGTGCAGCAACTTCCTGCGGTTCATTGTGAAGCTGGTGGCCTGCGGAGGATGCCACATGGTGGTTTTTGCTGCCCCTCACAGAGGCGTGGCCAAAGAGATTGAGTTCGACGAGATCAACGATACCTGCCTGAGCGTGGCCACCTTCCTGCCTTACAGCAGCCTGACATCTGGCAACGTGCTGCAGAGGACACTGAGCGCCAGAAT

GCGCAGACGGGAAAGAGAGAGAAGCCCCGACAGCTTCAGCATGAGAAGAACCCTGCC

TCCAATCGAGGGCACACTGGGCCTGTCTGCCTGCTTTCTGCCTAACAGCGTGTTCCCCA

GATGCAGCGAGAGAAACCTGCAAGAGAGCGTGCTGAGCGAGCAGGATCTGATGCAGCC

TGAGGAACCCGACTACCTGCTGGACGAGATGACCAAAGAGGCCGAGATCGACAACAGC

AGCACAGTGGAAAGCCTGGGCGAGACAACCGACATCCTGAACATGACCCACATCATGA

GCCTGAACAGCAACGAGAAGTCTCTGAAGCTGAGCCCCGTGCAGAAGCAGAAGAAGC

AGCAGACCATCGGCGAGCTGACACAGGATACTGCCCTGACCGAGAACGACGACAGCGA

CGAGTACGAAGAGATGAGCGAGATGAAGGAAGGCAAGGCCTGCAAGCAGCACGTGTC

CCAGGGCATCTTTATGACCCAGCCTGCCACCACCATCGAGGCCTTTTCCGACGAGGAAG

TGGAAATCCCCGAGGAAAAAGAGGGCGCCGAGGACAGCAAAGGCAACGGCATTGAGG

AACAAGAGGTGGAAGCCAACGAAGAGAACGTGAAGGTGCACGGCGGACGGAAAGAA

AAGACCGAGATCCTGAGCGACGACCTGACCGATAAGGCCGAGGTTTCCGAGGGCAAAG

CCAAGTCTGTGGGAGAAGCCGAGGATGGACCTGAAGGCCGCGGAGATGGAACCTGTGA

AGAAGGATCTAGCGGAGCCGAGCACTGGCAGGATGAGGAACGCGAGAAGGGCGAGAA

AGACAAAGGCAGAGGCGAGATGGAAAGACCCGGCGAGGGCGAAAAAGAGCTGGCCG

AGAAAGAGGAATGGAAGAAACGCGACGGCGAAGAACAAGAGCAGAAAGAAAGAGAG

CAGGGCCACCAGAAAGAACGGAATCAAGAGATGGAAGAAGGCGGCGAGGAAGAACAC

GGCGAAGGGGAAGAAGAGGAAGGCGACCGAGAGGAAGAAGAAGAGAAAGAAGGCG

AAGGCAAAGAAGAAGGCGAGGGCGAAGAGGTGGAAGGCGAGCGTGAAAAAGAAGAG

GGCGAACGCAAGAAAGAAGAACGCGCCGGAAAAGAGGAAAAAGGCGAGGAAGAGGG

CGACCAAGGCGAAGGCGAGGAAGAAGAAACTGAAGGCAGAGGGGAAGAGAAAGAGG

AAGGCGGCGAAGTCGAAGGCGGAGAGGTTGAAGAAGGCAAAGGCGAGCGAGAAGAG

GAAGAAGAAGAAGGCGAAGGCGAGGAAGAGGAAGGCGAAGGCGAAGAGGAAGAAG

GCGAAGGGGAAGAAGAAGAAGGCGAAGGCAAGGGCGAAGAGGAGGGCGAAGAAGGC

GAGGGCGAAGAGGAGGGCGAAGAAGGCGAAGGCGAGGGCGAAGAAGAAGAAGGCGA

AGGCGAAGGCGAGGAAGAAGGCGAAGGCGAAGGGGAAGAAGAGGAAGGCGAAGGCG

AAGGCGAAGAAGAAGGCGAAGGCGAGGGCGAAGAGGAAGAAGGCGAAGGCAAAGGG

GAAGAAGAAGGCGAGGAAGGCGAAGGCGAAGGCGAGGAAGAAGAAGGCGAAGGCGA

GGGCGAAGATGGCGAAGGCGAAGGCGAAGAGGAAGAGGGCGAGTGGGAGGGCGAAG

AAGAGGAAGGCGAAGGCGAGGGCGAAGAGGAAGGCGAAGGCGAGGGCGAAGAAGGC

GAAGGCGAAGGCGAGGAAGAGGAAGGCGAAGGCGAAGGGGAAGAAGAAGAGGGCG

AAGAAGAAGGCGAAGAGGAAGGCGAAGGGGAAGAAGAAGGCGAAGGCGAAGGCGA

AGAAGAGGAAGAGGGCGAAGTTGAAGGCGAGGTTGAGGGCGAAGAAGGCGAAGGCG

AAGGGGAAGAAGAAGAAGGCGAGGAAGAAGGGGAAGAGAGAGAAAAAGAAGGCGA

GGGCGAAGAAAACCGCCGGAACCGCGAAGAGGAAGAGGAAGAAGAGGGCAAGTACC

AAGAGACTGGCGAGGAAGAGAACGAGCGGCAGGATGGCGAAGAGTACAAGAAGGTGT

CCAAGATCAAGGGCAGCGTGAAGTACGGCAAGCACAAGACCTACCAGAAGAAGTCCGT

CACCAACACGCAAGGCAATGGAAAAGAACAGCGGAGCAAGATGCCCGTGCAGTCCAA

GAGGCTGCTGAAGAATGGCCCTAGCGGCAGCAAGAAATTCTGGAACAATGTGCTGCCCCACTACCTCGAGCTGAAGTGA (SEQ ID NO:1).

In some embodiments, a codon-optimized sequence encoding human RPGR ORF15 is provided that lacks the TGA stop codon of SEQ ID NO:1 (i.e., consists of nucleotides 1-3456 of SEQ ID NO:1).
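Removing a terminal stop codon from a coding sequence is a simple slicing operation. The sketch below is illustrative only: the function name `strip_stop_codon` is hypothetical, and the short input used in the usage note is a toy sequence, not SEQ ID NO:1.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def strip_stop_codon(cds):
    """Return the coding sequence without its terminal stop codon, if any.

    Raises ValueError when the length is not a multiple of three, since the
    final triplet would then not be in frame.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    if cds[-3:] in STOP_CODONS:
        return cds[:-3]
    return cds
```

For example, `strip_stop_codon("ATGAAGTGA")` returns `"ATGAAG"`. Applied to a 3459-nucleotide sequence ending in TGA, the same slicing leaves nucleotides 1-3456, which is the range stated above.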

In one aspect, the present disclosure provides a polynucleotide comprising the nucleotide sequence of SEQ ID NO:1, or a polynucleotide comprising a nucleotide sequence that is at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to the nucleotide sequence of SEQ ID NO:1, wherein the polynucleotide encodes a human RPGR polypeptide having the amino acid sequence of SEQ ID NO:2:

MREPEELMPDSGAVFTFGKSKFAENNPGKFWFKNDVPVHLSCGDEHSAVVTGNNKLYMFG

SNNWGQLGLGSKSAISKPTCVKALKPEKVKLAACGRNHTLVSTEGGNVYATGGNNEGQLG

LGDTEERNTFHVISFFTSEHKIKQLSAGSNTSAALTEDGRLFMWGDNSEGQIGLKNVSNVC

VPQQVTIGKPVSWISCGYYHSAFVTTDGELYVFGEPENGKLGLPNQLLGNHRTPQLVSEIPE

KVIQVACGGEHTVVLTENAVYTFGLGQFGQLGLGTFLFETSEPKVIENIRDQTISYISCGENH

TALITDIGLMYTFGDGRHGKLGLGLENFTNHFIPTLCSNFLRFIVKLVACGGCHMVVFAAPH

RGVAKEIEFDEINDTCLSVATFLPYSSLTSGNVLQRTLSARMRRRERERSPDSFSMRRTLPPIE

GTLGLSACFLPNSVFPRCSERNLQESVLSEQDLMQPEEPDYLLDEMTKEAEIDNSSTVESLG

ETTDILNMTHIMSLNSNEKSLKLSPVQKQKKQQTIGELTQDTALTENDDSDEYEEMSEMKE

GKACKQHVSQGIFMTQPATTIEAFSDEEVEIPEEKEGAEDSKGNGIEEQEVEANEENVKVHG

GRKEKTEILSDDLTDKAEVSEGKAKSVGEAEDGPEGRGDGTCEEGSSGAEHWQDEEREKG

EKDKGRGEMERPGEGEKELAEKEEWKKRDGEEQEQKEREQGHQKERNQEMEEGGEEEH

GEGEEEEGDREEEEEKEGEGKEEGEGEEVEGEREKEEGERKKEERAGKEEKGEEEGDQGE

GEEEETEGRGEEKEEGGEVEGGEVEEGKGEREEEEEEGEGEEEEGEGEEEEGEGEEEEGEG

KGEEEGEEGEGEEEGEEGEGEGEEEEGEGEGEEEGEGEGEEEEGEGEGEEEGEGEGEEEEG

EGKGEEEGEEGEGEGEEEEGEGEGEDGEGEGEEEEGEWEGEEEEGEGEGEEEGEGEGEEG

EGEGEEEEGEGEGEEEEGEEEGEEEGEGEEEGEGEGEEEEEGEVEGEVEGEEGEGEGEEEE

GEEEGEEREKEGEGEENRRNREEEEEEEGKYQETGEEENERQDGEEYKKVSKIKGSVKYGKHKTYQKKSVTNTQGNGKEQRSKMPVQSKRLLKNGPSGSKKFWNNVLPHYLELK (SEQ ID NO:2).

The term "codon-optimized," as it refers to genes or coding regions of nucleic acid molecules for the transformation of various hosts, refers to the alteration of codons in the gene or coding regions of the nucleic acid molecule to reflect the typical codon usage of the host organism without altering the polypeptide encoded by the DNA. Such optimization includes replacing at least one, more than one, or a significant number of codons with one or more codons that are used more frequently in the genes of that organism.

Deviations in the nucleotide sequence that comprises the codons encoding the amino acids of any polypeptide chain allow for variations in the sequence coding for the gene. Since each codon consists of three nucleotides, and the nucleotides making up DNA are restricted to four specific bases, there are 64 possible combinations of nucleotides, 61 of which encode amino acids (the remaining three codons encode signals ending translation). The "genetic code," which shows which codons encode which amino acids, is reproduced herein as Table 1. As the table shows, many amino acids are designated by more than one codon. For example, the amino acids alanine and proline are each coded for by four triplets, serine and arginine by six, whereas tryptophan and methionine are each coded for by just one triplet. This degeneracy allows the DNA base composition to vary over a wide range without altering the amino acid sequence of the proteins encoded by the DNA.

Table 1. Standard genetic code (rows: first base; columns: second base)

        T              C              A              G
T       TTT Phe (F)    TCT Ser (S)    TAT Tyr (Y)    TGT Cys (C)
        TTC Phe (F)    TCC Ser (S)    TAC Tyr (Y)    TGC Cys (C)
        TTA Leu (L)    TCA Ser (S)    TAA Stop       TGA Stop
        TTG Leu (L)    TCG Ser (S)    TAG Stop       TGG Trp (W)
C       CTT Leu (L)    CCT Pro (P)    CAT His (H)    CGT Arg (R)
        CTC Leu (L)    CCC Pro (P)    CAC His (H)    CGC Arg (R)
        CTA Leu (L)    CCA Pro (P)    CAA Gln (Q)    CGA Arg (R)
        CTG Leu (L)    CCG Pro (P)    CAG Gln (Q)    CGG Arg (R)
A       ATT Ile (I)    ACT Thr (T)    AAT Asn (N)    AGT Ser (S)
        ATC Ile (I)    ACC Thr (T)    AAC Asn (N)    AGC Ser (S)
        ATA Ile (I)    ACA Thr (T)    AAA Lys (K)    AGA Arg (R)
        ATG Met (M)    ACG Thr (T)    AAG Lys (K)    AGG Arg (R)
G       GTT Val (V)    GCT Ala (A)    GAT Asp (D)    GGT Gly (G)
        GTC Val (V)    GCC Ala (A)    GAC Asp (D)    GGC Gly (G)
        GTA Val (V)    GCA Ala (A)    GAA Glu (E)    GGA Gly (G)
        GTG Val (V)    GCG Ala (A)    GAG Glu (E)    GGG Gly (G)
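The counts given for the genetic code (64 triplets, 61 sense codons, and the number of synonymous codons per amino acid) can be checked with a short sketch. The table is built from the standard genetic code written as a 64-letter string in T, C, A, G order, with `*` marking stop codons; the variable and function names are illustrative choices.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, in TTT, TTC, TTA, TTG, TCT, ... order.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # first base T
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)

# Map each of the 64 triplets to its amino acid (or '*' for a stop signal).
GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def degeneracy(code=GENETIC_CODE):
    """Number of codons that specify each amino acid (stop codons excluded)."""
    return Counter(aa for aa in code.values() if aa != "*")


def translate(dna, code=GENETIC_CODE):
    """Translate an in-frame DNA string codon by codon; '*' marks a stop."""
    return "".join(code[dna[i:i + 3]] for i in range(0, len(dna) - len(dna) % 3, 3))
```

Because synonymous codons map to the same letter, `translate` gives the same protein for any codon-optimized variant of a coding sequence, which is the property the definition of "codon-optimized" relies on.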

Many organisms display a bias for the use of particular codons to code for the insertion of a particular amino acid in a growing peptide chain. Codon preference, or codon bias (differences in codon usage between organisms), is afforded by the degeneracy of the genetic code and is well documented among many organisms. Codon bias often correlates with the efficiency of translation of messenger RNA (mRNA), which is in turn believed to depend on, inter alia, the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell is generally a reflection of the codons used most frequently in peptide synthesis. Accordingly, genes can be tailored for optimal gene expression in a given organism based on codon optimization.

Given the large number of gene sequences available for a wide variety of animal, plant, and microbial species, the relative frequencies of codon usage have been calculated. Codon usage tables are available, for example, at the "Codon Usage Database," available at www.kazusa.or.jp/codon/ (accessed June 18, 2012). See Nakamura, Y., et al., Nucl. Acids Res. 28:292 (2000).

Randomly assigning codons at an optimized frequency to encode a given polypeptide sequence can be done manually by calculating codon frequencies for each amino acid and then assigning the codons to the polypeptide sequence at random. In addition, various algorithms and computer software programs can be used to calculate an optimal sequence.
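The frequency-weighted random assignment described above can be sketched as follows. The weights in `EXAMPLE_USAGE` are made-up placeholders covering only four amino acids; they are not values from the Codon Usage Database, and this is not the procedure that produced SEQ ID NO:1. The function name `back_translate` is likewise hypothetical.

```python
import random

# Hypothetical relative codon weights, for illustration only.
# A real table would cover all 20 amino acids with measured frequencies.
EXAMPLE_USAGE = {
    "M": {"ATG": 1.0},
    "E": {"GAG": 0.6, "GAA": 0.4},
    "G": {"GGC": 0.4, "GGA": 0.2, "GGG": 0.2, "GGT": 0.2},
    "K": {"AAG": 0.6, "AAA": 0.4},
}


def back_translate(protein, usage, seed=None):
    """Pick one codon per residue at random, weighted by the usage table.

    A seed can be supplied to make the random choice reproducible.
    """
    rng = random.Random(seed)
    codons = []
    for residue in protein:
        table = usage[residue]  # KeyError if the residue is not in the table
        codon = rng.choices(list(table), weights=list(table.values()), k=1)[0]
        codons.append(codon)
    return "".join(codons)
```

Every output of `back_translate("MEGK", EXAMPLE_USAGE)` encodes the same peptide; only the synonymous codon chosen at each position varies. Practical optimization tools typically add further constraints (for example on GC content or repeats), which this sketch does not attempt.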

Non-viral vectors

In some embodiments, a non-viral vector (e.g., an expression plasmid) comprising a modified nucleic acid as described herein is provided. Preferably, the non-viral vector is a plasmid comprising the nucleic acid sequence of SEQ ID NO:1 or a sequence at least 90% identical thereto.

Viral vectors

In preferred embodiments, a viral vector comprising a modified (codon-optimized) nucleic acid as described herein is provided. Preferably, the viral vector comprises the nucleic acid sequence of SEQ ID NO:1, or a sequence at least 90% identical thereto, operably linked to an expression control sequence. Examples of suitable viral vectors include, but are not limited to, adenovirus, retrovirus, lentivirus, herpesvirus, and adeno-associated virus (AAV) vectors.

In a preferred embodiment, the viral vector includes a portion of a parvovirus genome, such as an AAV genome in which the rep and cap genes are deleted and/or replaced by the modified RPGRorf15 gene sequence and its associated expression control sequences. The modified human RPGRorf15 gene sequence is typically inserted adjacent to one or two (i.e., flanked by) AAV TRs or TR elements adequate for viral replication (Xiao et al., 1997, J. Virol. 71(2):941-948), in place of the nucleic acid encoding the viral rep and cap proteins. Other regulatory sequences suitable for facilitating tissue-specific expression of the modified RPGRorf15 gene sequence in the target cell may also be included.

In some preferred embodiments, the AAV viral vector comprises a nucleic acid comprising, from 5' to 3': (a) an AAV2 terminal repeat, (b) an hGRK promoter, (c) a codon-optimized RPGRorf15 gene as described herein, (d) a polyadenylation sequence, and (e) an AAV2 terminal repeat. In a particularly preferred embodiment, the AAV viral vector comprises a nucleic acid (transgene cassette) comprising the sequence of SEQ ID NO:5 or a sequence at least 90%, at least 95%, at least 98%, or at least 99% identical thereto:

(SEQ ID NO:5).

The components of the transgene cassette of SEQ ID NO:5 and their respective locations are identified in Table 2 below:

Table 2

Location (bp)    Component         Length (bp)
1-145            5' ITR            145
170-368          GRK promoter      199
398-3856         RPGRorf15 cDNA    3459
3899-4143        SV40 poly A       245
4159-4304        3' ITR            145
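The coordinates in Table 2 can be cross-checked against the listed lengths by computing each inclusive span as end minus start plus one. The sketch below uses only the values in the table; the names `TABLE_2` and `check_spans` are illustrative.

```python
# (component, start, end, listed length) as given in Table 2.
TABLE_2 = [
    ("5' ITR", 1, 145, 145),
    ("GRK promoter", 170, 368, 199),
    ("RPGRorf15 cDNA", 398, 3856, 3459),
    ("SV40 poly A", 3899, 4143, 245),
    ("3' ITR", 4159, 4304, 145),
]


def check_spans(rows):
    """Compare each listed length with the inclusive span end - start + 1."""
    report = []
    for name, start, end, listed in rows:
        span = end - start + 1
        report.append((name, span, listed, span == listed))
    return report


for name, span, listed, ok in check_spans(TABLE_2):
    status = "ok" if ok else "differs"
    print(f"{name}: span {span}, listed {listed} ({status})")
```

With the values as listed, four rows agree, while the 3' ITR coordinates imply a span of 146 bp against a listed length of 145 bp. The sketch only reports this difference; it does not indicate whether the coordinates or the listed length reflect the intended value.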

The 5' ITR has the following sequence:

TTGGCCACTCCCTCTCTGCGCGCTCGCTCGCTCACTGAGGCCGGGCGACCAAAGGTCGCCCGACGCCCGGGCTTTGCCCGGGCGGCCTCAGTGAGCGAGCGAGCGCGCAGAGAGGGAGTGGCCAACTCCATCACTAGGGGTTCCT (SEQ ID NO:6)

The 3' ITR has the following sequence:

AGGAACCCCTAGTGATGGAGTTGGCCACTCCCTCTCTGCGCGCTCGCTCGCTCACTGAGGCCGCCCGGGCAAAGCCCGGGCGTCGGGCGACCTTTGGTCGCCCGGCCTCAGTGAGCGAGCGAGCGCGCAGAGAGGGAGTGGCCAA (SEQ ID NO:7)

The SV40 polyadenylation sequence has the following sequence:

GGGGATCCAGACATGATAAGATACATTGATGAGTTTGGACAAACCACAACTAGAATGCAGTGAAAAAAATGCTTTATTTGTGAAATTTGTGATGCTATTGCTTTATTTGTAACCATTATAAGCTGCAATAAACAAGTTAACAACAACAATTGCATTCATTTTATGTTTCAGGTTCAGGGGGAGGTGTGGGAGGTTTTTTAAAGCAAGTAAAACCTCTACAAATGTGGTATGGCTGATTATGATCA (SEQ ID NO:8).

Those skilled in the art will appreciate that an AAV vector that comprises a transgene and lacks the viral proteins needed for viral replication (e.g., cap and rep) cannot replicate, since such proteins are necessary for virus replication and packaging. Helper viruses typically include adenovirus or herpes simplex virus. Alternatively, as discussed below, the helper functions (E1a, E1b, E2a, E4, and VA RNA) can be provided to a packaging cell, including by transfecting the cell with one or more nucleic acids encoding the various helper elements, and/or the cell can comprise the nucleic acid encoding the helper protein. For instance, HEK 293 cells were generated by transforming human cells with adenovirus 5 DNA and now express a number of adenoviral genes, including, but not limited to, E1 and E3 (see, e.g., Graham et al., 1977, J. Gen. Virol. 36:59-72). Thus, those helper functions can be provided by the HEK 293 packaging cell without the need to supply them to the cell by, e.g., a plasmid encoding them.

The viral vector may be any suitable nucleic acid construct, such as a DNA or RNA construct, and may be single-stranded, double-stranded, or duplexed (i.e., self-complementary as described in WO 2001/92551).

The viral capsid component of the packaged viral vector may be a parvovirus capsid. AAV Cap and chimeric capsids are preferred. For example, the viral capsid may be an AAV capsid (e.g., AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, AAV1.1, AAV2.5, AAV6.1, AAV6.3.1, AAV9.45, AAVrh10, AAVrh74, RHM4-1, AAV2-TT, AAV2-TT-S312N, AAV3B-S312N, AAV-LK03, snake AAV, avian AAV, bovine AAV, canine AAV, equine AAV, ovine AAV, goat AAV, shrimp AAV, and any other AAV now known or later discovered). See, e.g., Fields et al., VIROLOGY, Vol. 2, Chapter 69 (4th ed., Lippincott-Raven Publishers).

In some embodiments, the viral capsid component of the packaged viral vector is a variant of a native AAV capsid (i.e., comprises one or more modifications relative to a native AAV capsid). In some embodiments, the capsid is a variant of an AAV2, AAV5, or AAV8 capsid. In preferred embodiments, the capsid is a variant of an AAV2 capsid, such as those described in U.S. Patent Application Publication No. 2019/0255192A1 (e.g., comprising the amino acid sequence of any of SEQ ID NOs:42-59). In a particularly preferred embodiment, the capsid comprises a VP1 capsid protein having the following amino acid sequence:

MAADGYLPDWLEDTLSEGIRQWWKLKPGPPPPKAAERHKDDSRGLVLPGYKYLGPFNGLDKGEPVNEADAAALEHDKAYDRQLDSGDNPYLKYNHADAEFQERLKEDTSFGGNLGRAVFQAKKRVLEPLGLVEEPVKTAPGKKRPVEHSPVEPDSSSGTGKAGQQPARKRLNFGQTGDADSVPDPQPLGQPPAAPSGLGTNTMATGSGAPMADNNEGADGVGNSSGNWHCDSTWMGDRVITTSTRTWALPTYNNHLYKQISSQSGASNDNHYFGYSTPWGYFDFNRFHCHFSPRDWQRLINNNWGFRPKRLNFKLFNIQVKEVTQNDGTTTIANNLTSTVQVFTDSEYQLPYVLGSAHQGCLPPFPADVFMVPQYGYLTLNNGSQAVGRSSFYCLEYFPSQMLRTGNNFTFSYTFEDVPFHSSYAHSQSLDRLMNPLIDQYLYYLSRTNTPSGTTTQSRLQFSQAGASDIRDQSRNWLPGPCYRQQRVSKTSADNNNSEYSWTGATKYHLNGRDSLVNPGPAMASHKDDEEKFFPQSGVLIFGKQGSEKTNVDIEKVMITDEEEIRTTNPVATEQYGSVSTNLQRGNLAISDQTKHARQAATADVNTQGVLPGMVWQDRDVYLQGPIWAKIPHTDGHFHPSPLMGGFGLKHPPPQILIKNTPVPANPSTTFSAAKFASFITQYSTGQVSVEIEWELQKENSKRWNPEIQYTSNYNKSVNVDFTVDTNGVYSEPRPIGTRYLTRNL (SEQ ID NO:9).

The variant AAV capsid protein of SEQ ID NO:9 contains the following modifications relative to the native AAV2 capsid: (i) a proline (P) to alanine (A) mutation at amino acid position 34, which is located inside the assembled capsid (VP1 protein only), and (ii) an insertion of 10 amino acids (leucine-alanine-isoleucine-serine-aspartic acid-glutamine-threonine-lysine-histidine-alanine/LAISDQTKHA) at amino acid position 588, which is present in VP1, VP2, and VP3.

The complete complement of AAV Cap proteins includes VP1, VP2, and VP3. An ORF comprising nucleotide sequences encoding AAV VP capsid proteins may comprise less than the complete complement of AAV Cap proteins, or the complete complement of AAV Cap proteins may be provided.

In yet another embodiment, the present invention provides for the use of ancestral AAV vectors in therapeutic in vivo gene therapy. Specifically, in silico-derived sequences were synthesized de novo and characterized for biological activity. This effort led to the generation of nine functional putative ancestral AAVs and the identification of Anc80, the predicted ancestor of AAV serotypes 1, 2, 8, and 9 (Zinn et al., 2015, Cell Reports 12:1056-1068). The prediction and synthesis of such ancestral sequences, in addition to their assembly into viral particles, may be accomplished using the methods described in WO 2015/054653, the contents of which are incorporated herein by reference. Notably, virus particles assembled from ancestral viral sequences may exhibit reduced susceptibility to pre-existing immunity in the current human population compared with contemporary viruses or portions thereof.

The invention includes packaging cells, which are encompassed by “host cells” and which may be cultured to produce packaged viral vectors of the invention. The packaging cells of the invention generally include cells with heterologous (1) viral vector function(s), (2) packaging function(s), and (3) helper function(s). Each of these component functions is discussed in the following sections.

Initially, the vectors can be made by several methods known to skilled artisans (see, e.g., WO 2013/063379). A preferred method is described in Grieger et al., 2015, Molecular Therapy 24(2):287-297, the contents of which are incorporated herein by reference for all purposes. Briefly, efficient transfection of HEK293 cells is used as the starting point: an adherent HEK293 cell line from a qualified clinical master cell bank is grown under animal component-free suspension conditions in shake flasks and WAVE bioreactors, which allow rapid and scalable rAAV production. Using the triple transfection method (e.g., WO96/40240), the suspension HEK293 cell line generates greater than 10⁵ vector genome-containing particles (vg)/cell, or greater than 10¹⁴ vg/L of cell culture, when harvested 48 hours post-transfection. More specifically, triple transfection refers to transfecting the packaging cell with three plasmids: one plasmid encodes the AAV rep and cap genes, another encodes various helper functions (e.g., adenovirus or HSV proteins such as E1a, E1b, E2a, E4 and VA RNA), and a third encodes the transgene and its various control elements (e.g., the modified RPGRorf15 gene and the hGRK promoter).
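The two yield figures quoted above can be related by simple arithmetic. The following is a minimal sketch, not part of the cited method: it assumes both figures are taken at their stated lower bounds, and the function name `implied_cell_density_per_ml` is hypothetical. Because the text gives each figure only as "greater than", the result is indicative rather than a reported culture density.

```python
import math


def implied_cell_density_per_ml(vg_per_litre: float, vg_per_cell: float) -> float:
    """Return the cell density (cells/mL) implied by a volumetric and a per-cell yield."""
    if vg_per_litre <= 0 or vg_per_cell <= 0:
        raise ValueError("yields must be positive")
    cells_per_litre = vg_per_litre / vg_per_cell
    return cells_per_litre / 1000.0  # 1 L = 1000 mL


# Lower-bound figures from the text: >10^14 vg/L and >10^5 vg/cell.
density = implied_cell_density_per_ml(1e14, 1e5)
print(f"implied density: {density:.1e} cells/mL")
```

Under these assumptions the two figures are mutually consistent with a culture on the order of 10⁶ cells/mL at harvest.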

To achieve the desired yields, a number of variables were optimized, such as selection of a compatible serum-free suspension medium that supports both growth and transfection, and selection of the transfection reagent, transfection conditions and cell density. A universal purification strategy based on ion-exchange chromatography was also developed, which yields high-purity vector preparations of AAV serotypes 1-6, 8 and 9 and various chimeric capsids. This user-friendly process can be completed within one week, results in high full-to-empty particle ratios (>90% full particles), provides post-purification yields (>1×10¹³ vg/L) and purity suitable for clinical applications, and is universal with respect to all serotypes and chimeric particles. This scalable manufacturing technology has been used to manufacture GMP Phase I clinical AAV vectors for retinal neovascularization (AAV2), hemophilia B (scAAV8), giant axonal neuropathy (scAAV9) and retinitis pigmentosa (AAV2), which have been administered to patients. In addition, a minimum 5-fold increase in overall vector production is obtained by implementing a perfusion method that entails harvesting rAAV from the culture medium at numerous time points post-transfection.

The packaging cell includes viral vector functions, along with packaging and vector functions. The viral vector functions typically include a portion of a parvovirus genome, such as an AAV genome, with rep and cap deleted and replaced by the modified RPGRorf15 sequence and its associated expression control sequences. The viral vector functions include sufficient expression control sequences to result in replication of the viral vector for packaging. Typically, the viral vector includes a portion of a parvovirus genome, such as an AAV genome with rep and cap deleted and replaced by the transgene and its associated expression control sequences. The transgene is typically flanked by two AAV TRs, in place of the deleted viral rep and cap ORFs. Appropriate expression control sequences are included, such as a tissue-specific promoter and other regulatory sequences suitable for promoting tissue-specific expression of the transgene in the target cell. The transgene is typically a nucleic acid sequence that can be expressed to produce a therapeutic polypeptide or a marker polypeptide.

选择用于病毒载体的末端重复(TR)(可拆分(resolvable)和不可拆分的)优选是AAV序列，其中血清型1、2、3、4、5和6是优选的。可拆分的AAV TR无需具有野生型TR序列(例如可以通过插入、缺失、截短或错义突变来改变野生型序列)，只要TR介导所需功能，例如病毒包装、整合和/或原病毒拯救等即可。TR可以是用作AAV反向末端重复的合成序列，诸如Samulski等人的美国专利号5,478,745中描述的“双重D序列”，其全部公开内容通过引用整体并入本文。通常但不一定，TR来自相同的细小病毒，例如，两个TR序列均来自AAV2。The terminal repeats (TRs) used for the viral vector (both resolvable and non-resolvable) are preferably AAV sequences, with serotypes 1, 2, 3, 4, 5, and 6 being preferred. Resolvable AAV TRs do not need to have a wild-type TR sequence (e.g., the wild-type sequence can be altered by insertion, deletion, truncation, or missense mutations), as long as the TR mediates the desired function, such as viral packaging, integration, and/or provirus rescue. TRs can be synthetic sequences used as inverted terminal repeats of AAVs, such as the “double D sequence” described in U.S. Patent No. 5,478,745 to Samulski et al., the entire disclosure of which is incorporated herein by reference. Typically, but not necessarily, TRs originate from the same parvovirus; for example, both TR sequences may be from AAV2.

Packaging functions include capsid components. The capsid components are preferably from a parvoviral capsid, such as an AAV capsid or a chimeric AAV capsid function. Examples of suitable parvovirus capsid components are capsid components from the family Parvoviridae, such as an autonomous parvovirus or a Dependovirus. For example, the capsid components may be selected from AAV capsids, e.g., AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, AAVrh10, AAVrh74, RHM4-1, RHM15-1, RHM15-2, RHM15-3/RHM15-5, RHM15-4, RHM15-6, AAV Hu.26, AAV1.1, AAV2.5, AAV6.1, AAV6.3.1, AAV9.45, AAV2i8, AAV2G9, AAV2i8G9, AAV2-TT, AAV2-TT-S312N, AAV3B-S312N, and AAV-LK03, as well as other novel capsids that are as yet unidentified or are from non-human primate sources. Capsid components may include components from two or more AAV capsids.

A packaged viral vector generally includes the modified RPGRorf15 gene sequence and expression control sequences flanked by TR elements, referred to herein as the “transgene” or “transgene expression cassette,” sufficient to result in packaging of the vector DNA and subsequent expression of the modified RPGRorf15 gene sequence in the transduced cell. The viral vector functions may, for example, be supplied to the cell as a component of a plasmid or an amplicon. The viral vector functions may exist extrachromosomally within the cell line and/or may be integrated into the cell's chromosomal DNA.

可以采用将携带病毒载体功能的核苷酸序列引入到细胞宿主中以进行复制和包装的任何方法，包括但不限于电穿孔、磷酸钙沉淀、显微注射、阳离子或阴离子脂质体、以及与核定位信号组合的脂质体。在通过使用病毒载体的转染来提供病毒载体功能的实施方案中；可以使用产生病毒感染的标准方法。Any method can be used to introduce a nucleotide sequence carrying viral vector function into a cell host for replication and packaging, including but not limited to electroporation, calcium phosphate precipitation, microinjection, cationic or anionic liposomes, and liposomes combined with nuclear localization signals. In embodiments that provide viral vector function through transfection using a viral vector, standard methods for generating viral infection can be used.

Packaging functions include genes for viral vector replication and packaging. Thus, for example, the packaging functions may include, as needed, functions necessary for viral gene expression, viral vector replication, rescue of the viral vector from the integrated state, viral gene expression, and packaging of the viral vector into a viral particle. The packaging functions may be supplied together or separately to the packaging cell using a genetic construct such as a plasmid or an amplicon, a baculovirus, or an HSV helper construct. The packaging functions may exist extrachromosomally within the packaging cell, but are preferably integrated into the cell's chromosomal DNA. Examples include genes encoding AAV Rep and Cap proteins.

The helper functions include helper virus elements needed for establishing active infection of the packaging cell, which is required to initiate packaging of the viral vector. Examples include functions derived from adenovirus, baculovirus and/or herpes virus sufficient to result in packaging of the viral vector. For example, adenovirus helper functions will typically include the adenovirus components E1a, E1b, E2a, E4, and VA RNA. The packaging functions may be supplied by infection of the packaging cell with the required virus. The packaging functions may be supplied together or separately to the packaging cell using a genetic construct such as a plasmid or an amplicon. See, e.g., the pXR helper plasmids described in Rabinowitz et al., 2002, J. Virol. 76:791, and the pDG plasmids described in Grimm et al., 1998, Human Gene Therapy 9:2745-2760. The packaging functions may exist extrachromosomally within the packaging cell, but are preferably integrated into the cell's chromosomal DNA (e.g., E1 or E3 in HEK 293 cells).

可以采用任何合适的辅助病毒功能。例如，当包装细胞是昆虫细胞时，杆状病毒可以充当辅助病毒。疱疹病毒也可用作AAV包装方法中的辅助病毒。编码一种或多种AAV Rep蛋白的杂合疱疹病毒可以有利地促进更为规模可变的AAV载体产生方案。Any suitable helper virus function can be employed. For example, baculoviruses can act as helper viruses when the packaging cells are insect cells. Herpesviruses can also be used as helper viruses in AAV packaging methods. Hybrid herpesviruses encoding one or more AAV Rep proteins can advantageously facilitate more scalable AAV vector production protocols.

Any method of introducing the nucleotide sequence carrying the helper functions into a cellular host for replication and packaging may be employed, including but not limited to electroporation, calcium phosphate precipitation, microinjection, cationic or anionic liposomes, and liposomes in combination with a nuclear localization signal. In embodiments wherein the helper functions are provided by transfection using a virus vector or by infection using a helper virus, standard methods for producing viral infection may be used.

Any suitable permissive or packaging cell known in the art may be employed in the production of the packaged viral vector. Mammalian cells or insect cells are preferred. Examples of cells useful for the production of packaging cells in the practice of the invention include, for example, human cell lines such as VERO, WI38, MRC5, A549, HEK293 cells (which express functional adenoviral E1 under the control of a constitutive promoter), B-50 or any other HeLa cells, HepG2, Saos-2, Huh7, and HT1080 cell lines. In one aspect, the packaging cell is capable of growing in suspension culture; more preferably, the cell is capable of growing in serum-free culture. In one embodiment, the packaging cell is a HEK293 cell that grows in suspension in serum-free medium. In another embodiment, the packaging cell is the HEK293 cell described in U.S. Patent No. 9,441,206 and deposited as ATCC No. PTA 13274. Numerous rAAV packaging cell lines are known in the art, including but not limited to those disclosed in WO 2002/46359. In another aspect, the packaging cell is cultured in the form of a cell culture chamber (e.g., a 10-layer cell culture chamber seeded with HEK293 cells).

Cell lines for use as packaging cells include insect cell lines. Any insect cell that allows for replication of AAV and that can be maintained in culture can be used in accordance with the present invention. Examples include Spodoptera frugiperda cell lines, such as the Sf9 or Sf21 cell lines, Drosophila spp. cell lines, or mosquito cell lines, e.g., Aedes albopictus-derived cell lines. A preferred cell line is the Spodoptera frugiperda Sf9 cell line. The following references are incorporated herein for their teachings concerning the use of insect cells for expression of heterologous polypeptides, methods of introducing nucleic acids into such cells, and methods of maintaining such cells in culture: Methods in Molecular Biology, ed. Richard, Humana Press, N J (1995); O'Reilly et al., Baculovirus Expression Vectors: A Laboratory Manual, Oxford Univ. Press (1994); Samulski et al., 1989, J. Virol. 63:3822-3828; Kajigaya et al., 1991, Proc. Nat'l. Acad. Sci. USA 88:4646-4650; Ruffing et al., 1992, J. Virol. 66:6922-6930; Kimbauer et al., 1996, Virol. 219:37-44; Zhao et al., 2000, Virol. 272:382-393; and Samulski et al., U.S. Patent No. 6,204,059.

根据本发明的病毒衣壳可以使用本领域已知的任何方法产生，例如通过从杆状病毒表达(Brown等人,(1994)Virology 198:477-488)。作为进一步的备选方案，例如由Urabe等人,2002,Human Gene Therapy 13:1935-1943所述使用杆状病毒载体来递送rep/cap基因和rAAV模板，可以在昆虫细胞中产生本发明的病毒载体。The viral capsid according to the invention can be produced using any method known in the art, for example by expression from baculoviruses (Brown et al., (1994) Virology 198:477-488). As a further alternative, the viral vector of the invention can be produced in insect cells, for example, using a baculovirus vector to deliver the rep/cap gene and the rAAV template, as described by Urabe et al., 2002, Human Gene Therapy 13:1935-1943.

In another aspect, the present invention provides a method of rAAV production in insect cells, wherein a baculovirus packaging system or vectors carrying the AAV Rep and Cap coding regions may be constructed by engineering these genes into the polyhedrin coding region of a baculovirus vector and producing viral recombinants by transfection into a host cell. Notably, when baculovirus is used to produce AAV, the AAV DNA vector product is preferably a self-complementary AAV-like molecule obtained without mutating the AAV ITR. This appears to be a by-product of inefficient AAV Rep nicking in insect cells, which, owing to the lack of functional Rep enzyme activity, results in a self-complementary DNA molecule. The host cell is a baculovirus-infected cell, or has introduced into it additional nucleic acid encoding baculovirus helper functions, or otherwise includes these baculovirus helper functions. These baculoviruses can express the AAV components and subsequently facilitate the production of the capsids.

During production, the packaging cells generally include one or more viral vector functions along with helper functions and packaging functions sufficient to result in replication and packaging of the viral vector. These various functions may be supplied together or separately to the packaging cell using a genetic construct such as a plasmid or an amplicon, and they may exist extrachromosomally within the cell line or be integrated into the cell's chromosomes.

The cells may be supplied with any one or more of the stated functions already incorporated, e.g., a cell line with one or more vector functions incorporated extrachromosomally or integrated into the cell's chromosomal DNA, a cell line with one or more packaging functions incorporated extrachromosomally or integrated into the cell's chromosomal DNA, or a cell line with helper functions incorporated extrachromosomally or integrated into the cell's chromosomal DNA.

The rAAV vector may be purified by methods standard in the art, such as column chromatography or cesium chloride gradients. Methods for purifying rAAV vectors are known in the art and include the methods described in Clark et al., 1999, Human Gene Therapy 10(6):1031-1039; Schenpp and Clark, 2002, Methods Mol. Med. 69:427-443; U.S. Patent No. 6,566,118; and WO 98/09657.

Methods of Treatment

在某些实施方案中，提供了用于在需要此类治疗的受试者中治疗XLRP的方法，所述方法通过向该受试者施用治疗有效量的具有与SEQ ID NO:1的核苷酸序列具有至少90％、至少95％、至少98％同一性或100％同一性的核苷酸序列的核酸，或包含此类核酸和至少一种药学上可接受的赋形剂的药物组合物来进行。In some embodiments, a method is provided for treating XLRP in a subject requiring such treatment, the method being carried out by administering to the subject a therapeutically effective amount of a nucleic acid having a nucleotide sequence having at least 90%, at least 95%, at least 98%, or 100% identity with the nucleotide sequence of SEQ ID NO:1, or a pharmaceutical composition comprising such nucleic acid and at least one pharmaceutically acceptable excipient.
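The identity thresholds recited above (at least 90%, 95%, 98% or 100%) can be illustrated with a small calculation. The sketch below is an assumption for illustration only: it presumes the two sequences have already been aligned to equal length by some external tool, with `-` marking gaps, and it divides matches by the full alignment length. Other denominator conventions exist, and this passage does not state which one applies. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment (gaps written as '-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as a match only if both residues are equal and neither is a gap.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the alignment reaches the given identity threshold (e.g., 90.0)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, `meets_threshold(candidate, reference, 95.0)` would test the "at least 95%" tier for a candidate already aligned to the reference sequence.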

在相关方面，提供了用于治疗XLRP的包含与SEQ ID NO:1的核苷酸序列具有至少90％、至少95％、至少98％同一性或100％同一性的核苷酸序列的核酸。In a related aspect, a nucleic acid comprising a nucleotide sequence having at least 90%, at least 95%, at least 98%, or 100% identity with the nucleotide sequence of SEQ ID NO:1 is provided for the treatment of XLRP.

在另一些相关方面，提供了包含与SEQ ID NO:1的核苷酸序列具有至少90％、至少95％、至少98％同一性或100％同一性的核苷酸序列的核酸用于制造药物的用途。In other related aspects, the use of nucleic acids comprising a nucleotide sequence having at least 90%, at least 95%, at least 98%, or 100% identity with the nucleotide sequence of SEQ ID NO:1 for the manufacture of a pharmaceutical is provided.

在另一些相关方面，提供了包含与SEQ ID NO:1的核苷酸序列具有至少90％、至少95％、至少98％同一性或100％同一性的核苷酸序列的核酸用于制造用于治疗XLRP的药物的用途。In other related aspects, the use of nucleic acids comprising a nucleotide sequence having at least 90%, at least 95%, at least 98%, or 100% identity with the nucleotide sequence of SEQ ID NO:1 for the manufacture of a medicament for the treatment of XLRP is provided.

在一些方面，与SEQ ID NO:1的核苷酸序列具有至少90％、至少95％、至少98％同一性或100％同一性的核苷酸序列可操作地连接至表达控制序列。在一些实施方案中，SEQID NO:1的核苷酸序列可操作地连接至人G蛋白偶联受体视紫红质激酶1(hGRK)启动子。在一些优选实施方案中，hGRK启动子具有SEQ ID NO:4的序列。In some aspects, a nucleotide sequence having at least 90%, at least 95%, at least 98%, or 100% identity with the nucleotide sequence of SEQ ID NO:1 is operatively linked to an expression control sequence. In some embodiments, the nucleotide sequence of SEQ ID NO:1 is operatively linked to a human G protein-coupled receptor rhodopsin kinase 1 (hGRK) promoter. In some preferred embodiments, the hGRK promoter has the sequence of SEQ ID NO:4.

In some embodiments, the nucleotide sequence having at least 90%, at least 95%, at least 98%, or 100% identity to the nucleotide sequence of SEQ ID NO:1 forms part of an expression cassette. In some aspects, the expression cassette comprises, from 5' to 3': (a) an AAV2 terminal repeat, (b) an hGRK promoter, (c) the codon-optimized RPGRorf15 gene of SEQ ID NO:1, (d) an SV40 polyadenylation sequence, and (e) an AAV2 terminal repeat. In preferred embodiments, the 5' AAV2 terminal repeat has the nucleotide sequence set forth as SEQ ID NO:6 and/or the hGRK promoter has the nucleotide sequence set forth as SEQ ID NO:4 and/or the SV40 polyadenylation sequence has the nucleotide sequence set forth as SEQ ID NO:8 and/or the 3' AAV2 terminal repeat has the nucleotide sequence set forth as SEQ ID NO:7. In a particularly preferred embodiment, the expression cassette comprises a nucleic acid comprising the nucleotide sequence of SEQ ID NO:5 or a sequence having at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity thereto.
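The 5'-to-3' element order of the expression cassette described above can be written down as a simple ordered record. This is an illustrative sketch only: the class name, field names, and the helper `is_expected_order` are hypothetical, and the SEQ ID NO assignments are those of the preferred embodiments recited in the text.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class CassetteElement:
    name: str
    seq_id_no: int  # SEQ ID NO given for the preferred embodiment


# Elements (a) through (e), listed 5' to 3'.
EXPECTED_ORDER = (
    CassetteElement("5' AAV2 terminal repeat", 6),
    CassetteElement("hGRK promoter", 4),
    CassetteElement("codon-optimized RPGRorf15 gene", 1),
    CassetteElement("SV40 polyadenylation sequence", 8),
    CassetteElement("3' AAV2 terminal repeat", 7),
)


def is_expected_order(elements: Iterable[CassetteElement]) -> bool:
    """Check that a candidate cassette lists the same elements in the same 5'-to-3' order."""
    return tuple(elements) == EXPECTED_ORDER


for position, element in enumerate(EXPECTED_ORDER, start=1):
    print(f"{position}. {element.name} (SEQ ID NO:{element.seq_id_no})")
```

According to the text, the complete cassette in the particularly preferred embodiment corresponds to SEQ ID NO:5.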

In other embodiments, a method is provided for treating XLRP in a subject in need of such treatment by administering to the subject a therapeutically effective amount of a recombinant AAV (rAAV) virion, or a pharmaceutical composition comprising the same, the rAAV virion comprising (i) a nucleic acid having a nucleotide sequence with at least 90%, at least 95%, at least 98%, or 100% identity to the nucleotide sequence of SEQ ID NO:1, operably linked to an expression control sequence, and (ii) an AAV capsid.

在相关实施方案中，提供了重组AAV(rAAV)病毒粒子用于治疗XLRP的用途，所述rAAV病毒粒子包含(i)可操作地连接至表达控制序列的核酸，所述核酸具有与SEQ ID NO:1的核苷酸序列具有至少90％、至少95％、至少98％同一性或100％同一性的核苷酸序列，和(ii)AAV衣壳。In related embodiments, the use of recombinant AAV (rAAV) viral particles for the treatment of XLRP is provided, the rAAV viral particles comprising (i) a nucleic acid operatively linked to an expression control sequence having a nucleotide sequence having at least 90%, at least 95%, at least 98%, or 100% identity with the nucleotide sequence of SEQ ID NO:1, and (ii) an AAV capsid.

在另一些相关实施方案中，提供了重组AAV(rAAV)病毒粒子用于制造用于治疗XLRP的药物的用途，所述rAAV病毒粒子包含(i)可操作地连接至表达控制序列的核酸，所述核酸具有与SEQ ID NO:1的核苷酸序列具有至少90％、至少95％、至少98％同一性或100％同一性的核苷酸序列，和(ii)AAV衣壳。In other related embodiments, the use of recombinant AAV (rAAV) viral particles for manufacturing a medicament for treating XLRP is provided, said rAAV viral particles comprising (i) a nucleic acid operatively linked to an expression control sequence having a nucleotide sequence having at least 90%, at least 95%, at least 98%, or 100% identity with the nucleotide sequence of SEQ ID NO:1, and (ii) an AAV capsid.

在一些实施方案中，rAAV病毒粒子包含天然AAV2、AAV4、AAV5或AAV8衣壳。在另一些实施方案中，rAAV病毒粒子包含变体AAV衣壳，其相对于AAV2、AAV4、AAV5或AAV8包含一个或多个修饰。在一个优选实施方案中，AAV衣壳包含含有SEQ ID NO:9的序列的衣壳蛋白。In some embodiments, the rAAV viral particle comprises a natural AAV2, AAV4, AAV5, or AAV8 capsid. In other embodiments, the rAAV viral particle comprises a variant AAV capsid that includes one or more modifications relative to AAV2, AAV4, AAV5, or AAV8. In a preferred embodiment, the AAV capsid comprises a capsid protein containing the sequence SEQ ID NO:9.

In some embodiments, the rAAV virion comprises (i) a native AAV2 capsid or a variant thereof, and (ii) an expression cassette comprising, from 5' to 3': (a) an AAV2 terminal repeat, (b) an hGRK promoter, (c) the codon-optimized RPGRorf15 gene of SEQ ID NO:1, (d) an SV40 polyadenylation sequence, and (e) an AAV2 terminal repeat. In preferred embodiments, the rAAV comprises (i) a capsid comprising the capsid protein of SEQ ID NO:9, and (ii) a nucleic acid comprising the 5' AAV2 terminal repeat of SEQ ID NO:6, the hGRK promoter of SEQ ID NO:4, the SV40 polyadenylation sequence of SEQ ID NO:8, and the 3' AAV2 terminal repeat of SEQ ID NO:7. In a particularly preferred embodiment, the rAAV comprises (i) a capsid comprising the capsid protein of SEQ ID NO:9, and (ii) an expression cassette comprising the nucleotide sequence of SEQ ID NO:5.

In particularly preferred embodiments, the use of an rAAV in the treatment of XLRP, or in the manufacture of a medicament for the treatment of XLRP, is provided, wherein the rAAV comprises (i) a nucleic acid comprising the nucleotide sequence of SEQ ID NO:5, and (ii) a capsid comprising a capsid protein having the amino acid sequence of SEQ ID NO:9. In some aspects, the rAAV is administered by intravitreal injection.

在另一些特别优选的实施方案中，提供了治疗XLRP的方法，其包括向受试者施用有效量的rAAV，所述rAAV包含(i)含有SEQ ID NO:5的核苷酸序列的核酸，和(ii)含有具有SEQ ID NO:9的氨基酸序列的衣壳蛋白的衣壳。在一些方面，rAAV通过玻璃体内注射施用于受试者。In some other particularly preferred embodiments, a method for treating XLRP is provided, comprising administering to a subject an effective amount of rAAV, said rAAV comprising (i) a nucleic acid having the nucleotide sequence of SEQ ID NO:5, and (ii) a capsid having a capsid protein having the amino acid sequence of SEQ ID NO:9. In some aspects, rAAV is administered to the subject via intravitreal injection.

在另一些方面，提供了药物组合物，其包含任选可操作地连接至表达控制序列的核酸，和至少一种药学上可接受的赋形剂，所述核酸具有与SEQ ID NO:1的核苷酸序列具有至少90％、至少95％、至少98％同一性或100％同一性的核苷酸序列。In other aspects, pharmaceutical compositions are provided comprising a nucleic acid optionally operably linked to an expression control sequence, and at least one pharmaceutically acceptable excipient, said nucleic acid having a nucleotide sequence having at least 90%, at least 95%, at least 98%, or 100% identity with the nucleotide sequence of SEQ ID NO:1.

在一些实施方案中，药物组合物包含可操作地连接至组成型启动子(优选具有与SEQ ID NO:4的核苷酸序列具有至少90％、至少95％至少98％同一性或100％同一性的序列的hGRK启动子)的核酸，所述核酸包含SEQ ID NO:1的核苷酸序列。In some embodiments, the pharmaceutical composition comprises a nucleic acid operatively linked to a constitutive promoter (preferably an hGRK promoter having a sequence having at least 90%, at least 95%, at least 98%, or 100% identity with the nucleotide sequence of SEQ ID NO:4), said nucleic acid comprising the nucleotide sequence of SEQ ID NO:1.

In other aspects, pharmaceutical compositions are provided comprising at least one pharmaceutically acceptable excipient and an infectious rAAV, the infectious rAAV comprising (i) an AAV capsid and (ii) a nucleic acid comprising, from 5' to 3': (a) an AAV2 terminal repeat, (b) an hGRK promoter, (c) the codon-optimized RPGRorf15 gene of SEQ ID NO:1, (d) an SV40 polyadenylation sequence, and (e) an AAV2 terminal repeat. In related embodiments, the pharmaceutical composition comprises 10⁹ to 10¹⁴ vg, preferably 10¹⁰ to 10¹³ vg of the rAAV, more preferably 3×10¹¹ vg or 1×10¹² vg of the rAAV.

In preferred embodiments, the pharmaceutical composition comprises an rAAV comprising (i) a capsid comprising a capsid protein comprising or consisting of the sequence of SEQ ID NO:9, and (ii) a nucleic acid comprising the 5' AAV2 terminal repeat of SEQ ID NO:6 and/or the hGRK promoter of SEQ ID NO:4 and/or the SV40 polyadenylation sequence of SEQ ID NO:8 and/or the AAV2 terminal repeat of SEQ ID NO:7. In related embodiments, the pharmaceutical composition comprises 10⁹ vg to 10¹⁴ vg, preferably 10¹⁰ vg to 10¹³ vg of the rAAV, more preferably about 3×10¹¹ vg or about 1×10¹² vg of the rAAV.
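The nested vector-genome ranges recited above can be expressed as a simple range check. This is a minimal sketch for orientation, not a dosing tool: the function name `dose_tier` and the tier labels are hypothetical, and the boundaries are simply the 10⁹ to 10¹⁴ vg and 10¹⁰ to 10¹³ vg ranges from the text, treated here as inclusive.

```python
BROAD_RANGE_VG = (1e9, 1e14)       # "10^9 vg to 10^14 vg"
PREFERRED_RANGE_VG = (1e10, 1e13)  # "preferably 10^10 vg to 10^13 vg"


def dose_tier(vg: float) -> str:
    """Classify a vector-genome amount against the recited ranges (bounds inclusive)."""
    if PREFERRED_RANGE_VG[0] <= vg <= PREFERRED_RANGE_VG[1]:
        return "preferred"
    if BROAD_RANGE_VG[0] <= vg <= BROAD_RANGE_VG[1]:
        return "broad"
    return "outside"


# The two specifically named amounts, about 3e11 vg and about 1e12 vg.
for named_dose in (3e11, 1e12):
    print(f"{named_dose:.0e} vg -> {dose_tier(named_dose)}")
```

Both specifically named amounts fall inside the preferred range, which in turn lies inside the broad range.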

In some embodiments, a method is provided for expressing RPGR in one or more photoreceptor cells of a human subject, comprising administering to the human subject an effective amount of an infectious rAAV as described herein, wherein the RPGR is expressed in the one or more photoreceptor cells. In some preferred embodiments, the effective amount of the infectious rAAV is 10⁹ to 10¹⁴ vg/eye and/or a single dose of the rAAV is administered intravitreally (bilaterally or unilaterally) to the human subject and/or the rAAV comprises a capsid of SEQ ID NO:9 and/or the rAAV comprises a heterologous nucleic acid comprising the nucleotide sequence of SEQ ID NO:5.

In a particularly preferred embodiment, a pharmaceutical composition is provided comprising at least one pharmaceutically acceptable excipient and an infectious rAAV, the infectious rAAV comprising (i) a capsid comprising a capsid protein comprising or consisting of the sequence of SEQ ID NO:9, and (ii) a nucleic acid comprising or consisting of the nucleotide sequence of SEQ ID NO:5. In related embodiments, the pharmaceutical composition comprises 10¹⁰ vg to 10¹³ vg of the rAAV, preferably about 3×10¹¹ vg or about 1×10¹² vg of the rAAV.

In some embodiments, the nucleic acid or infectious rAAV described herein is administered to a human with XLRP by periocular or intraocular (intravitreal, suprachoroidal, or subretinal) injection, thereby treating XLRP in the subject. In other embodiments, the nucleic acid or infectious rAAV described herein is administered subretinally or intravitreally to a human with XLRP, thereby treating XLRP in the subject. In a preferred embodiment, a single intravitreal injection (bilateral or unilateral) of the rAAV described herein is administered to a human subject with XLRP.

In related aspects, treating XLRP in a treated subject comprises, for example at 6, 12, or 24 months after treatment, (i) an improvement (i.e., gain) in visual function or functional vision relative to a control (e.g., relative to the treated patient's pre-treatment baseline measurements, relative to the untreated eye if the nucleic acid or rAAV was administered unilaterally, or relative to a concurrent or historical control group of untreated XLRP patients) and/or (ii) a reduction in loss of visual function and/or retinal degeneration in the treated eye compared to a control (e.g., the untreated eye of the same patient or an untreated control group). These improvements can be assessed by appropriate ophthalmic tests, including but not limited to visual acuity testing, microperimetry and other visual field testing, anatomical tests such as optical coherence tomography and fundus autofluorescence imaging, retinal electrophysiology, and/or quality of life (QoL) assessments.

In some aspects, the effective amount of the nucleic acid or rAAV described herein (or of a pharmaceutical composition comprising it) is an amount effective to treat XLRP in a human patient. In related aspects, the effective amount of the rAAV described herein is 10⁹ to 10¹⁴ rAAV particles (or vector genomes (vg)) per eye, preferably 10¹⁰ to 10¹³ vg/eye, or 1 × 10¹¹ vg/eye to 5 × 10¹² vg/eye, and more preferably about 3 × 10¹¹ vg/eye or about 1 × 10¹² vg/eye. In some preferred embodiments, a single dose of about 3 × 10¹¹ vg/eye or about 1 × 10¹² vg/eye is administered intravitreally to a human patient with XLRP, thereby treating XLRP.

Examples

The following examples illustrate preferred embodiments of the invention and are not intended to limit the scope of the invention in any way. While the invention has been described with reference to its preferred embodiments, various modifications will be apparent to those skilled in the art upon reading this application.

Example 1 – Codon optimization of the RPGRorf15 cDNA sequence with improved stability

The human retinitis pigmentosa GTPase regulator open reading frame 15 (hRPGRorf15) sequence contains highly repetitive, purine-rich regions, which cause sequence instability during transgene cassette cloning and plasmid amplification. The hRPGRorf15 cDNA sequence (NCBI reference sequence NM_001034853.1) was codon-optimized to generate an RPGRorf15 cDNA sequence with increased expression in human cells and improved sequence stability.

The codon-optimized nucleotide sequence is shown below:

ATGAGAGAGCCTGAAGAGCTGATGCCTGATAGCGGAGCAGTGTTTACCTTTGGGAAGA

GCAAGTTCGCAGAGAATAACCCTGGGAAATTCTGGTTTAAGAACGACGTGCCCGTGCAC

CTGAGCTGTGGCGATGAGCACTCCGCCGTGGTGACAGGCAACAATAAGCTGTACATGTT

CGGCTCTAACAATTGGGGACAGCTGGGCCTGGGAAGCAAGTCCGCCATCAGCAAGCCA

ACCTGCGTGAAGGCCCTGAAGCCCGAGAAGGTGAAGCTGGCCGCCTGTGGCAGAAAC

CACACACTGGTGAGCACCGAGGGAGGAAACGTGTACGCAACAGGAGGCAACAATGAA

GGCCAGCTGGGCCTGGGCGACACAGAGGAGAGGAATACCTTTCACGTGATCAGCTTCTT

TACCTCCGAGCACAAGATCAAGCAGCTGTCCGCCGGCTCTAACACAAGCGCCGCCCTG

ACCGAGGACGGCCGCCTGTTCATGTGGGGCGATAATAGCGAGGGCCAGATCGGCCTGA

AGAACGTGTCCAACGTGTGCGTGCCTCAGCAGGTGACCATCGGCAAGCCAGTGTCCTG

GATCTCTTGTGGCTACTATCACAGCGCCTTCGTGACCACAGATGGCGAGCTGTACGTGTT

TGGAGAGCCAGAGAACGGCAAGCTGGGCCTGCCTAACCAGCTGCTGGGCAATCACCGG

ACACCCCAGCTGGTGTCCGAGATCCCTGAGAAAGTGATCCAGGTGGCATGCGGAGGAG

AGCACACAGTGGTGCTGACCGAGAATGCCGTGTATACCTTCGGCCTGGGACAGTTTGGA

CAGCTGGGCCTGGGCACATTCCTGTTTGAGACAAGCGAGCCAAAAGTGATCGAGAACA

TCCGCGACCAGACAATCAGCTACATCTCCTGCGGCGAGAATCACACAGCCCTGATCACC

GACATCGGCCTGATGTATACCTTTGGCGATGGCCGGCACGGCAAGCTGGGCCTGGGCCT

GGAGAACTTCACAAATCACTTTATCCCCACCCTGTGCTCTAACTTCCTGCGGTTCATCGT

GAAGCTGGTGGCCTGCGGCGGCTGTCACATGGTGGTGTTCGCAGCACCTCACAGGGGA

GTGGCCAAGGAGATCGAGTTTGACGAGATCAACGATACATGCCTGTCCGTGGCCACCTT

CCTGCCATACAGCTCCCTGACATCCGGCAATGTGCTGCAGCGCACCCTGTCTGCCAGGA

TGCGGAGAAGGGAGAGGGAGCGGTCCCCTGACTCTTTCAGCATGAGGCGGACACTGCC

ACCTATCGAGGGCACCCTGGGCCTGTCTGCCTGCTTCCTGCCTAACAGCGTGTTCCCAA

GATGTAGCGAGAGGAATCTGCAGGAGTCTGTGCTGAGCGAGCAGGATCTGATGCAGCC

AGAGGAGCCCGACTACCTGCTGGATGAGATGACAAAGGAGGCCGAGATCGACAACTCT

AGCACCGTGGAGAGCCTGGGCGAGACAACAGATATCCTGAATATGACACACATCATGTC

CCTGAACTCTAATGAGAAGTCTCTGAAGCTGAGCCCAGTGCAGAAGCAGAAGAAGCAG

CAGACCATCGGCGAGCTGACCCAGGACACAGCCCTGACCGAGAACGACGATTCTGATG

AGTATGAGGAGATGAGCGAGATGAAGGAGGGCAAGGCCTGTAAGCAGCACGTGTCCCA

GGGCATCTTCATGACCCAGCCAGCCACCACAATCGAGGCCTTTTCTGACGAAGAGGTGG

AGATCCCCGAGGAGAAGGAGGGCGCCGAGGATAGCAAGGGCAATGGCATCGAGGAGC

AGGAGGTGGAGGCCAACGAGGAGAATGTGAAGGTGCACGGCGGCAGAAAGGAGAAG

ACAGAGATCCTGTCCGACGATCTGACCGACAAGGCCGAGGTGTCCGAGGGCAAGGCCA

AGTCTGTGGGAGAGGCAGAGGACGGACCAGAGGGACGCGGCGATGGAACCTGCGAGG

AGGGATCCTCTGGAGCAGAGCACTGGCAGGACGAAGAAAGAGAGAAGGGCGAGAAGG

ATAAGGGCAGAGGAGAGATGGAGAGGCCTGGAGAGGGAGAGAAGGAGCTGGCAGAGA

AGGAGGAGTGGAAGAAGAGGGACGGCGAGGAGCAGGAGCAGAAGGAGAGAGAGCAG

GGCCACCAGAAGGAGAGGAACCAGGAGATGGAGGAGGGAGGAGAGGAGGAGCACGG

CGAGGGAGAGGAGGAGGAGGGCGATAGAGAGGAAGAAGAGGAGAAGGAGGGAGAGG

GCAAGGAGGAAGGCGAGGGAGAGGAGGTGGAGGGAGAAAGGGAGAAGGAGGAGGG

AGAGCGCAAGAAGGAAGAAAGAGCAGGCAAGGAAGAGAAGGGAGAGGAGGAGGGC

GATCAGGGCGAAGGAGAGGAGGAGGAGACAGAGGGAAGGGGAGAGGAGAAGGAGGA

GGGAGGAGAGGTCGAAGGAGGAGAAGTGGAGGAGGGCAAGGGCGAAAGAGAAGAGG

AGGAGGAGGAAGGCGAGGGCGAAGAAGAGGAGGGCGAGGGCGAGGAAGAAGAGGG

CGAGGGCGAAGAGGAAGAAGGCGAGGGCAAGGGCGAGGAGGAGGGCGAAGAAGGCG

AAGGGGAGGAGGAGGGCGAAGAGGGAGAGGGCGAGGGCGAGGAGGAAGAAGGCGA

AGGCGAAGGCGAAGAAGAAGGAGAAGGAGAGGGCGAAGAGGAGGAAGGCGAAGGA

GAAGGAGAGGAGGAAGGAGAAGGGGAGGGCGAAGAGGAGGAGGGAGAAGGCAAGG

GAGAAGAAGAAGGCGAAGAAGGCGAGGGAGAAGGCGAGGAAGAAGAAGGCGAGGG

AGAGGGAGAGGACGGCGAAGGCGAGGGCGAGGAAGAGGAAGGAGAGTGGGAGGGCG

AGGAAGAGGAGGGAGAAGGAGAAGGCGAAGAAGAAGGGGAAGGAGAGGGCGAGGA

AGGAGAAGGCGAAGGCGAAGAGGAGGAGGGGGAAGGGGAGGGCGAGGAGGAAGAG

GGAGAAGAGGAAGGCGAAGAAGAGGGAGAAGGCGAAGAGGAAGGAGAAGGCGAGG

GAGAAGAAGAGGAGGAGGGCGAGGTCGAAGGCGAGGTGGAGGGCGAAGAGGGGGAA

GGCGAAGGCGAGGAGGAGGAAGGGGAAGAAGAAGGCGAGGAGAGAGAGAAAGAAG

GCGAGGGCGAGGAGAACAGAAGGAATCGCGAAGAAGAAGAGGAAGAAGAGGGCAAG

TACCAGGAGACAGGCGAGGAGGAGAACGAGCGGCAGGATGGCGAGGAGTATAAGAAG

GTGTCCAAGATCAAGGGCTCTGTGAAGTACGGCAAGCACAAGACCTATCAGAAGAAGA

GCGTGACCAACACACAGGGCAATGGCAAGGAGCAGCGCAGCAAGATGCCTGTGCAGTC

CAAGCGGCTGCTGAAGAATGGCCCCTCTGGGAGCAAGAAGTTTTGGAATAATGTCCTGCCACACTACCTGGAGCTGAAATGA (SEQ ID NO:10).

AAV plasmids containing the codon-optimized hRPGRorf15 gene (SEQ ID NO:10) under the control of either the human G protein-coupled receptor kinase 1 promoter (also known as the human rhodopsin kinase promoter (hGRK)) or the ubiquitous 3-phosphoglycerate kinase (PGK) promoter were constructed by GenScript.

AAV plasmid DNA (20 ng) was used to transform competent E. coli (Cat. #C3040H, New England BioLabs, Ipswich, MA), and the cells were plated on 50 μg/ml kanamycin plates (#L1025, Teknova, Hollister, CA). Miniprep cultures were grown from the resulting colonies, and DNA was prepared with the GeneJET Plasmid Miniprep Kit (Cat. #0503, ThermoFisher, Waltham, MA) and subjected to restriction digestion to identify positive clones.

Despite codon optimization, restriction digestion still revealed sequence instability of the codon-optimized hRPGRorf15 (SEQ ID NO:10) during plasmid production.
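Earlier in this example, the instability is attributed to highly repetitive, purine-rich regions. As a rough illustration only, and not a method described in this application, such regions could be flagged with a sliding-window purine count. The function names, the 30-nt window, and the 0.9 threshold below are hypothetical choices.

```python
def purine_fraction(seq: str) -> float:
    """Fraction of bases in seq that are purines (A or G)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "AG" for base in seq) / len(seq)


def purine_rich_windows(seq: str, window: int = 30, threshold: float = 0.9):
    """Return (start, fraction) for each window whose purine fraction >= threshold."""
    hits = []
    # The range is empty when the sequence is shorter than the window.
    for start in range(0, len(seq) - window + 1):
        fraction = purine_fraction(seq[start:start + window])
        if fraction >= threshold:
            hits.append((start, fraction))
    return hits


if __name__ == "__main__":
    # A 55-nt stretch copied from the SEQ ID NO:10 listing (sample input only).
    fragment = "AGGAGGAGGAAGGCGAGGGCGAAGAAGAGGAGGGCGAGGGCGAGGAAGAAGAGGG"
    print(f"overall purine fraction: {purine_fraction(fragment):.2f}")
    print(f"windows flagged: {len(purine_rich_windows(fragment))}")
```

A scan of this kind only locates purine-dense stretches; it does not by itself predict whether a plasmid will be stable in E. coli.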

A second codon-optimized hRPGRorf15 sequence was developed using a different optimization algorithm whose parameters include, but are not limited to, codon usage bias, GC content, AT-rich or GC-rich regions, mRNA secondary structure, RNA instability motifs, cryptic splice sites, internal chi sites and ribosome binding sites, and repeat sequences. The codon usage bias for human was changed by raising the codon adaptation index (CAI) to 0.89. The average GC content was optimized from 59.16 in the native sequence to 57 in the optimized sequence to prolong the half-life of the mRNA. The resulting codon-optimized nucleotide sequence (shown herein as SEQ ID NO:1) has improved codon usage, altered GC content, better mRNA stability, and modified negative cis-acting elements.
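The two sequence-level metrics named above can be made concrete. The sketch below computes GC content as a percentage and a CAI as the geometric mean of per-codon relative-adaptiveness weights, which is the usual definition of the index. It is not the optimization algorithm used in this example. The function names are assumptions, and the `weights` table is a toy placeholder; a real analysis would derive it from a human codon-usage reference set.

```python
import math


def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def codon_adaptation_index(seq: str, weights: dict) -> float:
    """Geometric mean of the relative-adaptiveness weight of each scorable codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    scored = [weights[c] for c in codons if c in weights]
    if not scored:
        raise ValueError("no codon in the sequence has a weight")
    return math.exp(sum(math.log(w) for w in scored) / len(scored))


if __name__ == "__main__":
    # First 30 nt of the SEQ ID NO:10 listing, used only as sample input.
    fragment = "ATGAGAGAGCCTGAAGAGCTGATGCCTGAT"
    print(f"GC content: {gc_content(fragment):.1f}")
    # Hypothetical weights for two lysine codons, for illustration only.
    toy_weights = {"AAA": 1.0, "AAG": 0.25}
    print(f"toy CAI: {codon_adaptation_index('AAAAAG', toy_weights):.2f}")
```

Codons missing from the weight table are skipped rather than scored as zero, so the result depends on how complete the supplied table is.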

An AAV plasmid (pAAV-GRK promoter-cohRPGRorf15-SV40) was constructed comprising the nucleotide sequence of SEQ ID NO:5, which comprises (i) the 5' AAV2 ITR (SEQ ID NO:6); (ii) the codon-optimized hRPGRorf15 cDNA (SEQ ID NO:1) under the control of the hGRK promoter (SEQ ID NO:4); (iii) the SV40 late poly-A element (SEQ ID NO:8); and (iv) the 3' AAV2 ITR (SEQ ID NO:7).

pAAV-GRK promoter-cohRPGRorf15-SV40 DNA was prepared as follows. Plasmid DNA from GenScript (20 ng) was used to transform competent E. coli (Cat. #C3040H, New England BioLabs, Ipswich, MA), and the cells were plated on 50 μg/ml kanamycin plates (#L1025, Teknova, Hollister, CA). Miniprep cultures were grown from the resulting colonies, and DNA was prepared with the GeneJET Plasmid Miniprep Kit (Cat. #0503, ThermoFisher, Waltham, MA) and subjected to restriction digestion to identify positive clones. A culture in 50 mL of Terrific Broth was grown from one positive clone, and DNA was prepared with the Qiagen EndoFree Plasmid Maxi Kit (Cat. #12362, Qiagen, Hilden, Germany). The maxiprep of pAAV-GRK-cohRPGRorf15-SV40 was digested with several restriction enzymes to verify the identity of the plasmid. Gel electrophoresis of the restriction digests, together with the expected fragments, is shown in Figure 1. All observed fragments matched the expected fragments. The sequence of the expression cassette was verified by Sanger DNA sequencing.

Conclusion: The maxiprep of pAAV-GRK-cohRPGRorf15-SV40 mapped correctly by restriction digestion, and its integrity was verified by Sanger DNA sequencing. The codon-optimized hRPGRorf15 sequence of SEQ ID NO:1 therefore exhibits superior stability relative to the native sequence of SEQ ID NO:3 and the codon-optimized sequence of SEQ ID NO:10.

Example 2 – Expression and activity of human RPGRorf15 protein expressed from the codon-optimized hRPGRorf15 of SEQ ID NO:1

The expression and activity of human RPGRorf15 protein expressed from pAAV-GRK-cohRPGRorf15-SV40 were evaluated in transfected HEK293T cells.

Briefly, HEK293T cells were seeded at 2.0 × 10⁵ cells/well in 1.0 mL of DMEM/10% FBS medium in 12-well plates. HEK293T cells were used because of their high transfectability and protein expression. The next day, 1.0 μg of AAV plasmid DNA complexed with 3.0 μl of FuGene6 (Cat. #E2691, Promega, Madison, WI) was added to the cells in duplicate wells. Two days after transfection, the cells were washed with PBS and lysed in 0.25 mL of 1× Passive Lysis Buffer (Promega) containing 1× Halt protease inhibitor (ThermoFisher), with shaking for 15 minutes at room temperature. Cell debris was pelleted by centrifugation in a microcentrifuge at 12,000 g for 10 minutes at 4°C. The supernatant was collected and stored at -20°C. A no-plasmid sample and a pAAV-PGK promoter-cohRPGRorf15-SV40 sample were included in the transfection as negative and positive controls, respectively. pAAV-PGK promoter-cohRPGRorf15-SV40 is identical to the AAV vector described above, except that the codon-optimized hRPGRorf15 is operably linked to the ubiquitous 3-phosphoglycerate kinase (PGK) promoter instead of the hGRK promoter.

Cell lysate (20 μl) was mixed with 10 μl of 4× LDS, 4 μl of 10× reducing agent, and 6 μl of water (final volume = 40 μl) and denatured at 70°C for 10 minutes. The samples were loaded onto a 12-well Bolt 4-12% Bis-Tris Plus polyacrylamide gel (Invitrogen, NW04122BOX) and run at 200 V for 32 minutes in 1× MOPS buffer. The separated proteins were transferred to a nitrocellulose membrane for 10 minutes with an iBlot 2 device (ThermoFisher) and probed, using an iBind Flex device (ThermoFisher), with primary anti-RPGR antibodies (Sigma HPA001593 at 1:2000 and GenScript CT-15 U1729DC260_16 at 1:500) and the anti-polyglutamylation antibody GT335 (AG-20B-0020 at 1:500, Adipogen, San Diego, CA). The secondary antibodies were HRP-conjugated goat anti-rabbit (ThermoFisher 31460) for the anti-RPGR primary antibodies and HRP-conjugated goat anti-mouse (ThermoFisher 31430) for the anti-polyglutamylation primary antibody. Proteins were visualized with SuperSignal West Dura chemiluminescent substrate (ThermoFisher 34076) and imaged on a ChemiDoc MP (BioRad, Hercules, CA). All antibodies used are listed in Table 3 below.

Table 3: Western blot antibodies

Antibody | Host species | Vendor | Catalog # | Dilution
Anti-RPGR polyclonal | Rabbit | Sigma | HPA001593 | 1:2,000
Anti-CT-15 | Rabbit | GenScript | U1729DC260_16 | 1:500
Anti-polyglutamylation GT335 | Mouse | Adipogen | AG-20B-0020 | 1:500
HRP anti-rabbit IgG (H+L) | Goat | Thermo | 31460 | 1:5,000
HRP anti-mouse IgG (H+L) | Goat | Thermo | 31430 | 1:5,000
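The gel-loading mix described before Table 3 (20 μl lysate, 10 μl 4× LDS, 4 μl 10× reducing agent, 6 μl water) can be checked with simple dilution arithmetic. The sketch below is only a worked check of those stated volumes; the helper name is hypothetical.

```python
def final_concentration(stock_x: float, stock_volume_ul: float, total_volume_ul: float) -> float:
    """Final concentration (in 'x' units) after diluting a stock into the total volume."""
    return stock_x * stock_volume_ul / total_volume_ul


# Volumes taken from the sample preparation step, in microliters.
components_ul = {
    "lysate": 20.0,
    "4x LDS": 10.0,
    "10x reducing agent": 4.0,
    "water": 6.0,
}
total_ul = sum(components_ul.values())
lds_final = final_concentration(4.0, components_ul["4x LDS"], total_ul)
reducing_final = final_concentration(10.0, components_ul["10x reducing agent"], total_ul)

if __name__ == "__main__":
    print(f"total volume: {total_ul} ul")
    print(f"LDS final: {lds_final}x, reducing agent final: {reducing_final}x")
```

Both additives come out at 1× in the 40 μl final volume, and the lysate is diluted twofold.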

Figure 2 shows images of representative Western blots of lysates from transfected HEK293T cells. The CT-15 and Sigma antibodies detected the same 135-140 kD species, which appears to be RPGRorf15 because it is present in RPGR-transfected lysates but absent from untransfected lysates, has the correct size, and is recognized by the polyglutamylation-detecting antibody GT335. Expression was higher when driven by the ubiquitous PGK promoter (which is not preferentially active in photoreceptor cells).

Conclusion: Western blot analysis of lysates from transfected HEK293T cells demonstrated expression and polyglutamylation of hRPGRorf15 protein of the correct size, expressed from the codon-optimized hRPGRorf15 of SEQ ID NO:1.

Example 3 – Functional expression of hRPGRorf15 in a human in vitro model of XLRP

A human in vitro model system was generated to evaluate correction of the X-linked retinitis pigmentosa (XLRP) disease phenotype by a codon-optimized human RPGRorf15 nucleic acid having the nucleotide sequence of SEQ ID NO:1. For this purpose, an AAV vector was constructed comprising the nucleotide sequence of SEQ ID NO:1 driven by the human G protein-coupled receptor rhodopsin kinase 1 (hGRK) promoter (i.e., the AAV vector backbone described in Examples 1 and 2, having the sequence of SEQ ID NO:5) and a variant capsid protein having the amino acid sequence of SEQ ID NO:9. The hGRK promoter was selected to restrict expression of RPGRorf15 to photoreceptors.

Peripheral blood mononuclear cells (PBMCs) were isolated from whole blood drawn from individuals with XLRP and reprogrammed into induced pluripotent stem cells (iPSCs) using the CytoTune-iPS 2.0 Sendai Reprogramming Kit (Thermo Fisher Scientific, Waltham, MA). Pluripotency of the stem cells was confirmed by immunocytochemistry for iPSC markers, including Sox2, Oct4, and Nanog. The induced pluripotent stem cells were then differentiated into photoreceptors by the methods described in Gonzalez-Cordero et al., Stem Cell Reports, 9:820-837 (2017); Gonzalez-Cordero et al., Human Gene Therapy, 29(1) (2018); and Meyer et al., Stem Cells, 29(8):1206-1218 (2011). Photoreceptor differentiation was confirmed by immunocytochemistry for specific markers (recoverin and rhodopsin). The photoreceptors were confirmed to lack both hRPGRorf15 protein expression and glutamylation of the hRPGRorf15 protein, which is known to confer functionality.

Immunocytochemistry was performed as follows. Cells were fixed with 4% paraformaldehyde (PFA) (Santa Cruz Biotechnologies, Dallas, TX) at 4°C for 15 minutes. All antibody staining was performed in a PBS blocking solution containing 0.2% Triton-X100 (Sigma-Aldrich), 2% bovine serum albumin (Millipore Sigma, Burlington, MA), and 5% goat serum (Thermo Fisher Scientific). Primary antibodies were incubated overnight at 4°C. Cells were then incubated with secondary antibodies for one hour at room temperature and subsequently counterstained with DAPI (Sigma-Aldrich) in PBS for five minutes at room temperature. Cells were imaged with a Zeiss Axio Observer.D1 fluorescence microscope. Images were processed with Zeiss Zen 2 software (Carl Zeiss Microscopy LLC, White Plains, NY). A list of the primary and secondary antibodies is provided in Table 4 below:

Table 4

To assess transcript levels of the codon-optimized RPGRorf15 transgene after transduction into diseased XLRP-iPSC-derived photoreceptors, XLRP photoreceptors (PRs) were transduced with the AAV vector described above at a multiplicity of infection (MOI, viral genomes per cell) of 50,000 to ensure levels above the detection limit of the assay. RNA was isolated and cDNA was synthesized 30 days after transduction. Droplet digital PCR was run on the prepared samples, and transcript levels per droplet were analyzed as copies/mL values. The number of droplets (above a set threshold) containing transcript detected by each primer/probe set was quantified. Two primer/probe sets were created to specifically distinguish the codon-optimized human RPGRorf15 transgene from the endogenous human RPGR1-19 constitutive isoform (hRPGR1-19).
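Droplet digital PCR readouts of this kind are conventionally converted from positive-droplet counts to concentrations with a Poisson correction, because a positive droplet may hold more than one template copy. The sketch below illustrates that general calculation; the application does not state the analysis settings, so the function names and the 0.85 nL default droplet volume are assumptions.

```python
import math


def copies_per_droplet(positive_droplets: int, total_droplets: int) -> float:
    """Poisson-corrected mean number of target copies per droplet."""
    if total_droplets <= 0:
        raise ValueError("total_droplets must be positive")
    if not 0 <= positive_droplets < total_droplets:
        raise ValueError("positive_droplets must be in [0, total_droplets)")
    # lambda = -ln(fraction of negative droplets)
    return math.log(total_droplets / (total_droplets - positive_droplets))


def copies_per_ml(positive_droplets: int, total_droplets: int,
                  droplet_volume_nl: float = 0.85) -> float:
    """Target concentration in the partitioned reaction, in copies per mL."""
    droplet_volume_ml = droplet_volume_nl * 1e-6  # 1 nL = 1e-6 mL (assumed droplet size)
    return copies_per_droplet(positive_droplets, total_droplets) / droplet_volume_ml


if __name__ == "__main__":
    # Illustrative counts only, not data from this study.
    print(f"{copies_per_ml(5000, 10000):.3e} copies/mL")
```

A fold change between two targets, such as the transgene and hRPGR1-19, would then be the ratio of their concentrations measured in the same sample.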

As expected, untransduced XLRP diseased cells expressed low, background levels of cohRPGRorf15 transcript. After transduction with the AAV vector, the cells showed a more than 400-fold increase in cohRPGRorf15 transcript levels compared to hRPGR1-19. Compared to cohRPGRorf15 levels in untransduced cells, transduced cells showed a more than 1000-fold increase in cohRPGRorf15 transcript. Untransduced cells had higher levels of hRPGR1-19 than of cohRPGRorf15. See Figure 3. Analyses were performed in triplicate, and the levels were averaged. Transduction with an AAV vector comprising the codon-optimized hRPGRorf15 of SEQ ID NO:1 significantly increased cohRPGRorf15 transcript levels in photoreceptor cultures.

To assess protein levels of the codon-optimized human RPGRorf15 transgene produced by transducing XLRP-iPSC-derived photoreceptor cells with the AAV vector, diseased XLRP-iPSC-derived photoreceptors were transduced at an MOI of 50,000 vg/cell. Cell lysates were collected 30 days after transduction and subjected to SDS-PAGE and Western blot analysis to assess hRPGRorf15 protein levels. Band intensities were quantified and are plotted as a histogram in Figure 4. Transduction with the AAV vector caused a significant increase in human RPGRorf15 protein expression compared to untransduced cells.

To determine whether the cohRPGRorf15 protein introduced exogenously into photoreceptors is functional, glutamylation (a proxy for function) was examined. According to the published literature, glutamylation of hRPGRorf15 correlates strongly with protein function (Fischer et al., 2017; Rao et al., 2016; Sun et al., 2016). Diseased XLRP-iPSC-derived PRs were transduced at an MOI of 50,000 vg/cell. Cell lysates were collected 30 days after transduction and subjected to SDS-PAGE and Western blot analysis to assess glutamylation of the expressed hRPGRorf15 protein. Glutamylation was determined by probing the membrane with the glutamylation-specific antibody GT335 and examining the positive band pattern at 127 kDa, the size of hRPGRorf15. Band intensities were quantified and are plotted as a histogram in Figure 5. In diseased photoreceptors derived from two XLRP patients, transduction of PR cells with an AAV vector comprising the codon-optimized hRPGRorf15 nucleotide sequence resulted in a significant increase in glutamylation of the human RPGRorf15 protein compared to untransduced cells.

Because only low hRPGRorf15 protein levels were detected by Western blot even at a high MOI, the dose response of the codon-optimized hRPGRorf15 transgene (cohRPGRorf15) was verified. For this purpose, an AAV vector was constructed comprising the codon-optimized RPGRorf15 sequence of SEQ ID NO:1 operably linked to the ubiquitous 3-phosphoglycerate kinase (PGK) promoter, together with the capsid of SEQ ID NO:9 (apart from the promoter, this AAV vector is identical to the AAV vector described above). Diseased photoreceptors were transduced at three MOIs (5,000, 10,000, and 20,000). Cell lysates were collected 30 days after transduction and subjected to SDS-PAGE and Western blot analysis to assess hRPGRorf15 protein levels and glutamylation (GT335 = anti-glutamylation antibody). Band intensities were quantified and are plotted as histograms (Figure 6). Although variability was high because of the heterogeneity of the cultures, hRPGRorf15 protein and glutamylation of hRPGRorf15 were observed at the lower MOIs when a constitutive promoter was used to drive cohRPGRorf15 expression.

Conclusion: In vitro studies with iPSC-derived photoreceptors have shown that AAV-mediated delivery of the codon-optimized hRPGRorf15 of SEQ ID NO:1 restores human RPGRorf15 transcript and transgene expression in human XLRP diseased photoreceptors. Furthermore, the RPGRorf15 protein expressed after 4D-125 transduction is post-translationally glutamylated. Based on the published literature, glutamylation confers functionality on RPGRorf15.

Example 4 – Evaluation of the safety and biodistribution of a codon-optimized RPGRorf15 cDNA sequence delivered by R100 via intravitreal administration in non-human primates

Materials and methods

GLP toxicology and biodistribution studies

Male cynomolgus macaques (Macaca fascicularis) aged 2-14 years were dosed in each eye via two 50 μL intravitreal injections through the sclera, for a total dose volume of 100 μL/eye. Doses of 1 × 10¹¹ vg/eye and 1 × 10¹² vg/eye were evaluated. The animals were anesthetized with intramuscular (IM) ketamine and given a topical ophthalmic solution to prevent pain. Methylprednisolone (20-80 mg) was administered weekly by IM injection after dosing. Euthanasia was performed by trained veterinary staff at weeks 3, 13, and 26 after administration.
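The stated doses and injection volumes imply a vector concentration in the dosing material. The sketch below works through that arithmetic, assuming the full dose is split evenly across the two 50 μL injections; the function name is hypothetical, and the actual formulation titers are not given in this example.

```python
def required_titer_vg_per_ml(dose_vg_per_eye: float,
                             injections: int = 2,
                             volume_per_injection_ul: float = 50.0) -> float:
    """Vector concentration needed to deliver the dose in the stated total volume."""
    total_volume_ul = injections * volume_per_injection_ul
    # Convert vg per microliter to vg per milliliter.
    return dose_vg_per_eye * 1000.0 / total_volume_ul


if __name__ == "__main__":
    for dose in (1e11, 1e12):
        titer = required_titer_vg_per_ml(dose)
        print(f"{dose:.0e} vg/eye in 100 uL -> {titer:.0e} vg/mL")
```

Under that assumption, the two dose levels correspond to 1 × 10¹² vg/mL and 1 × 10¹³ vg/mL, with each 50 μL injection carrying half of the per-eye dose.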

The biodistribution of 4D-125 genomes (an rAAV comprising the capsid protein of SEQ ID NO:9 and a heterologous nucleic acid comprising the nucleotide sequence of SEQ ID NO:5) was evaluated in all major ocular compartments (retina, optic nerve, ciliary body, iris, trabecular meshwork) and major systemic organs (including the testes) using a validated, GLP-compliant qPCR assay. In tissues in which genomes were detected, transgene expression was evaluated with a qualified, GLP-compliant RT-qPCR assay.

The serial toxicology assessments performed in the study were: clinical ocular assessments (complete ophthalmic examination, including SD-OCT imaging and ERG), systemic assessments, clinical pathology, gross pathology, and microscopic pathology. Assays were validated to determine anti-capsid and anti-transgene antibody responses. An ELISpot assay was validated to detect cellular responses to the R100 capsid (comprising the variant capsid protein of SEQ ID NO:9) and to the expressed protein.

Neutralizing antibody assay

2v6.11 cells were plated at a density of 3×10⁴ cells/well 24 hours before infection. Before infection, an rAAV vector encoding firefly luciferase driven by the CAG promoter was incubated with individual serum samples at 37°C for 1 hour, and cells were then infected at a genomic MOI of 1,000. At 48 hours after infection, luciferase activity was assessed using the Luc-Screen Extended-Glow Luciferase Reporter Gene Assay System (Invitrogen) or the ONE-Glo Luciferase Assay System (Promega) and quantified using a BioTek Cytation 3 cell imaging multi-mode reader and Gen5 software.

Before enrollment in the study, non-human primate (NHP) sera were screened for the presence of neutralizing antibodies against R100. NHPs were included in the study when their samples produced less than 50% neutralization of AAV transduction at a 1:10 serum dilution.
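The inclusion criterion above can be illustrated with a minimal Python sketch. The text gives only the threshold (less than 50% neutralization at a 1:10 serum dilution); the percent-neutralization formula relative to a no-serum control, the function names, and the example luciferase readings are all assumptions made for illustration.

```python
# Illustrative sketch only: the percent-neutralization formula below is a common
# convention and an assumption here; the text gives only the <50% at 1:10 criterion.

NEUTRALIZATION_CUTOFF_PERCENT = 50.0  # from the text: include if below this at 1:10


def percent_neutralization(signal_with_serum: float, signal_no_serum: float) -> float:
    """Percent loss of luciferase signal relative to a no-serum control."""
    if signal_no_serum <= 0:
        raise ValueError("no-serum control signal must be positive")
    return 100.0 * (signal_no_serum - signal_with_serum) / signal_no_serum


def is_eligible(signal_with_serum: float, signal_no_serum: float) -> bool:
    """True when neutralization at the 1:10 dilution is below the 50% cutoff."""
    neutralization = percent_neutralization(signal_with_serum, signal_no_serum)
    return neutralization < NEUTRALIZATION_CUTOFF_PERCENT


if __name__ == "__main__":
    # Hypothetical relative light units (RLU), not data from the study.
    print(is_eligible(8000.0, 10000.0))  # 20% neutralization
    print(is_eligible(4000.0, 10000.0))  # 60% neutralization
```

An animal whose serum leaves most of the luciferase signal intact would pass the screen under this reading of the criterion; exactly 50% neutralization would not, because the text requires less than 50%.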

AAV manufacturing

Recombinant R100 viral vectors were produced by transient transfection of HEK293 cells. Cells were cultured in DMEM supplemented with FBS and maintained at 37°C in a 5% CO₂ environment. Cells were triple-transfected (payload, capsid, and helper plasmids) using polyethylenimine (PEI). At 48-96 hours after transfection, viral particles were harvested from the cells and/or supernatant, and the cells were lysed by microfluidization. The cell lysate and/or supernatant was treated enzymatically to degrade plasmid and host-cell DNA, then clarified and concentrated by tangential flow filtration (TFF). The TFF retentate was then loaded onto an affinity resin column for purification. After pH-gradient elution, the post-affinity material was buffer-exchanged and then further purified by anion-exchange chromatography (if needed). The purified rAAV was then formulated in DPBS containing 0.001% polysorbate-20, sterile-filtered, and filled to yield the rAAV drug product.

Results

4D-125 delivery is safe and results in expression of the therapeutic transgene in NHPs

4D-125 (R100.GRK-cohRPGRorf15) has entered Phase 1-2 clinical trials. The investigational new drug (IND)-enabling data for this product included evaluation in a 6-month Good Laboratory Practice (GLP) toxicology and biodistribution study (Table 5). A total of 30 eyes of 30 NHPs were injected by unilateral intravitreal administration.

Table 5: Good Laboratory Practice (GLP) toxicology and biodistribution study

No significant toxicity was observed with 4D-110 at either dose level, as determined by clinical observation, histopathology, OCT, or ERG. Administration of 4D-110 into a single eye resulted in only minimal to mild anterior uveitis, which was confined to the period immediately after administration and resolved by week 3 (Figure 9); in some cases, the systemic steroid dose was transiently increased.

Very high levels of vector genomes were present in the retinas of treated eyes at all time points (week 3, left panel; week 13, middle panel; week 26, right panel), indicating persistence of the vector in ocular tissues (Figure 10). Besides the retina, vector genomes were detected in samples of aqueous humor, vitreous humor, iris/ciliary body, and optic nerve from treated eyes at all time points. Non-ocular tissues generally had no detectable vector genomes, apart from low levels in the liver, spleen, and lymph nodes (Figure 10). R100 vector-derived transgene expression was detected in treated retina and iris/ciliary body from both the low-dose and high-dose groups (Figure 11). Gene expression was dose-dependent, increased from week 3 to week 13, and remained stable at week 26 (Figure 11, left, middle, and right panels, respectively). No non-ocular vector expression was detected at week 26 (Figure 11).

Cellular immune responses were assessed using the ELISpot assay; no animal developed a significant response to R100 capsid peptides or transgene peptides (data not shown). Most animals dosed with 4D-125 developed an anti-capsid antibody response after administration (data not shown).

Summary

4D-125 (R100.GRK-cohRPGRorf15) has recently been translated into a clinical trial for the inherited retinal disease X-linked retinitis pigmentosa (NCT04517149). This therapeutic product has been evaluated in a GLP toxicology and biodistribution study (Table 5). A total of 30 NHPs were injected by unilateral administration, for a total of 30 NHP eyes. No significant test-article-related adverse events or T-cell responses were reported. Mild to moderate, transient, corticosteroid-responsive anterior uveitis was observed. Transgene expression was localized to the retina, and no expression was detected in any of the systemic organs evaluated. Human clinical trials are under way to determine the safety, pharmacodynamics, and efficacy of this product given by intravitreal injection (including by serial visual field testing and optical coherence tomography).

Example 5 – Evaluation of the safety of a codon-optimized RPGRorf15 cDNA sequence delivered by R100 via intravitreal administration in human patients with X-linked retinitis pigmentosa

Summary of initial Phase 1 dose-escalation safety and tolerability data

Clinical trial design and enrollment

The clinical trial used a standard "3+3" dose escalation designed to evaluate the safety, tolerability, and biological activity of a single intravitreal injection of 4D-125 at two dose levels (3E11 or 1E12 vg/eye). A total of six patients were enrolled in the dose-escalation cohorts, three at each dose level. Patients received a standard tapering immunosuppressive regimen, with adjustments made at the investigator's discretion. The results described are based on a data cutoff between 1 and 9 months after administration.

Initial tolerability and adverse event profile

As outlined in the summary table of treatment-emergent adverse events (AEs) (Table 6), 4D-125 was well tolerated throughout the evaluation period:

Table 6. Summary of adverse events

Patients enrolled (n): 6
Dose: 3E11 or 1E12 vg/eye
Follow-up at data cutoff (months): 4-9
Dose-limiting toxicity (DLT): 0 (0%)
Serious AE: 0 (0%)
Any CTCAE grade ≥3: 0 (0%)
Retinal AE (any grade): 0 (0%)
Uveitis, CTCAE grade 2 (moderate): 1/6 (17%)
Uveitis, CTCAE grade 1 (mild): 2/6 (33%)

Clinical assessment

Preliminary biological activity was assessed by measuring retinal sensitivity with microperimetry (MP) and ellipsoid zone area (EZA) with SD-OCT. Seven subjects (median age 42.5 years; range 27-56 years) received 4D-125 (3×10¹¹ vg/eye (n = 3) or 1×10¹² vg/eye (n = 4)) and were followed for 4.2-12.5 months. Intraocular inflammation (4/7 subjects) was mild or moderate, transient (duration 0.9-1.6 months), and steroid-responsive. Most subjects had advanced disease; only 2 had measurable EZA and mean MP retinal sensitivity (mMPRS) in both eyes at baseline (BL) and at follow-up of at least 4 months. In both of these subjects, the treated eye improved more from BL than the untreated eye in mMPRS (+1.65 dB vs. +0.25 dB at 9 months and +0.50 dB vs. +0.10 dB at 4 months; BL values 1.5-3.2 dB) and in the number of loci gaining ≥7 dB in sensitivity (6 vs. 1 at 9 months and 3 vs. 0 at 4 months). In both subjects, the relative decrease in EZA from BL was smaller in the treated eye than in the untreated eye (-12.4% vs. -16.2% at 9 months and -20.2% vs. -28.7% at 6 months).
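The treated-versus-untreated comparisons above reduce to simple differences in change from baseline, sketched below in Python. The function name `treated_minus_untreated` is hypothetical, the entries are keyed only by the reported time point, and the computed differences are derived here for illustration; they are not values reported in the study.

```python
# Illustrative sketch only: inter-eye differences computed from the values reported above.
# Entries are keyed by reported time point; no subject identities are assumed.

# (treated-eye change from baseline, untreated-eye change from baseline)
MMPRS_DB = {"9 months": (1.65, 0.25), "4 months": (0.50, 0.10)}
EZA_PERCENT = {"9 months": (-12.4, -16.2), "6 months": (-20.2, -28.7)}


def treated_minus_untreated(pairs: dict) -> dict:
    """Difference in change from baseline, treated eye minus untreated eye."""
    return {
        label: round(treated - untreated, 2)
        for label, (treated, untreated) in pairs.items()
    }


if __name__ == "__main__":
    print("mMPRS difference (dB):", treated_minus_untreated(MMPRS_DB))
    print("EZA difference (percentage points):", treated_minus_untreated(EZA_PERCENT))
```

A positive difference means the treated eye gained more sensitivity, or lost less ellipsoid zone area, than the untreated fellow eye. With only two evaluable subjects, these differences are descriptive only.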

During the Phase 1/2 study, patients' ocular and systemic status was monitored closely, including detailed ophthalmic assessments and retinal imaging, with blood tests and systemic examinations as needed. A variety of visual function and anatomical assessments were performed to detect any preliminary efficacy signal. These assessments included, but were not limited to, measurement of ellipsoid zone (EZ) area, fundus autofluorescence, microperimetry, static automated perimetry, and best-corrected visual acuity (BCVA).

Conclusions

Intravitreal administration of 4D-125 was well tolerated, with intraocular inflammation that was mild or moderate, transient, and steroid-responsive. Based on microperimetry and SD-OCT, preliminary signs of biological activity were observed in the 2 evaluable dose-escalation subjects. These findings support dose expansion at 1×10¹² vg/eye in XLRP subjects with less advanced disease in the ongoing Phase 1/2 study.

Although the materials and methods of the invention have been described in terms of preferred embodiments, it will be apparent to those skilled in the art that variations may be applied to the methods described herein without departing from the concept, spirit, and scope of the invention. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope, and concept of the invention.

<110> 4D Molecular Therapeutics, Inc.

<120> Codon-optimized RPGRORF15 gene and uses thereof

<130> 090400-5012 WO

<150> US 63/073,843

<151> 2020-09-02

<160> 10

<170> PatentIn version 3.5

<210> 1

<211> 3459

<212> DNA

<213> Artificial sequence

<220>

<223> Codon-optimized RPGRorf15

<400> 1

atgagagaac ccgaggaact gatgcccgac tctggcgccg tgtttacctt cggcaagagc 60

aagttcgccg agaacaaccc cggcaagttc tggttcaaga acgacgtgcc agtgcacctg 120

agctgcggag atgaacactc tgccgtggtc accggcaaca acaagctgta catgttcggc 180

agcaacaact ggggccagct cggcctggga tctaagtctg ccatcagcaa gcctacctgc 240

gtgaaggccc tgaagcctga gaaagtgaaa ctggccgcct gcggcagaaa tcacaccctg 300

gtttctaccg aaggcggcaa tgtgtatgcc accggcggaa acaatgaggg acagcttgga 360

ctgggcgaca ccgaggaaag aaacaccttc cacgtgatca gctttttcac cagcgagcac 420

aagatcaagc agctgagcgc cggctctaat acctctgccg ctctgacaga ggacggcaga 480

ctgtttatgt ggggcgacaa ttctgagggc cagatcggac tgaagaacgt gtccaatgtg 540

tgcgtgcccc agcaagtgac aatcggcaag cctgtgtctt ggatcagctg cggctactac 600

cacagcgcct ttgtgacaac cgatggcgag ctgtatgtgt tcggcgagcc agagaatggc 660

aagctgggac tgcctaacca gctgctgggc aatcacagaa cccctcagct ggtgtctgag 720

atccccgaaa aagtgatcca ggtggcctgt ggcggagagc acacagtggt gctgacagag 780

aatgccgtgt acacctttgg cctgggccag tttggacaac tcggactggg aaccttcctg 840

ttcgagacaa gcgagcccaa agtgatcgag aacatccggg accagaccat cagctacatc 900

agctgtggcg agaaccacac agccctgatc acagacatcg gcctgatgta cacattcggc 960

gacggaaggc atggaaagct cggacttggc ctggaaaact tcaccaacca cttcatccct 1020

acgctgtgca gcaacttcct gcggttcatt gtgaagctgg tggcctgcgg aggatgccac 1080

atggtggttt ttgctgcccc tcacagaggc gtggccaaag agattgagtt cgacgagatc 1140

aacgatacct gcctgagcgt ggccaccttc ctgccttaca gcagcctgac atctggcaac 1200

gtgctgcaga ggacactgag cgccagaatg cgcagacggg aaagagagag aagccccgac 1260

agcttcagca tgagaagaac cctgcctcca atcgagggca cactgggcct gtctgcctgc 1320

tttctgccta acagcgtgtt ccccagatgc agcgagagaa acctgcaaga gagcgtgctg 1380

agcgagcagg atctgatgca gcctgaggaa cccgactacc tgctggacga gatgaccaaa 1440

gaggccgaga tcgacaacag cagcacagtg gaaagcctgg gcgagacaac cgacatcctg 1500

aacatgaccc acatcatgag cctgaacagc aacgagaagt ctctgaagct gagccccgtg 1560

cagaagcaga agaagcagca gaccatcggc gagctgacac aggatactgc cctgaccgag 1620

aacgacgaca gcgacgagta cgaagagatg agcgagatga aggaaggcaa ggcctgcaag 1680

cagcacgtgt cccagggcat ctttatgacc cagcctgcca ccaccatcga ggccttttcc 1740

gacgaggaag tggaaatccc cgaggaaaaa gagggcgccg aggacagcaa aggcaacggc 1800

attgaggaac aagaggtgga agccaacgaa gagaacgtga aggtgcacgg cggacggaaa 1860

gaaaagaccg agatcctgag cgacgacctg accgataagg ccgaggtttc cgagggcaaa 1920

gccaagtctg tgggagaagc cgaggatgga cctgaaggcc gcggagatgg aacctgtgaa 1980

gaaggatcta gcggagccga gcactggcag gatgaggaac gcgagaaggg cgagaaagac 2040

aaaggcagag gcgagatgga aagacccggc gagggcgaaa aagagctggc cgagaaagag 2100

gaatggaaga aacgcgacgg cgaagaacaa gagcagaaag aaagagagca gggccaccag 2160

aaagaacgga atcaagagat ggaagaaggc ggcgaggaag aacacggcga aggggaagaa 2220

gaggaaggcg accgagagga agaagaagag aaagaaggcg aaggcaaaga agaaggcgag 2280

ggcgaagagg tggaaggcga gcgtgaaaaa gaagagggcg aacgcaagaa agaagaacgc 2340

gccggaaaag aggaaaaagg cgaggaagag ggcgaccaag gcgaaggcga ggaagaagaa 2400

actgaaggca gaggggaaga gaaagaggaa ggcggcgaag tcgaaggcgg agaggttgaa 2460

gaaggcaaag gcgagcgaga agaggaagaa gaagaaggcg aaggcgagga agaggaaggc 2520

gaaggcgaag aggaagaagg cgaaggggaa gaagaagaag gcgaaggcaa gggcgaagag 2580

gagggcgaag aaggcgaggg cgaagaggag ggcgaagaag gcgaaggcga gggcgaagaa 2640

gaagaaggcg aaggcgaagg cgaggaagaa ggcgaaggcg aaggggaaga agaggaaggc 2700

gaaggcgaag gcgaagaaga aggcgaaggc gagggcgaag aggaagaagg cgaaggcaaa 2760

ggggaagaag aaggcgagga aggcgaaggc gaaggcgagg aagaagaagg cgaaggcgag 2820

ggcgaagatg gcgaaggcga aggcgaagag gaagagggcg agtgggaggg cgaagaagag 2880

gaaggcgaag gcgagggcga agaggaaggc gaaggcgagg gcgaagaagg cgaaggcgaa 2940

ggcgaggaag aggaaggcga aggcgaaggg gaagaagaag agggcgaaga agaaggcgaa 3000

gaggaaggcg aaggggaaga agaaggcgaa ggcgaaggcg aagaagagga agagggcgaa 3060

gttgaaggcg aggttgaggg cgaagaaggc gaaggcgaag gggaagaaga agaaggcgag 3120

gaagaagggg aagagagaga aaaagaaggc gagggcgaag aaaaccgccg gaaccgcgaa 3180

gaggaagagg aagaagaggg caagtaccaa gagactggcg aggaagagaa cgagcggcag 3240

gatggcgaag agtacaagaa ggtgtccaag atcaagggca gcgtgaagta cggcaagcac 3300

aagacctacc agaagaagtc cgtcaccaac acgcaaggca atggaaaaga acagcggagc 3360

aagatgcccg tgcagtccaa gaggctgctg aagaatggcc ctagcggcag caagaaattc 3420

tggaacaatg tgctgcccca ctacctcgag ctgaagtga 3459

<210> 2

<211> 1152

<212> PRT

<213> Homo sapiens

<400> 2

Met Arg Glu Pro Glu Glu Leu Met Pro Asp Ser Gly Ala Val Phe Thr

1 5 10 15

Phe Gly Lys Ser Lys Phe Ala Glu Asn Asn Pro Gly Lys Phe Trp Phe

20 25 30

Lys Asn Asp Val Pro Val His Leu Ser Cys Gly Asp Glu His Ser Ala

35 40 45

Val Val Thr Gly Asn Asn Lys Leu Tyr Met Phe Gly Ser Asn Asn Trp

50 55 60

Gly Gln Leu Gly Leu Gly Ser Lys Ser Ala Ile Ser Lys Pro Thr Cys

65 70 75 80

Val Lys Ala Leu Lys Pro Glu Lys Val Lys Leu Ala Ala Cys Gly Arg

85 90 95

Asn His Thr Leu Val Ser Thr Glu Gly Gly Asn Val Tyr Ala Thr Gly

100 105 110

Gly Asn Asn Glu Gly Gln Leu Gly Leu Gly Asp Thr Glu Glu Arg Asn

115 120 125

Thr Phe His Val Ile Ser Phe Phe Thr Ser Glu His Lys Ile Lys Gln

130 135 140

Leu Ser Ala Gly Ser Asn Thr Ser Ala Ala Leu Thr Glu Asp Gly Arg

145 150 155 160

Leu Phe Met Trp Gly Asp Asn Ser Glu Gly Gln Ile Gly Leu Lys Asn

165 170 175

Val Ser Asn Val Cys Val Pro Gln Gln Val Thr Ile Gly Lys Pro Val

180 185 190

Ser Trp Ile Ser Cys Gly Tyr Tyr His Ser Ala Phe Val Thr Thr Asp

195 200 205

Gly Glu Leu Tyr Val Phe Gly Glu Pro Glu Asn Gly Lys Leu Gly Leu

210 215 220

Pro Asn Gln Leu Leu Gly Asn His Arg Thr Pro Gln Leu Val Ser Glu

225 230 235 240

Ile Pro Glu Lys Val Ile Gln Val Ala Cys Gly Gly Glu His Thr Val

245 250 255

Val Leu Thr Glu Asn Ala Val Tyr Thr Phe Gly Leu Gly Gln Phe Gly

260 265 270

Gln Leu Gly Leu Gly Thr Phe Leu Phe Glu Thr Ser Glu Pro Lys Val

275 280 285

Ile Glu Asn Ile Arg Asp Gln Thr Ile Ser Tyr Ile Ser Cys Gly Glu

290 295 300

Asn His Thr Ala Leu Ile Thr Asp Ile Gly Leu Met Tyr Thr Phe Gly

305 310 315 320

Asp Gly Arg His Gly Lys Leu Gly Leu Gly Leu Glu Asn Phe Thr Asn

325 330 335

His Phe Ile Pro Thr Leu Cys Ser Asn Phe Leu Arg Phe Ile Val Lys

340 345 350

Leu Val Ala Cys Gly Gly Cys His Met Val Val Phe Ala Ala Pro His

355 360 365

Arg Gly Val Ala Lys Glu Ile Glu Phe Asp Glu Ile Asn Asp Thr Cys

370 375 380

Leu Ser Val Ala Thr Phe Leu Pro Tyr Ser Ser Leu Thr Ser Gly Asn

385 390 395 400

Val Leu Gln Arg Thr Leu Ser Ala Arg Met Arg Arg Arg Glu Arg Glu

405 410 415

Arg Ser Pro Asp Ser Phe Ser Met Arg Arg Thr Leu Pro Pro Ile Glu

420 425 430

Gly Thr Leu Gly Leu Ser Ala Cys Phe Leu Pro Asn Ser Val Phe Pro

435 440 445

Arg Cys Ser Glu Arg Asn Leu Gln Glu Ser Val Leu Ser Glu Gln Asp

450 455 460

Leu Met Gln Pro Glu Glu Pro Asp Tyr Leu Leu Asp Glu Met Thr Lys

465 470 475 480

Glu Ala Glu Ile Asp Asn Ser Ser Thr Val Glu Ser Leu Gly Glu Thr

485 490 495

Thr Asp Ile Leu Asn Met Thr His Ile Met Ser Leu Asn Ser Asn Glu

500 505 510

Lys Ser Leu Lys Leu Ser Pro Val Gln Lys Gln Lys Lys Gln Gln Thr

515 520 525

Ile Gly Glu Leu Thr Gln Asp Thr Ala Leu Thr Glu Asn Asp Asp Ser

530 535 540

Asp Glu Tyr Glu Glu Met Ser Glu Met Lys Glu Gly Lys Ala Cys Lys

545 550 555 560

Gln His Val Ser Gln Gly Ile Phe Met Thr Gln Pro Ala Thr Thr Ile

565 570 575

Glu Ala Phe Ser Asp Glu Glu Val Glu Ile Pro Glu Glu Lys Glu Gly

580 585 590

Ala Glu Asp Ser Lys Gly Asn Gly Ile Glu Glu Gln Glu Val Glu Ala

595 600 605

Asn Glu Glu Asn Val Lys Val His Gly Gly Arg Lys Glu Lys Thr Glu

610 615 620

Ile Leu Ser Asp Asp Leu Thr Asp Lys Ala Glu Val Ser Glu Gly Lys

625 630 635 640

Ala Lys Ser Val Gly Glu Ala Glu Asp Gly Pro Glu Gly Arg Gly Asp

645 650 655

Gly Thr Cys Glu Glu Gly Ser Ser Gly Ala Glu His Trp Gln Asp Glu

660 665 670

Glu Arg Glu Lys Gly Glu Lys Asp Lys Gly Arg Gly Glu Met Glu Arg

675 680 685

Pro Gly Glu Gly Glu Lys Glu Leu Ala Glu Lys Glu Glu Trp Lys Lys

690 695 700

Arg Asp Gly Glu Glu Gln Glu Gln Lys Glu Arg Glu Gln Gly His Gln

705 710 715 720

Lys Glu Arg Asn Gln Glu Met Glu Glu Gly Gly Glu Glu Glu His Gly

725 730 735

Glu Gly Glu Glu Glu Glu Gly Asp Arg Glu Glu Glu Glu Glu Lys Glu

740 745 750

Gly Glu Gly Lys Glu Glu Gly Glu Gly Glu Glu Val Glu Gly Glu Arg

755 760 765

Glu Lys Glu Glu Gly Glu Arg Lys Lys Glu Glu Arg Ala Gly Lys Glu

770 775 780

Glu Lys Gly Glu Glu Glu Gly Asp Gln Gly Glu Gly Glu Glu Glu Glu

785 790 795 800

Thr Glu Gly Arg Gly Glu Glu Lys Glu Glu Gly Gly Glu Val Glu Gly

805 810 815

Gly Glu Val Glu Glu Gly Lys Gly Glu Arg Glu Glu Glu Glu Glu Glu

820 825 830

Gly Glu Gly Glu Glu Glu Glu Gly Glu Gly Glu Glu Glu Glu Gly Glu

835 840 845

Gly Glu Glu Glu Glu Gly Glu Gly Lys Gly Glu Glu Glu Gly Glu Glu

850 855 860

Gly Glu Gly Glu Glu Glu Gly Glu Glu Gly Glu Gly Glu Gly Glu Glu

865 870 875 880

Glu Glu Gly Glu Gly Glu Gly Glu Glu Glu Gly Glu Gly Glu Gly Glu

885 890 895

Glu Glu Glu Gly Glu Gly Glu Gly Glu Glu Glu Gly Glu Gly Glu Gly

900 905 910

Glu Glu Glu Glu Gly Glu Gly Lys Gly Glu Glu Glu Gly Glu Glu Gly

915 920 925

Glu Gly Glu Gly Glu Glu Glu Glu Gly Glu Gly Glu Gly Glu Asp Gly

930 935 940

Glu Gly Glu Gly Glu Glu Glu Glu Gly Glu Trp Glu Gly Glu Glu Glu

945 950 955 960

Glu Gly Glu Gly Glu Gly Glu Glu Glu Gly Glu Gly Glu Gly Glu Glu

965 970 975

Gly Glu Gly Glu Gly Glu Glu Glu Glu Gly Glu Gly Glu Gly Glu Glu

980 985 990

Glu Glu Gly Glu Glu Glu Gly Glu Glu Glu Gly Glu Gly Glu Glu Glu

995 1000 1005

Gly Glu Gly Glu Gly Glu Glu Glu Glu Glu Gly Glu Val Glu Gly

1010 1015 1020

Glu Val Glu Gly Glu Glu Gly Glu Gly Glu Gly Glu Glu Glu Glu

1025 1030 1035

Gly Glu Glu Glu Gly Glu Glu Arg Glu Lys Glu Gly Glu Gly Glu

1040 1045 1050

Glu Asn Arg Arg Asn Arg Glu Glu Glu Glu Glu Glu Glu Gly Lys

1055 1060 1065

Tyr Gln Glu Thr Gly Glu Glu Glu Asn Glu Arg Gln Asp Gly Glu

1070 1075 1080

Glu Tyr Lys Lys Val Ser Lys Ile Lys Gly Ser Val Lys Tyr Gly

1085 1090 1095

Lys His Lys Thr Tyr Gln Lys Lys Ser Val Thr Asn Thr Gln Gly

1100 1105 1110

Asn Gly Lys Glu Gln Arg Ser Lys Met Pro Val Gln Ser Lys Arg

1115 1120 1125

Leu Leu Lys Asn Gly Pro Ser Gly Ser Lys Lys Phe Trp Asn Asn

1130 1135 1140

Val Leu Pro His Tyr Leu Glu Leu Lys

1145 1150

<210> 3

<211> 3459

<212> DNA

<213> Homo sapiens

<400> 3

atgagggagc cggaagagct gatgcccgat tcgggtgctg tgtttacatt tgggaaaagt 60

aaatttgctg aaaataatcc cggtaaattc tggtttaaaa atgatgtccc tgtacatctt 120

tcatgtggag atgaacattc tgctgttgtt accggaaata ataaacttta catgtttggc 180

agtaacaact ggggtcagtt aggattagga tcaaagtcag ccatcagcaa gccaacatgt 240

gtcaaagctc taaaacctga aaaagtgaaa ttagctgcct gtggaaggaa ccacaccctg 300

gtgtcaacag aaggaggcaa tgtatatgca actggtggaa ataatgaagg acagttgggg 360

cttggtgaca ccgaagaaag aaacactttt catgtaatta gcttttttac atccgagcat 420

aagattaagc agctgtctgc tggatctaat acttcagctg ccctaactga ggatggaaga 480

ctttttatgt ggggtgacaa ttccgaaggg caaattggtt taaaaaatgt aagtaatgtc 540

tgtgtccctc agcaagtgac cattgggaaa cctgtctcct ggatctcttg tggatattac 600

cattcagctt ttgtaacaac agatggtgag ctatatgtgt ttggagaacc tgagaatggg 660

aagttaggtc ttcccaatca gctcctgggc aatcacagaa caccccagct ggtgtctgaa 720

attccggaga aggtgatcca agtagcctgt ggtggagagc atactgtggt tctcacggag 780

aatgctgtgt atacctttgg gctgggacaa tttggtcagc tgggtcttgg cacttttctt 840

tttgaaactt cagaacccaa agtcattgag aatattaggg atcaaacaat aagttatatt 900

tcttgtggag aaaatcacac agctttgata acagatatcg gccttatgta tacttttgga 960

gatggtcgcc acggaaaatt aggacttgga ctggagaatt ttaccaatca cttcattcct 1020

actttgtgct ctaatttttt gaggtttata gttaaattgg ttgcttgtgg tggatgtcac 1080

atggtagttt ttgctgctcc tcatcgtggt gtggcaaaag aaattgaatt cgatgaaata 1140

aatgatactt gcttatctgt ggcgactttt ctgccgtata gcagtttaac ctcaggaaat 1200

gtactgcaga ggactctatc agcacgtatg cggcgaagag agagggagag gtctccagat 1260

tctttttcaa tgaggagaac actacctcca atagaaggga ctcttggcct ttctgcttgt 1320

tttctcccca attcagtctt tccacgatgt tctgagagaa acctccaaga gagtgtctta 1380

tctgaacagg acctcatgca gccagaggaa ccagattatt tgctagatga aatgaccaaa 1440

gaagcagaga tagataattc ttcaactgta gaaagccttg gagaaactac tgatatctta 1500

aacatgacac acatcatgag cctgaattcc aatgaaaagt cattaaaatt atcaccagtt 1560

cagaaacaaa agaaacaaca aacaattggg gaactgacgc aggatacagc tcttactgaa 1620

aacgatgata gtgatgaata tgaagaaatg tcagaaatga aagaagggaa agcatgtaaa 1680

caacatgtgt cacaagggat tttcatgacg cagccagcta cgactatcga agcattttca 1740

gatgaggaag tagagatccc agaggagaag gaaggagcag aggattcaaa aggaaatgga 1800

atagaggagc aagaggtaga agcaaatgag gaaaatgtga aggtgcatgg aggaagaaag 1860

gagaaaacag agatcctatc agatgacctt acagacaaag cagaggtgag tgaaggcaag 1920

gcaaaatcag tgggagaagc agaggatggg cctgaaggta gaggggatgg aacctgtgag 1980

gaaggtagtt caggagcaga acactggcaa gatgaggaga gggagaaggg ggagaaagac 2040

aagggtagag gagaaatgga gaggccagga gagggagaga aggaactagc agagaaggaa 2100

gaatggaaga agagggatgg ggaagagcag gagcaaaagg agagggagca gggccatcag 2160

aaggaaagaa accaagagat ggaggaggga ggggaggagg agcatggaga aggagaagaa 2220

gaggagggag acagagaaga ggaagaagag aaggagggag aagggaaaga ggaaggagaa 2280

ggggaagaag tggagggaga acgtgaaaag gaggaaggag agaggaaaaa ggaggaaaga 2340

gcggggaagg aggagaaagg agaggaagaa ggagaccaag gagaggggga agaggaggaa 2400

acagagggga gaggggagga aaaagaggag ggaggggaag tagagggagg ggaagtagag 2460

gaggggaaag gagagaggga agaggaagag gaggagggtg agggggaaga ggaggaaggg 2520

gagggggaag aggaggaagg ggagggggaa gaggaggaag gagaagggaa aggggaggaa 2580

gaaggggaag aaggagaagg ggaggaagaa ggggaggaag gagaagggga gggggaagag 2640

gaggaaggag aaggggaggg agaagaggaa ggagaagggg agggagaaga ggaggaagga 2700

gaaggggagg gagaagagga aggagaaggg gagggagaag aggaggaagg agaagggaaa 2760

ggggaggagg aaggagagga aggagaaggg gagggggaag aggaggaagg agaaggggaa 2820

ggggaggatg gagaagggga gggggaagag gaggaaggag aatgggaggg ggaagaggag 2880

gaaggagaag gggaggggga agaggaagga gaaggggaag gggaggaagg agaaggggag 2940

ggggaagagg aggaaggaga aggggagggg gaagaggagg aaggggaaga agaaggggag 3000

gaagaaggag agggagagga agaaggggag ggagaagggg aggaagaaga ggaaggggaa 3060

gtggaagggg aggtggaagg ggaggaagga gagggggaag gagaggaaga ggaaggagag 3120

gaggaaggag aagaaaggga aaaggagggg gaaggagaag aaaacaggag gaacagagaa 3180

gaggaggagg aagaagaggg gaagtatcag gagacaggcg aagaagagaa tgaaaggcag 3240

gatggagagg agtacaaaaa agtgagcaaa ataaaaggat ctgtgaaata tggcaaacat 3300

aaaacatatc aaaaaaagtc agttactaac acacagggaa atgggaaaga gcagaggtcc 3360

aaaatgccag tccagtcaaa acgactttta aaaaacgggc catcaggttc caaaaagttc 3420

tggaataatg tattaccaca ttacttggaa ttgaagtaa 3459

<210> 4

<211> 199

<212> DNA

<213> Artificial sequence

<220>

<223> hGRK promoter

<400> 4

gggccccaga agcctggtgg ttgtttgtcc ttctcagggg aaaagtgagg cggccccttg 60

gaggaagggg ccgggcagaa tgatctaatc ggattccaag cagctcaggg gattgtcttt 120

ttctagcacc ttcttgccac tcctaagcgt cctccgtgac cccggctggg atttagcctg 180

gtgctgtgtc agccccggg 199

<210> 5

<211> 4303

<212> DNA

<213> Artificial sequence

<220>

<223> cohRPGR expression cassette

<400> 5

ttggccactc cctctctgcg cgctcgctcg ctcactgagg ccgggcgacc aaaggtcgcc 60

cgacgcccgg gctttgcccg ggcggcctca gtgagcgagc gagcgcgcag agagggagtg 120

gccaactcca tcactagggg ttcctatcga ttgaattccc cggggatccg ggccccagaa 180

gcctggtggt tgtttgtcct tctcagggga aaagtgaggc ggccccttgg aggaaggggc 240

cgggcagaat gatctaatcg gattccaagc agctcagggg attgtctttt tctagcacct 300

tcttgccact cctaagcgtc ctccgtgacc ccggctggga tttagcctgg tgctgtgtca 360

gccccgggtc tagagtcgac ctgcagaagc ttccaccatg agagaacccg aggaactgat 420

gcccgactct ggcgccgtgt ttaccttcgg caagagcaag ttcgccgaga acaaccccgg 480

caagttctgg ttcaagaacg acgtgccagt gcacctgagc tgcggagatg aacactctgc 540

cgtggtcacc ggcaacaaca agctgtacat gttcggcagc aacaactggg gccagctcgg 600

cctgggatct aagtctgcca tcagcaagcc tacctgcgtg aaggccctga agcctgagaa 660

agtgaaactg gccgcctgcg gcagaaatca caccctggtt tctaccgaag gcggcaatgt 720

gtatgccacc ggcggaaaca atgagggaca gcttggactg ggcgacaccg aggaaagaaa 780

caccttccac gtgatcagct ttttcaccag cgagcacaag atcaagcagc tgagcgccgg 840

ctctaatacc tctgccgctc tgacagagga cggcagactg tttatgtggg gcgacaattc 900

tgagggccag atcggactga agaacgtgtc caatgtgtgc gtgccccagc aagtgacaat 960

cggcaagcct gtgtcttgga tcagctgcgg ctactaccac agcgcctttg tgacaaccga 1020

tggcgagctg tatgtgttcg gcgagccaga gaatggcaag ctgggactgc ctaaccagct 1080

gctgggcaat cacagaaccc ctcagctggt gtctgagatc cccgaaaaag tgatccaggt 1140

ggcctgtggc ggagagcaca cagtggtgct gacagagaat gccgtgtaca cctttggcct 1200

gggccagttt ggacaactcg gactgggaac cttcctgttc gagacaagcg agcccaaagt 1260

gatcgagaac atccgggacc agaccatcag ctacatcagc tgtggcgaga accacacagc 1320

cctgatcaca gacatcggcc tgatgtacac attcggcgac ggaaggcatg gaaagctcgg 1380

acttggcctg gaaaacttca ccaaccactt catccctacg ctgtgcagca acttcctgcg 1440

gttcattgtg aagctggtgg cctgcggagg atgccacatg gtggtttttg ctgcccctca 1500

cagaggcgtg gccaaagaga ttgagttcga cgagatcaac gatacctgcc tgagcgtggc 1560

caccttcctg ccttacagca gcctgacatc tggcaacgtg ctgcagagga cactgagcgc 1620

cagaatgcgc agacgggaaa gagagagaag ccccgacagc ttcagcatga gaagaaccct 1680

gcctccaatc gagggcacac tgggcctgtc tgcctgcttt ctgcctaaca gcgtgttccc 1740

cagatgcagc gagagaaacc tgcaagagag cgtgctgagc gagcaggatc tgatgcagcc 1800

tgaggaaccc gactacctgc tggacgagat gaccaaagag gccgagatcg acaacagcag 1860

cacagtggaa agcctgggcg agacaaccga catcctgaac atgacccaca tcatgagcct 1920

gaacagcaac gagaagtctc tgaagctgag ccccgtgcag aagcagaaga agcagcagac 1980

catcggcgag ctgacacagg atactgccct gaccgagaac gacgacagcg acgagtacga 2040

agagatgagc gagatgaagg aaggcaaggc ctgcaagcag cacgtgtccc agggcatctt 2100

tatgacccag cctgccacca ccatcgaggc cttttccgac gaggaagtgg aaatccccga 2160

ggaaaaagag ggcgccgagg acagcaaagg caacggcatt gaggaacaag aggtggaagc 2220

caacgaagag aacgtgaagg tgcacggcgg acggaaagaa aagaccgaga tcctgagcga 2280

cgacctgacc gataaggccg aggtttccga gggcaaagcc aagtctgtgg gagaagccga 2340

ggatggacct gaaggccgcg gagatggaac ctgtgaagaa ggatctagcg gagccgagca 2400

ctggcaggat gaggaacgcg agaagggcga gaaagacaaa ggcagaggcg agatggaaag 2460

acccggcgag ggcgaaaaag agctggccga gaaagaggaa tggaagaaac gcgacggcga 2520

agaacaagag cagaaagaaa gagagcaggg ccaccagaaa gaacggaatc aagagatgga 2580

agaaggcggc gaggaagaac acggcgaagg ggaagaagag gaaggcgacc gagaggaaga 2640

agaagagaaa gaaggcgaag gcaaagaaga aggcgagggc gaagaggtgg aaggcgagcg 2700

tgaaaaagaa gagggcgaac gcaagaaaga agaacgcgcc ggaaaagagg aaaaaggcga 2760

ggaagagggc gaccaaggcg aaggcgagga agaagaaact gaaggcagag gggaagagaa 2820

agaggaaggc ggcgaagtcg aaggcggaga ggttgaagaa ggcaaaggcg agcgagaaga 2880

ggaagaagaa gaaggcgaag gcgaggaaga ggaaggcgaa ggcgaagagg aagaaggcga 2940

aggggaagaa gaagaaggcg aaggcaaggg cgaagaggag ggcgaagaag gcgagggcga 3000

agaggagggc gaagaaggcg aaggcgaggg cgaagaagaa gaaggcgaag gcgaaggcga 3060

ggaagaaggc gaaggcgaag gggaagaaga ggaaggcgaa ggcgaaggcg aagaagaagg 3120

cgaaggcgag ggcgaagagg aagaaggcga aggcaaaggg gaagaagaag gcgaggaagg 3180

cgaaggcgaa ggcgaggaag aagaaggcga aggcgagggc gaagatggcg aaggcgaagg 3240

cgaagaggaa gagggcgagt gggagggcga agaagaggaa ggcgaaggcg agggcgaaga 3300

ggaaggcgaa ggcgagggcg aagaaggcga aggcgaaggc gaggaagagg aaggcgaagg 3360

cgaaggggaa gaagaagagg gcgaagaaga aggcgaagag gaaggcgaag gggaagaaga 3420

aggcgaaggc gaaggcgaag aagaggaaga gggcgaagtt gaaggcgagg ttgagggcga 3480

agaaggcgaa ggcgaagggg aagaagaaga aggcgaggaa gaaggggaag agagagaaaa 3540

agaaggcgag ggcgaagaaa accgccggaa ccgcgaagag gaagaggaag aagagggcaa 3600

gtaccaagag actggcgagg aagagaacga gcggcaggat ggcgaagagt acaagaaggt 3660

gtccaagatc aagggcagcg tgaagtacgg caagcacaag acctaccaga agaagtccgt 3720

caccaacacg caaggcaatg gaaaagaaca gcggagcaag atgcccgtgc agtccaagag 3780

gctgctgaag aatggcccta gcggcagcaa gaaattctgg aacaatgtgc tgccccacta 3840

cctcgagctg aagtgagcct cgagcagcgc tgctcgagag atctgcggcc gcgagctcgg 3900

ggatccagac atgataagat acattgatga gtttggacaa accacaacta gaatgcagtg 3960

aaaaaaatgc tttatttgtg aaatttgtga tgctattgct ttatttgtaa ccattataag 4020

ctgcaataaa caagttaaca acaacaattg cattcatttt atgtttcagg ttcaggggga 4080

ggtgtgggag gttttttaaa gcaagtaaaa cctctacaaa tgtggtatgg ctgattatga 4140

tcaatgcatc ctagccggag gaacccctag tgatggagtt ggccactccc tctctgcgcg 4200

ctcgctcgct cactgaggcc gcccgggcaa agcccgggcg tcgggcgacc tttggtcgcc 4260

cggcctcagt gagcgagcga gcgcgcagag agggagtggc caa 4303

<210> 6

<211> 145

<212> DNA

<213> Adeno-associated virus 2

<400> 6

gccaactcca tcactagggg ttcct 145

<210> 7

<211> 145

<212> DNA

<213> Adeno-associated virus 2

<400> 7

aggaacccct agtgatggag ttggccactc cctctctgcg cgctcgctcg ctcactgagg 60

ccgcccgggc aaagcccggg cgtcgggcga cctttggtcg cccggcctca gtgagcgagc 120

gagcgcgcag agagggagtg gccaa 145

<210> 8

<211> 245

<212> DNA

<213> Artificial sequence

<220>

<223> SV40 polyadenylation sequence

<400> 8

ggggatccag acatgataag atacattgat gagtttggac aaaccacaac tagaatgcag 60

tgaaaaaaat gctttatttg tgaaatttgt gatgctattg ctttatttgt aaccattata 120

agctgcaata aacaagttaa caacaacaat tgcattcatt ttatgtttca ggttcagggg 180

gaggtgtggg aggtttttta aagcaagtaa aacctctaca aatgtggtat ggctgattat 240

gatca 245

<210> 9

<211> 745

<212> PRT

<213> Artificial sequence

<220>

<223> rAAV variant capsid protein

<400> 9

Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Thr Leu Ser

1 5 10 15

Glu Gly Ile Arg Gln Trp Trp Lys Leu Lys Pro Gly Pro Pro Pro Pro

20 25 30

Lys Ala Ala Glu Arg His Lys Asp Asp Ser Arg Gly Leu Val Leu Pro

35 40 45

Gly Tyr Lys Tyr Leu Gly Pro Phe Asn Gly Leu Asp Lys Gly Glu Pro

50 55 60

Val Asn Glu Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp

65 70 75 80

Arg Gln Leu Asp Ser Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala

85 90 95

Asp Ala Glu Phe Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly

100 105 110

Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Val Leu Glu Pro

115 120 125

Leu Gly Leu Val Glu Glu Pro Val Lys Thr Ala Pro Gly Lys Lys Arg

130 135 140

Pro Val Glu His Ser Pro Val Glu Pro Asp Ser Ser Ser Gly Thr Gly

145 150 155 160

Lys Ala Gly Gln Gln Pro Ala Arg Lys Arg Leu Asn Phe Gly Gln Thr

165 170 175

Gly Asp Ala Asp Ser Val Pro Asp Pro Gln Pro Leu Gly Gln Pro Pro

180 185 190

Ala Ala Pro Ser Gly Leu Gly Thr Asn Thr Met Ala Thr Gly Ser Gly

195 200 205

Ala Pro Met Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Asn Ser

210 215 220

Ser Gly Asn Trp His Cys Asp Ser Thr Trp Met Gly Asp Arg Val Ile

225 230 235 240

Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu

245 250 255

Tyr Lys Gln Ile Ser Ser Gln Ser Gly Ala Ser Asn Asp Asn His Tyr

260 265 270

Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg Phe His

275 280 285

Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn Asn Trp

290 295 300

Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile Gln Val

305 310 315 320

Lys Glu Val Thr Gln Asn Asp Gly Thr Thr Thr Ile Ala Asn Asn Leu

325 330 335

Thr Ser Thr Val Gln Val Phe Thr Asp Ser Glu Tyr Gln Leu Pro Tyr

340 345 350

Val Leu Gly Ser Ala His Gln Gly Cys Leu Pro Pro Phe Pro Ala Asp

355 360 365

Val Phe Met Val Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asn Gly Ser

370 375 380

Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe Pro Ser

385 390 395 400

Gln Met Leu Arg Thr Gly Asn Asn Phe Thr Phe Ser Tyr Thr Phe Glu

405 410 415

Asp Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu Asp Arg

420 425 430

Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser Arg Thr

435 440 445

Asn Thr Pro Ser Gly Thr Thr Thr Gln Ser Arg Leu Gln Phe Ser Gln

450 455 460

Ala Gly Ala Ser Asp Ile Arg Asp Gln Ser Arg Asn Trp Leu Pro Gly

465 470 475 480

Pro Cys Tyr Arg Gln Gln Arg Val Ser Lys Thr Ser Ala Asp Asn Asn

485 490 495

Asn Ser Glu Tyr Ser Trp Thr Gly Ala Thr Lys Tyr His Leu Asn Gly

500 505 510

Arg Asp Ser Leu Val Asn Pro Gly Pro Ala Met Ala Ser His Lys Asp

515 520 525

Asp Glu Glu Lys Phe Phe Pro Gln Ser Gly Val Leu Ile Phe Gly Lys

530 535 540

Gln Gly Ser Glu Lys Thr Asn Val Asp Ile Glu Lys Val Met Ile Thr

545 550 555 560

Asp Glu Glu Glu Ile Arg Thr Thr Asn Pro Val Ala Thr Glu Gln Tyr

565 570 575

Gly Ser Val Ser Thr Asn Leu Gln Arg Gly Asn Leu Ala Ile Ser Asp

580 585 590

Gln Thr Lys His Ala Arg Gln Ala Ala Thr Ala Asp Val Asn Thr Gln

595 600 605

Gly Val Leu Pro Gly Met Val Trp Gln Asp Arg Asp Val Tyr Leu Gln

610 615 620

Gly Pro Ile Trp Ala Lys Ile Pro His Thr Asp Gly His Phe His Pro

625 630 635 640

Ser Pro Leu Met Gly Gly Phe Gly Leu Lys His Pro Pro Pro Gln Ile

645 650 655

Leu Ile Lys Asn Thr Pro Val Pro Ala Asn Pro Ser Thr Thr Phe Ser

660 665 670

Ala Ala Lys Phe Ala Ser Phe Ile Thr Gln Tyr Ser Thr Gly Gln Val

675 680 685

Ser Val Glu Ile Glu Trp Glu Leu Gln Lys Glu Asn Ser Lys Arg Trp

690 695 700

Asn Pro Glu Ile Gln Tyr Thr Ser Asn Tyr Asn Lys Ser Val Asn Val

705 710 715 720

Asp Phe Thr Val Asp Thr Asn Gly Val Tyr Ser Glu Pro Arg Pro Ile

725 730 735

Gly Thr Arg Tyr Leu Thr Arg Asn Leu

740 745

<210> 10

<211> 3459

<212> DNA

<213> Artificial sequence

<220>

<223> Codon-optimized human RPGR

<400> 10

atgagagagc ctgaagagct gatgcctgat agcggagcag tgtttacctt tgggaagagc 60

aagttcgcag agaataaccc tgggaaattc tggtttaaga acgacgtgcc cgtgcacctg 120

agctgtggcg atgagcactc cgccgtggtg acaggcaaca ataagctgta catgttcggc 180

tctaacaatt ggggacagct gggcctggga agcaagtccg ccatcagcaa gccaacctgc 240

gtgaaggccc tgaagcccga gaaggtgaag ctggccgcct gtggcagaaa ccacacactg 300

gtgagcaccg agggaggaaa cgtgtacgca acaggaggca acaatgaagg ccagctgggc 360

ctgggcgaca cagaggagag gaataccttt cacgtgatca gcttctttac ctccgagcac 420

aagatcaagc agctgtccgc cggctctaac acaagcgccg ccctgaccga ggacggccgc 480

ctgttcatgt ggggcgataa tagcgagggc cagatcggcc tgaagaacgt gtccaacgtg 540

tgcgtgcctc agcaggtgac catcggcaag ccagtgtcct ggatctcttg tggctactat 600

cacagcgcct tcgtgaccac agatggcgag ctgtacgtgt ttggagagcc agagaacggc 660

aagctgggcc tgcctaacca gctgctgggc aatcaccgga caccccagct ggtgtccgag 720

atccctgaga aagtgatcca ggtggcatgc ggaggagagc acacagtggt gctgaccgag 780

aatgccgtgt ataccttcgg cctgggacag tttggacagc tgggcctggg cacattcctg 840

tttgagacaa gcgagccaaa agtgatcgag aacatccgcg accagacaat cagctacatc 900

tcctgcggcg agaatcacac agccctgatc accgacatcg gcctgatgta tacctttggc 960

gatggccggc acggcaagct gggcctgggc ctggagaact tcacaaatca ctttatcccc 1020

accctgtgct ctaacttcct gcggttcatc gtgaagctgg tggcctgcgg cggctgtcac 1080

atggtggtgt tcgcagcacc tcacagggga gtggccaagg agatcgagtt tgacgagatc 1140

aacgatacat gcctgtccgt ggccaccttc ctgccataca gctccctgac atccggcaat 1200

gtgctgcagc gcaccctgtc tgccaggatg cggagaaggg agagggagcg gtcccctgac 1260

tctttcagca tgaggcggac actgccacct atcgagggca ccctgggcct gtctgcctgc 1320

ttcctgccta acagcgtgtt cccaagatgt agcgagagga atctgcagga gtctgtgctg 1380

agcgagcagg atctgatgca gccagaggag cccgactacc tgctggatga gatgacaaag 1440

gaggccgaga tcgacaactc tagcaccgtg gagagcctgg gcgagacaac agatatcctg 1500

aatatgacac acatcatgtc cctgaactct aatgagaagt ctctgaagct gagcccagtg 1560

cagaagcaga agaagcagca gaccatcggc gagctgaccc aggacacagc cctgaccgag 1620

aacgacgatt ctgatgagta tgaggagatg agcgagatga aggagggcaa ggcctgtaag 1680

cagcacgtgt cccagggcat cttcatgacc cagccagcca ccacaatcga ggccttttct 1740

gacgaagagg tggagatccc cgaggagaag gagggcgccg aggatagcaa gggcaatggc 1800

atcgaggagc aggaggtgga ggccaacgag gagaatgtga aggtgcacgg cggcagaaag 1860

gagaagacag agatcctgtc cgacgatctg accgacaagg ccgaggtgtc cgagggcaag 1920

gccaagtctg tgggagaggc agaggacgga ccagagggac gcggcgatgg aacctgcgag 1980

gagggatcct ctggagcaga gcactggcag gacgaagaaa gagagaaggg cgagaaggat 2040

aagggcagag gagagatgga gaggcctgga gagggagaga aggagctggc agagaaggag 2100

gagtggaaga agagggacgg cgaggagcag gagcagaagg agagagagca gggccaccag 2160

aaggagagga accaggagat ggaggaggga ggagaggagg agcacggcga gggagaggag 2220aaggagagga acccaggagat ggaggaggga ggagaggagg agcacggcga gggagaggag 2220

gaggagggcg atagagagga agaagaggag aaggagggag agggcaagga ggaaggcgag 2280gaggagggcg atagagagga agaagaggag aaggagggag agggcaagga ggaaggcgag 2280

ggagaggagg tggagggaga aagggagaag gaggagggag agcgcaagaa ggaagaaaga 2340ggagaggagg tggagggaga aagggagaag gaggagggag agcgcaagaa ggaagaaaga 2340

gcaggcaagg aagagaaggg agaggaggag ggcgatcagg gcgaaggaga ggaggaggag 2400gcaggcaagg aagagaaggg agaggagggag ggcgatcagg gcgaaggaga ggaggaggag 2400

acagagggaa ggggagagga gaaggaggag ggaggagagg tcgaaggagg agaagtggag 2460acagagggaa ggggagagga gaaggaggag ggaggagagg tcgaaggagg agaagtggag 2460

gagggcaagg gcgaaagaga agaggaggag gaggaaggcg agggcgaaga agaggagggc 2520gagggcaagg gcgaaagaga agaggaggag gaggaaggcg agggcgaaga agaggagggc 2520

gagggcgagg aagaagaggg cgagggcgaa gaggaagaag gcgagggcaa gggcgaggag 2580gagggcgagg aagaagaggg cgagggcgaa gaggaagaag gcgagggcaa gggcgaggag 2580

gagggcgaag aaggcgaagg ggaggaggag ggcgaagagg gagagggcga gggcgaggag 2640gagggcgaag aaggcgaagg ggaggaggag ggcgaagagg gagagggcga gggcgaggag 2640

gaagaaggcg aaggcgaagg cgaagaagaa ggagaaggag agggcgaaga ggaggaaggc 2700gaagaaggcg aaggcgaagg cgaagaagaa ggagaaggag agggcgaaga ggaggaaggc 2700

gaaggagaag gagaggagga aggagaaggg gagggcgaag aggaggaggg agaaggcaag 2760gaaggagaag gagaggagga aggagaaggg gagggcgaag aggaggaggg agaaggcaag 2760

ggagaagaag aaggcgaaga aggcgaggga gaaggcgagg aagaagaagg cgagggagag 2820ggagaagaag aaggcgaaga aggcgaggga gaaggcgagg aagaagaagg cgagggag 2820

ggagaggacg gcgaaggcga gggcgaggaa gaggaaggag agtgggaggg cgaggaagag 2880ggagaggacg gcgaaggcga gggcgaggaa gaggaaggag agtgggaggg cgaggaagag 2880

gagggagaag gagaaggcga agaagaaggg gaaggagagg gcgaggaagg agaaggcgaa 2940gagggagaag gagaaggcga agaagaaggg gaaggagagg gcgaggaagg agaaggcgaa 2940

ggcgaagagg aggaggggga aggggagggc gaggaggaag agggagaaga ggaaggcgaa 3000ggcgaagagg aggaggggga aggggagggc gaggaggaag agggaaga ggaaggcgaa 3000

gaagagggag aaggcgaaga ggaaggagaa ggcgagggag aagaagagga ggagggcgag 3060gaagaggggag aaggcgaaga ggaaggagaa ggcgagggag aagaagagga ggagggcgag 3060

gtcgaaggcg aggtggaggg cgaagagggg gaaggcgaag gcgaggagga ggaaggggaa 3120gtcgaaggcg aggtggaggg cgaagagggg gaaggcgaag gcgaggagga ggaaggggaa 3120

gaagaaggcg aggagagaga gaaagaaggc gagggcgagg agaacagaag gaatcgcgaa 3180gaagaaggcg aggagagaga gaaagaaggc gaggcgagg agaacagaag gaatcgcgaa 3180

gaagaagagg aagaagaggg caagtaccag gagacaggcg aggaggagaa cgagcggcag 3240gaagaagagg aagaagaggg caagtaccag gagacaggcg aggaggagaa cgagcggcag 3240

gatggcgagg agtataagaa ggtgtccaag atcaagggct ctgtgaagta cggcaagcac 3300gatggcgagg agtataagaa ggtgtccaag atcaagggct ctgtgaagta cggcaagcac 3300

aagacctatc agaagaagag cgtgaccaac acacagggca atggcaagga gcagcgcagc 3360aagacctatc agaagaagag cgtgaccaac acaagggca atggcaagga gcagcgcagc 3360

aagatgcctg tgcagtccaa gcggctgctg aagaatggcc cctctgggag caagaagttt 3420aagatgcctg tgcagtccaa gcggctgctg aagaatggcc cctctgggag caagaagttt 3420

tggaataatg tcctgccaca ctacctggag ctgaaatga 3459tggaataatg tcctgccaca ctacctggag ctgaaatga 3459
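
The listing above is laid out as whitespace-separated blocks of bases followed by a cumulative residue count at the end of each line. The following Python sketch is one way to join such lines into a single string while checking each running count; it is illustrative only and not part of the application. The function name `parse_listing` and its `start` parameter are hypothetical, and the line layout it expects is an assumption based on the lines shown here.

```python
def parse_listing(lines, start=0):
    """Join the residue blocks of a sequence listing into one string.

    Assumes each non-blank line holds whitespace-separated blocks of
    lowercase bases followed by the cumulative residue count reached at
    the end of that line. `start` is the number of residues that precede
    the first line passed in (0 for a complete listing).
    """
    total = start
    parts = []
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue  # blank separator line
        *blocks, count = fields
        bases = "".join(blocks)
        # Reject lines with no bases, non-acgt characters, or a non-numeric count.
        if not bases or set(bases) - set("acgt") or not count.isdigit():
            raise ValueError(f"unrecognized listing line: {raw!r}")
        total += len(bases)
        if total != int(count):
            raise ValueError(
                f"running count {count} does not match {total} residues"
            )
        parts.append(bases)
    return "".join(parts)


# The last two lines of the listing, which follow residue 3360.
tail = [
    "aagatgcctg tgcagtccaa gcggctgctg aagaatggcc cctctgggag caagaagttt 3420",
    "",
    "tggaataatg tcctgccaca ctacctggag ctgaaatga 3459",
]
seq = parse_listing(tail, start=3360)
print(len(seq), seq[-3:])  # prints: 99 tga
```

The final count of 3459 is a multiple of three and the sequence ends in the stop codon tga, which is consistent with a complete open reading frame.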

Claims

1. A nucleic acid encoding the human retinitis pigmentosa GTPase regulator (RPGR) protein of SEQ ID NO:2 and codon-optimized for expression in humans, said nucleic acid comprising the nucleotide sequence shown in SEQ ID NO:1 or a nucleotide sequence having at least 95% identity to it, wherein said nucleic acid is expressed at a higher level than the wild-type RPGR nucleotide sequence of SEQ ID NO:3 in an otherwise identical cell.

2. The nucleic acid according to claim 1, wherein the nucleotide sequence has a codon adaptation index of at least 0.89.

3. The nucleic acid according to claim 1, comprising the nucleotide sequence shown in SEQ ID NO:1.

4. An expression cassette comprising a nucleic acid according to any one of claims 1 to 3 and an expression control sequence operatively linked to and heterologous to the nucleic acid sequence.

5. The expression cassette of claim 4, wherein the expression control sequence is a constitutive promoter.

6. The expression cassette of claim 4, wherein the expression control sequence is a promoter that directs the preferential expression of the nucleic acid in rods and cones, preferably a human G protein-coupled receptor rhodopsin kinase 1 (hGRK) promoter, comprising a nucleotide sequence as shown in SEQ ID NO:4 or a sequence having at least 90%, at least 95%, or at least 98% identity with it.

7. The expression cassette of claim 6, comprising from 5' to 3': (a) an AAV2 terminal repeat, (b) an hGRK promoter, (c) a codon-optimized RPGRorf15 gene of SEQ ID NO:1, (d) an SV40 polyadenylation sequence, and (e) an AAV2 terminal repeat.

8. The expression cassette of claim 7, wherein the 5' AAV2 terminal repeat has a nucleotide sequence as shown in SEQ ID NO:6 and/or wherein the hGRK promoter has a nucleotide sequence as shown in SEQ ID NO:4 and/or wherein the SV40 polyadenylation sequence has a nucleotide sequence as shown in SEQ ID NO:8 and/or wherein the 3' AAV2 terminal repeat has a nucleotide sequence as shown in SEQ ID NO:7.

9. The expression cassette of claim 8, comprising or consisting of the nucleotide sequence of SEQ ID NO:5 or a sequence having at least 90%, at least 95%, or at least 98% identity with it.

10. A vector comprising the nucleic acid according to any one of claims 1 to 3 or the expression cassette according to any one of claims 4 to 9.

11. The vector of claim 10, wherein the vector is a recombinant adeno-associated virus (rAAV) vector.

12. The vector of claim 11, wherein the rAAV vector comprises an AAV capsid of serotype 2, 5 or 8 or a variant thereof.

13. The vector of claim 12, wherein the rAAV vector comprises an AAV2 capsid or a variant thereof.

14. The vector of claim 13, wherein the rAAV vector comprises an AAV2 capsid variant containing a capsid protein comprising or consisting of the sequence of SEQ ID NO:9.

15. The vector according to any one of claims 11-14, wherein the rAAV vector comprises nucleic acid, the nucleic acid comprising from 5' to 3': (a) an AAV2 terminal repeat, (b) an hGRK promoter, (c) a codon-optimized RPGRorf15 gene of SEQ ID NO:1, and (d) an AAV2 terminal repeat.

16. The vector of claim 15, wherein the 5' AAV2 terminal repeat has a nucleotide sequence as shown in SEQ ID NO:6 and/or wherein the hGRK promoter has a nucleotide sequence as shown in SEQ ID NO:4 and/or wherein the SV40 polyadenylation sequence has a nucleotide sequence as shown in SEQ ID NO:8 and/or wherein the 3' AAV2 terminal repeat has a nucleotide sequence as shown in SEQ ID NO:7.

17. The vector of claim 16, wherein the rAAV vector comprises nucleic acid, the nucleic acid comprising the nucleotide sequence of SEQ ID NO:5 or a sequence having at least 90%, at least 95%, or at least 98% identity with it.

18. The vector of claim 17, wherein the rAAV vector comprises (i) a capsid containing a capsid protein comprising or consisting of the sequence of SEQ ID NO:9, and (ii) a nucleic acid comprising or consisting of the nucleotide sequence of SEQ ID NO:5.

19. A host cell comprising the nucleic acid according to any one of claims 1 to 3 or the expression cassette according to any one of claims 4 to 9.

20. The host cell of claim 19, wherein the host cell is a mammalian cell.

21. The host cell of claim 19 or 20, wherein the host cell is a CHO cell, HEK293 cell, HEK293T cell, HeLa cell, BHK21 cell, or Vero cell and/or wherein the host cell is grown in suspension culture or cell culture chamber culture and/or wherein the host cell is a photoreceptor cell, retinal ganglion cell, glial cell, bipolar cell, amacrine cell, horizontal cell, or retinal pigment epithelial cell.

22. A method for treating XLRP in a subject in need thereof, comprising administering to the subject a therapeutically effective amount of a nucleic acid according to any one of claims 1-3, an expression cassette according to any one of claims 4-9, or a vector according to any one of claims 10-18.

23. A method for treating XLRP in a subject in need thereof, comprising administering an infectious rAAV to the subject, the infectious rAAV comprising (i) an AAV capsid and (ii) a nucleic acid, the nucleic acid comprising from 5' to 3': (a) an AAV2 terminal repeat, (b) an hGRK promoter, (c) a codon-optimized RPGRorf15 gene of SEQ ID NO:1, (d) an SV40 polyadenylation sequence, and (e) an AAV2 terminal repeat.

24. The method of claim 23, wherein the 5' AAV2 terminal repeat has a nucleotide sequence as shown in SEQ ID NO:6 and/or wherein the hGRK promoter has a nucleotide sequence as shown in SEQ ID NO:4 and/or wherein the SV40 polyadenylation sequence has a nucleotide sequence as shown in SEQ ID NO:8 and/or wherein the 3' AAV2 terminal repeat has a nucleotide sequence as shown in SEQ ID NO:7.

25. The method according to claim 23 or 24, wherein the rAAV comprises (i) a capsid containing a capsid protein comprising or consisting of the sequence of SEQ ID NO:9, and (ii) a nucleic acid comprising or consisting of the nucleotide sequence of SEQ ID NO:5.

26. The method according to any one of claims 22-25, wherein the nucleic acid or vector is administered to the subject via periocular, intravitreal, choroidal, or subretinal injection and/or wherein the vector is administered to the subject at a dose of approximately 10¹⁰ vector genomes (vg)/eye to approximately 10¹³ vg/eye, preferably approximately 1 × 10¹¹ vg/eye to approximately 5 × 10¹² vg/eye, more preferably at a dose of approximately 3 × 10¹¹ vg/eye or at a dose of approximately 1 × 10¹² vg/eye.

27. A nucleic acid according to any one of claims 1-3, an expression cassette according to any one of claims 4-9, or a vector according to any one of claims 10-18 for the treatment of XLRP.

28. A nucleic acid according to any one of claims 1-3, an expression cassette according to any one of claims 4-9, or a vector according to any one of claims 10-18 for manufacturing a medicament for treating XLRP.

29. A nucleic acid, expression cassette, or vector for use according to claim 27 or 28, wherein the nucleic acid or vector is administered by periocular, intravitreal, choroidal, or subretinal injection and/or wherein the vector is administered at a dose of approximately 10¹⁰ vector genomes (vg)/eye to approximately 10¹³ vg/eye, preferably approximately 1 × 10¹¹ vg/eye to approximately 5 × 10¹² vg/eye, more preferably at a dose of approximately 3 × 10¹¹ vg/eye or at a dose of approximately 1 × 10¹² vg/eye.

30. An infectious rAAV for treating XLRP, comprising (i) an AAV capsid and (ii) a nucleic acid, said nucleic acid comprising from 5' to 3': (a) an AAV2 terminal repeat, (b) an hGRK promoter, (c) a codon-optimized RPGRorf15 gene of SEQ ID NO:1, (d) an SV40 polyadenylation sequence, and (e) an AAV2 terminal repeat.

31. An infectious rAAV for manufacturing a medicament for treating XLRP, comprising (i) an AAV capsid and (ii) a nucleic acid, said nucleic acid comprising from 5' to 3': (a) an AAV2 terminal repeat, (b) an hGRK promoter, (c) a codon-optimized RPGRorf15 gene of SEQ ID NO:1, (d) an SV40 polyadenylation sequence, and (e) an AAV2 terminal repeat.

32. The infectious rAAV according to claim 30 or 31, wherein the 5' AAV2 terminal repeat has a nucleotide sequence as shown in SEQ ID NO:6 and/or wherein the hGRK promoter has a nucleotide sequence as shown in SEQ ID NO:4 and/or wherein the SV40 polyadenylation sequence has a nucleotide sequence as shown in SEQ ID NO:8 and/or wherein the 3' AAV2 terminal repeat has a nucleotide sequence as shown in SEQ ID NO:7.

33. The infectious rAAV of claim 32, wherein the rAAV comprises (i) a capsid containing a capsid protein comprising or consisting of the sequence of SEQ ID NO:9, and (ii) a nucleic acid comprising or consisting of the nucleotide sequence of SEQ ID NO:5.

34. An infectious rAAV for use according to any one of claims 30-33, wherein the rAAV is administered by intravitreal injection and/or wherein the vector is administered at a dose of about 10¹⁰ vector genomes (vg)/eye to about 10¹³ vg/eye, preferably about 1 × 10¹¹ vg/eye to about 5 × 10¹² vg/eye, more preferably at a dose of about 3 × 10¹¹ vg/eye or at a dose of about 1 × 10¹² vg/eye.

35. A method for treating a disease or condition mediated by reduced RPGRorf15 levels in a mammal, the method comprising administering to the mammal a therapeutically effective amount of a nucleic acid according to any one of claims 1-3, an expression cassette according to any one of claims 4-9, or a vector according to any one of claims 10-18.

36. A method for increasing RPGRorf15 levels in a mammal, the method comprising administering to the mammal a nucleic acid according to any one of claims 1-3, an expression cassette according to any one of claims 4-9, or a vector according to any one of claims 10-18.

37. A pharmaceutical composition comprising a nucleic acid according to any one of claims 1-3, an expression cassette according to any one of claims 4-9, or a vector according to any one of claims 10-18, and at least one pharmaceutically acceptable excipient.

38. A pharmaceutical composition comprising an infectious rAAV, said infectious rAAV comprising (i) an AAV capsid and (ii) a nucleic acid, said nucleic acid comprising from 5' to 3': (a) an AAV2 terminal repeat, (b) an hGRK promoter, (c) a codon-optimized RPGRorf15 gene of SEQ ID NO:1, (d) an SV40 polyadenylation sequence, and (e) an AAV2 terminal repeat.

39. The pharmaceutical composition of claim 38, wherein the 5' AAV2 terminal repeat has a nucleotide sequence as shown in SEQ ID NO:6 and/or wherein the hGRK promoter has a nucleotide sequence as shown in SEQ ID NO:4 and/or wherein the SV40 polyadenylation sequence has a nucleotide sequence as shown in SEQ ID NO:8 and/or wherein the 3' AAV2 terminal repeat has a nucleotide sequence as shown in SEQ ID NO:7.

40. The pharmaceutical composition of claim 39, wherein the rAAV comprises (i) a capsid containing a capsid protein comprising or consisting of the sequence of SEQ ID NO:9, and (ii) a nucleic acid comprising or consisting of the nucleotide sequence of SEQ ID NO:5.

41. The pharmaceutical composition according to any one of claims 38-40, wherein the pharmaceutical composition comprises 10⁹ vg to 10¹⁴ vg of rAAV, preferably 10¹⁰ vg to 10¹³ vg of rAAV, more preferably comprising about 3 × 10¹¹ vg or about 1 × 10¹² vg of rAAV.
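
Claim 2 recites a codon index threshold of at least 0.89. Assuming that index is the codon adaptation index (CAI) in the usual Sharp and Li sense, it is the geometric mean of per-codon relative adaptiveness weights, where each weight is the frequency of a codon divided by the frequency of the most-used synonymous codon in a reference gene set. The Python sketch below illustrates that calculation only; it is not taken from the application. The function name `codon_adaptation_index` is hypothetical, and the weight table is a toy placeholder, not real human codon-usage data.

```python
import math


def codon_adaptation_index(cds, weights):
    """Geometric mean of per-codon relative adaptiveness weights.

    `weights` maps a codon to w = (frequency of that codon) divided by
    (frequency of the most-used synonymous codon) in a reference gene set.
    Codons missing from `weights` are skipped, and all weights are assumed
    to be greater than zero.
    """
    cds = cds.lower()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    logs = [
        math.log(weights[cds[i:i + 3]])
        for i in range(0, usable, 3)
        if cds[i:i + 3] in weights
    ]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


# Illustrative weights only, NOT real human codon-usage values.
toy_weights = {"gag": 1.0, "gaa": 0.5, "ggc": 1.0, "gga": 0.8}

# Codons gag, ggc, gaa, gga give weights 1.0, 1.0, 0.5, 0.8.
cai = codon_adaptation_index("gagggcgaagga", toy_weights)
meets_threshold = cai >= 0.89  # threshold recited in claim 2
```

With a real reference table, a sequence built entirely from the most-used codon for each amino acid would score 1.0, and every less-used codon pulls the index down.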