
WO2019089910A1 - Highly compact cas9-based transcriptional regulators for in vivo gene regulation - Google Patents

Info

Publication number
WO2019089910A1
WO2019089910A1 PCT/US2018/058685 US2018058685W WO2019089910A1 WO 2019089910 A1 WO2019089910 A1 WO 2019089910A1 US 2018058685 W US2018058685 W US 2018058685W WO 2019089910 A1 WO2019089910 A1 WO 2019089910A1
Authority
WO
WIPO (PCT)
Prior art keywords
seq
modified
cas9 protein
aav vector
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2018/058685
Other languages
French (fr)
Inventor
Renzhi HAN
Li Xu
Yandi GAO
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ohio State Innovation Foundation
Original Assignee
Ohio State Innovation Foundation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ohio State Innovation Foundation
Publication of WO2019089910A1
Legal status: Ceased

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14: Hydrolases (3)
    • C12N9/16: Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22: Ribonucleases [RNase]; Deoxyribonucleases [DNase]
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K19/00: Hybrid peptides, i.e. peptides covalently bound to nucleic acids, or non-covalently bound protein-protein complexes
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/11: DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113: Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/87: Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90: Stable introduction of foreign DNA into chromosome
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K: PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00: Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00: Structure or type of the nucleic acid
    • C12N2310/10: Type of nucleic acid
    • C12N2310/20: Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPR]
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00: Structure or type of the nucleic acid
    • C12N2310/30: Chemical structure
    • C12N2310/35: Nature of the modification
    • C12N2310/351: Conjugate
    • C12N2310/3519: Fusion with another nucleic acid
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2750/00: ssDNA viruses
    • C12N2750/00011: Details
    • C12N2750/14011: Parvoviridae
    • C12N2750/14111: Dependovirus, e.g. adenoassociated viruses
    • C12N2750/14141: Use of virus, viral particle or viral elements as a vector
    • C12N2750/14143: Use of virus, viral particle or viral elements as a vector; viral genome or elements thereof as genetic vector

Definitions

  • This invention relates to gene regulation, and more particularly to Cas9-based systems for gene regulation.
  • CRISPR Clustered regularly interspaced short palindromic repeats
  • CRISPR/Cas9 A simple version of the CRISPR/Cas system, CRISPR/Cas9, has been engineered to edit genomes. Delivery of the Cas9 nuclease complexed with a synthetic guide RNA (gRNA) enables the cutting of a cell's genome at a desired location, allowing existing genes to be removed and/or new ones to be added. Cas9 has also been engineered as a nuclease-inactive, programmable transcription factor with the ability to up- or down-regulate a gene of interest. In these applications, point mutations that inactivate the nuclease functionality are introduced into Cas9.
  • gRNA synthetic guide RNA
  • nuclease-inactive Cas9 is unable to cleave DNA, but retains the ability to target specific DNA sequences via the gRNA.
  • Effector domains for transcriptional regulation can be bound to the nuclease-inactive Cas9 protein, to the gRNA, or to an RNA aptamer that binds the gRNA.
  • These constructs are highly efficient in turning on or off the gene of interest in cultured cells.
  • clinical translation of nuclease-inactive Cas9 transcription factor technology has stalled due to the large sizes of the Cas9 fusions, which are beyond the packaging capacity of conventional adeno-associated virus (AAV) vectors used for gene therapy.
  • AAV adeno-associated virus
  • AAV is a naturally occurring replication defective non-pathogenic virus with a single stranded DNA genome.
  • AAV vectors have a favorable safety profile and are capable of achieving persistent transgene expression. Long-term expression is predominantly mediated by episomally retained AAV genomes. More than 90% of the stably transduced vector genomes are extra-chromosomal, mostly organized as high-molecular-weight concatemers. Therefore, the risk of insertional oncogenesis is minimal.
  • the major limitation of AAV vectors is the limited packaging capacity of the vector particles, constraining the size of the transgene expression cassette to obtain functional vectors.
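To make the size constraint concrete, the following minimal sketch sums the lengths of the parts of one expression cassette and compares the total with a packaging limit. The limit of roughly 4.7 kb is a commonly cited approximation rather than a figure from this document, and every part name and part length below is a hypothetical round number chosen only for illustration.

```python
# Illustrative sketch only: the capacity and every part length below are
# assumed round numbers, not values taken from this document.
AAV_CAPACITY_BP = 4700  # commonly cited approximate AAV packaging limit


def cassette_length(parts):
    """Return the total length in base pairs of a dict of part -> length."""
    return sum(parts.values())


def fits_in_aav(parts, capacity_bp=AAV_CAPACITY_BP):
    """True if the summed part lengths do not exceed the packaging limit."""
    return cassette_length(parts) <= capacity_bp


# Hypothetical cassette built around a full-length Cas9 fusion.
full_length = {"promoter": 500, "cas9_fusion": 4400, "polyA": 200}
# Hypothetical cassette built around a truncated Cas9 fusion.
compact = {"promoter": 500, "cas9_fusion": 3700, "polyA": 200}

print(cassette_length(full_length), fits_in_aav(full_length))
print(cassette_length(compact), fits_in_aav(compact))
```

With these assumed numbers, the full-length cassette exceeds the limit while the truncated one fits, which is the motivation for shrinking the Cas9 component.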
  • the systems include a modified Cas9 protein having an HNH domain deletion, a guide RNA bound to the modified Cas9 protein and including a target sequence, and at least one effector domain.
  • the at least one effector domain is bound to the modified Cas9 protein, the guide RNA, or both.
  • the systems are highly compact and can be packaged into AAV for in vivo applications.
  • An adeno-associated virus (AAV) vector system for in vivo delivery of a Cas9-based gene expression modulation system includes a first expression cassette including a polynucleotide that encodes the modified Cas9 protein, a second expression cassette including a polynucleotide that encodes the guide RNA, and a third expression cassette including a polynucleotide that encodes the effector domain.
  • the first, second, and third expression cassettes are located on one or more adeno-associated virus (AAV) vectors.
  • Methods of modulating gene expression include introducing a system for modulating gene expression into a cell, hybridizing the target sequence to a target gene, directing the effector domain to the target gene, and modulating the transcription of the target gene.
  • FIGS 1A-1F Engineering a smaller version of transcription activator based on SpCas9.
  • the three components include the full-length dCas9 (D10A/H840A) fused with VP64 (Component-I), MS2 aptamer modified gRNA (Component-II), and MCP-VP64-p65-HSF1 (Component-III).
  • 3-SAM/MS2gRNA the 3-component SAM system reported by Konermann et al., 2015 with MS2-modified FST-specific gRNA
  • 2-SAM/MS2gRNA the 2-component SAM system disclosed herein with MS2-modified FST-specific gRNA
  • 2-SAM/MS2ogRNA the 2-component SAM system disclosed herein with MS2-modified FST-specific gRNA carrying an optimized gRNA scaffold, which was optimized by replacing a thymine residue with a cytosine residue.
  • ΔHNH-VP64/control ΔHNH-VP64 fusion construct plus control gRNA
  • dCas9-VP64/ogRNA dCas9-VP64 fusion construct plus optimized MS2-modified gRNA for FST
  • ΔHNH-VP64/ogRNA ΔHNH-VP64 fusion construct plus optimized MS2-modified gRNA for FST
  • ΔHNH/ogRNA ΔHNH construct plus optimized MS2-modified gRNA for FST
  • ΔHNHΔREC2/ogRNA ΔHNH was further truncated to delete the REC2 domain.
  • the transcription activator with the 2-component ΔHNH fused with or without VP64 is also highly active for the ASCL1 gene in HEK293 cells.
  • FIGS 2A-2C The transcription repressor with ΔHNH fused with or without the KRAB domain efficiently knocks down the expression of GFP.
  • FIG. 3 qRT-PCR analysis of HBG1 expression in HEK293 cells transfected with different versions of transcription activators based on SpCas9 (SpΔHNH) or SaCas9 (SaΔHNH). Note that the SpΔHNH system has two vectors while the SaΔHNH system is in an all-in-one vector.
  • a cell includes a plurality of cells, including mixtures thereof.
  • "Comprising" means that the compositions and methods include the recited elements, but do not exclude others.
  • "Consisting essentially of," when used to define compositions and methods, shall mean excluding other elements of any essential significance to the combination.
  • a composition consisting essentially of the elements as defined herein would not exclude trace contaminants from the isolation and purification method and pharmaceutically acceptable carriers, such as phosphate buffered saline, preservatives, and the like.
  • "Consisting of" shall mean excluding more than trace elements of other ingredients and substantial method steps for administering the compositions of this invention. Embodiments defined by each of these transition terms are within the scope of this invention.
  • polynucleotide refers to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. Polynucleotides may have any three-dimensional structure, and may perform any function, known or unknown. Polynucleotide can refer to a sequence of genomic, synthetic, or recombinant origin and may be double-stranded or single-stranded, whether representing the sense or anti-sense strand.
  • polynucleotides a gene or gene fragment, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, genomic DNA, synthetic DNA, nucleic acid probes, and primers.
  • the polynucleotide may contain chemical modifications.
  • a polynucleotide may comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs.
  • modifications to the nucleotide structure may be imparted before or after assembly of the polymer.
  • the sequence of nucleotides may be interrupted by non-nucleotide components.
  • a polynucleotide may be further modified after polymerization, such as by conjugation with a labeling component.
  • the term also refers to both double- and single-stranded molecules. Unless otherwise specified or required, any embodiment of this invention that is a polynucleotide encompasses both the double- stranded form and each of two complementary single-stranded forms known or predicted to make up the double-stranded form.
  • a polynucleotide is composed of a specific sequence of four nucleotide bases: adenine (A); cytosine (C); guanine (G); thymine (T); and uracil (U) for thymine (T) when the polynucleotide is RNA.
  • the term "polynucleotide sequence" is the alphabetical representation of a polynucleotide molecule.
  • This alphabetical representation can be input into databases in a computer having a central processing unit and used for bioinformatics applications such as functional genomics and homology searching.
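As a small illustration of working with this alphabetical representation in software, the sketch below derives the complementary strand of a DNA string and the RNA form of the same sequence (uracil in place of thymine). The function names and the example sequence are hypothetical and are not part of the disclosure.

```python
# Minimal sketch: treat a polynucleotide as a string over A, C, G, T (DNA)
# or A, C, G, U (RNA). Function names and the example are hypothetical.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the complementary strand, written 5' to 3'."""
    return dna.upper().translate(_DNA_COMPLEMENT)[::-1]


def to_rna(dna):
    """Return the RNA form of a DNA sequence (U replaces T)."""
    return dna.upper().replace("T", "U")


example = "ATGC"
print(reverse_complement(example))  # GCAT
print(to_rna(example))              # AUGC
```

Together, a sequence and its reverse complement describe both single-stranded forms of a double-stranded molecule.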
  • the terms "identical" or percent "identity," in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same (i.e., about 60% identity, preferably 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or higher identity over a specified region when compared and aligned for maximum correspondence over a comparison window or designated region) as measured using the BLAST or BLAST 2.0 sequence comparison algorithms with default parameters described below, or by manual alignment and visual inspection.
  • sequences are then said to be “substantially identical.”
  • This definition also refers to, or may be applied to, the complement of a test sequence.
  • the definition also includes sequences that have deletions and/or additions, as well as those that have substitutions.
  • the preferred algorithms can account for gaps and the like.
  • identity exists over a region that is at least about 10 amino acids or 20 nucleotides in length, or more preferably over a region that is 10-50 amino acids or 20-50 nucleotides in length.
  • percent (%) amino acid sequence identity is defined as the percentage of amino acids in a candidate sequence that are identical to the amino acids in a reference sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity.
  • Alignment for purposes of determining percent sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN, ALIGN-2 or Megalign (DNASTAR) software. Appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full-length of the sequences being compared can be determined by known methods.
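The percent identity calculation can be sketched as follows for two sequences that have already been aligned (for example by one of the tools listed above) and padded with gap characters. This is a simplified illustration, not the BLAST algorithm: it assumes the alignment is supplied as two equal-length strings with "-" for gaps, and it takes the number of residues in the candidate sequence as the denominator, which is one reading of the definition above. The function name and example sequences are hypothetical.

```python
def percent_identity(candidate_aln, reference_aln, gap="-"):
    """Percent of candidate residues identical to the aligned reference residue.

    Both arguments are rows of an existing pairwise alignment, so they must
    have the same length. Gap positions in the candidate are not counted.
    """
    if len(candidate_aln) != len(reference_aln):
        raise ValueError("aligned sequences must have equal length")
    candidate_residues = sum(1 for c in candidate_aln if c != gap)
    if candidate_residues == 0:
        raise ValueError("candidate sequence contains no residues")
    matches = sum(
        1
        for c, r in zip(candidate_aln, reference_aln)
        if c != gap and c == r
    )
    return 100.0 * matches / candidate_residues


# Hypothetical aligned peptide fragments: 6 of the 7 candidate residues match.
print(round(percent_identity("MKT-AYIA", "MKTWAYLA"), 1))  # 85.7
```

Other conventions divide by the alignment length or by the shorter sequence length, so the denominator should be stated whenever a percentage is reported.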
  • expression level refers to the value of a parameter which measures the extent of expression of a given gene. Said value can be determined by measuring the level of mRNA encoded by the gene of interest or by measuring the amount of protein encoded by said gene.
  • the expression level of the gene can be determined by any method known to the skilled person including determining the level of the mRNA or of the protein encoded by the gene. The art is familiar with quantification methods for nucleotides (e.g., genes, cDNA, mRNA, etc.) as well as proteins, polypeptides, enzymes, etc.
  • In one embodiment, the determination of the expression level of the gene can be carried out by measuring the expression level of the mRNA encoded by the gene.
  • the biological sample may be treated to physically, chemically or mechanically disrupt tissue or cell structure, to release intracellular components into an aqueous or organic solution to prepare nucleic acids for further analysis.
  • the nucleic acids are extracted from the sample by procedures known to the skilled person and commercially available.
  • RNA is then extracted from frozen or fresh samples by any of the methods typical in the art, for example, Sambrook, Fritsch and Maniatis, Molecular Cloning, a laboratory manual, (2nd ed.), Cold Spring Harbor Laboratory Press, New York, (1989). Care is taken to avoid degradation of the RNA during the extraction process. It is understood that the amount or level of a molecule in a sample or specimen need not be determined in absolute terms, but can be determined in relative terms (e.g., when compared to a control or a sham or an untreated sample or specimen).
  • Modulating refers to changing the expression, activity, or amount of a polypeptide or polynucleotide.
  • modulate is meant to alter, by increasing or decreasing.
  • a "modulator" can mean, for example, a composition that can either increase or decrease the expression, activity, or amount of a gene or gene product such as a polypeptide or polynucleotide.
  • the expression, activity, or amount of the polypeptide or polynucleotide can be considered to be modulated if it is different from a "reference value".
  • the "reference value" is the level of the same polypeptide or polynucleotide in a control cell or specimen wherein the expression, activity, or amount of a gene or gene product has not been modulated (for example, no treatment with a modified Cas9 system). Modulation does not have to be complete.
  • the level can be modulated by about a 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% increase as compared to a reference value, or by about a 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% decrease as compared to a reference value (or any amount above, below, or in between).
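The comparison against a reference value can be expressed as a signed percent change. The sketch below is illustrative only: the function names are hypothetical, and the 10% default threshold is an assumption borrowed from the smallest figure listed above, since the text itself treats any difference from the reference value as modulation.

```python
def percent_change(level, reference):
    """Signed percent change of a measured level relative to a reference value."""
    if reference <= 0:
        raise ValueError("reference value must be positive")
    return 100.0 * (level - reference) / reference


def is_modulated(level, reference, threshold_pct=10.0):
    """True if the level differs from the reference by at least threshold_pct.

    The threshold is an assumed cutoff for illustration, not a requirement
    stated in the text.
    """
    return abs(percent_change(level, reference)) >= threshold_pct


# Hypothetical expression levels in arbitrary units.
print(percent_change(3.0, 2.0))   # 50.0  (increase)
print(percent_change(0.5, 2.0))   # -75.0 (decrease)
print(is_modulated(2.1, 2.0))     # False under the assumed 10% cutoff
```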
  • the term "vector" means a DNA construct containing a DNA sequence which is operably linked to a suitable control sequence capable of effecting the expression of the DNA in a suitable host.
  • control sequences include a promoter to effect transcription, an optional operator sequence to control such transcription, a sequence encoding suitable mRNA ribosome binding sites, and sequences which control the termination of transcription and translation.
  • the vector may be a plasmid, a phage particle, or simply a potential genomic insert. Once transformed into a suitable host, the vector may replicate and function independently of the host genome, or may in some instances, integrate into the genome itself.
  • a plasmid is the most commonly used form of vector; however, the invention is intended to include such other forms of vectors which serve equivalent functions and which are, or become, known in the art.
  • an expression cassette refers to a nucleic acid construct, which when introduced into a host cell, results in transcription and/or translation of an RNA or polypeptide, respectively.
  • an expression cassette comprising a promoter operably linked to a second nucleic acid may include a promoter that is heterologous to the second nucleic acid (e.g. polynucleotide) as the result of human manipulation (e.g., by methods described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., (1989) or Current Protocols in Molecular Biology Volumes 1-3, John Wiley & Sons, Inc.).
  • an expression cassette comprising a terminator (or termination sequence) operably linked to a second nucleic acid may include a terminator that is heterologous to the second nucleic acid (e.g. polynucleotide) as the result of human manipulation.
  • the expression cassette comprises a promoter operably linked to a second nucleic acid (e.g. polynucleotide).
  • the expression cassette comprises an endogenous promoter. In some embodiments, the expression cassette comprises an endogenous terminator. In some embodiments, the expression cassette comprises a synthetic (or non-natural) promoter. In some embodiments, the expression cassette comprises a synthetic (or non-natural) terminator.
  • effector domain refers to a polypeptide that acts as a transcription activator or a transcription repressor.
  • VP64, p65, HSF1, and Rta are effector domains that act as transcription activators.
  • KRAB is an effector domain that acts as a transcription repressor.
  • aptamer refers to a nucleic acid that has a specific binding affinity for a target molecule, such as a protein. Like all nucleic acids, a particular nucleic acid ligand may be described by a linear sequence of nucleotides (A, U, T, C and G). Aptamers can be engineered to encode for the complementary sequence of a target protein known to associate with the presence or absence of a specific disease.
  • the gRNA sequence that binds to the modified Cas9 protein can include an aptamer that binds to a particular aptamer-binding protein.
  • the aptamer-binding protein can be fused to an effector domain, such that binding of the aptamer to the aptamer-binding protein brings the effector domain into close proximity with the DNA sequence targeted by the gRNA.
  • hybridization refers to a process of establishing a non-covalent, sequence-specific interaction between two or more complementary strands of nucleic acids into a single hybrid, which in the case of two strands is referred to as a duplex.
  • Target refers to a molecule that has an affinity for a given probe. Targets may be naturally-occurring or man-made molecules. Also, they can be employed in their unaltered state or as aggregates with other species.
  • the systems include a modified Cas9 protein having an HNH domain deletion, a guide RNA bound to the modified Cas9 protein and including a target sequence, and at least one effector domain.
  • the at least one effector domain is bound to the modified Cas9 protein, the guide RNA, or both.
  • the systems are highly compact and can be packaged into AAV for in vivo applications.
  • An adeno-associated virus (AAV) vector system for in vivo delivery of a Cas9-based gene expression modulation system includes a first expression cassette including a polynucleotide that encodes the modified Cas9 protein, a second expression cassette including a polynucleotide that encodes the guide RNA, and a third expression cassette including a polynucleotide that encodes the effector domain.
  • the first, second, and third expression cassettes are located on one or more adeno-associated virus (AAV) vectors.
  • Methods of modulating gene expression include introducing a system for modulating gene expression into a cell, hybridizing the target sequence to a target gene, directing the effector domain to the target gene, and modulating the transcription of the target gene.
  • the modified Cas9 proteins described herein are nuclease-inactive and include a deletion of the HNH domain (for example, the domain encoded by SEQ ID NO: 19, SEQ ID NO: 20, or SEQ ID NO: 21).
  • the removal of the HNH domain enables the production of a more compact Cas9 that can be packaged into the AAV vector systems for in vivo delivery.
  • the modified Cas9 protein can be derived from the Cas9 protein of a number of different species, including, but not limited to, Streptococcus pyogenes (SpCas9), Staphylococcus aureus (SaCas9), and Campylobacter jejuni (CjCas9).
  • An example full length cDNA sequence of SpCas9 is given as SEQ ID NO: 1.
  • An example full length amino acid sequence of SpCas9 is given as SEQ ID NO: 2.
  • An example full length cDNA sequence of SaCas9 is given as SEQ ID NO: 12.
  • An example full length amino acid sequence of SaCas9 is given as SEQ ID NO: 13.
  • An example full length cDNA sequence of CjCas9 is given as SEQ ID NO: 12.
  • An example full length amino acid sequence of CjCas9 is given as SEQ ID NO: 13.
  • the nucleic acid sequence of the modified SpCas9 is SEQ ID NO: 3 (which includes the HNH deletion).
  • nucleic acid sequences encoding the modified SpCas9 are at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 3.
  • the amino acid sequence of the modified SpCas9 is SEQ ID NO: 4.
  • the amino acid sequence of the modified SpCas9 is at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 4.
  • the nucleic acid sequence of modified SaCas9 is SEQ ID NO: 14. In some embodiments, the nucleic acid sequence of modified SaCas9 is at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 14. In some embodiments, the amino acid sequence of modified SaCas9 is SEQ ID NO: 15.
  • the amino acid sequence of the modified SaCas9 is at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 15.
  • the nucleic acid sequence of modified CjCas9 is SEQ ID NO: 22. In some embodiments, the nucleic acid sequence of modified CjCas9 is at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 22. In some embodiments, the amino acid sequence of modified CjCas9 is SEQ ID NO: 23.
  • the amino acid sequence of the modified CjCas9 is at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 23.
  • the systems further include a guide RNA, which is bound to the modified Cas9 protein.
  • the guide RNA includes a DNA binding sequence.
  • the DNA binding sequence is
  • the target sequence is at or adjacent to the gene being targeted for modulation (the target gene).
  • At least one effector domain is fused, directly bound to, or indirectly bound to the modified Cas9 protein, the guide RNA, or both.
  • the effector domain can be an activator domain for increasing transcription of the target gene or a repressor domain for repressing transcription of the target gene.
  • Example activator domains include, but are not limited to VP64, p65, HSF1, and Rta.
  • Example repressor domains include, but are not limited to, KRAB.
  • the guide RNA is operably linked to one or more effector domains via one or more aptamers.
  • the aptamer can be, for example, MS2 or PP7.
  • the effector domain can be fused to an aptamer-binding protein, such as, for example, an MS2-binding protein (MCP) or a PP7 coat protein.
  • MCP MS2-binding protein
  • one or more effector domains can be fused to, directly bound to, or indirectly bound to the modified Cas9 protein.
  • any SpCas9, SaCas9, and CjCas9 can be fused to, directly bound to, or indirectly bound to one or more transcription activator domains, such as, but not limited to, VP64, p65, HSF1, and Rta.
  • any SpCas9, SaCas9, and CjCas9 can be fused to, directly bound to, or indirectly bound to one or more transcription repressor domains, such as, but not limited to, KRAB.
  • AAV vector systems for in vivo delivery of the Cas9-based gene expression modulation system include first, second, and third expression cassettes located on one or more adeno-associated virus (AAV) vectors.
  • the first expression cassette includes a polynucleotide encoding the modified Cas9 protein having the HNH domain deletion.
  • the modified Cas9 protein is nuclease inactive, and can be derived from the Cas9 protein of many different species, including, but not limited to, Streptococcus pyogenes (SpCas9), Staphylococcus aureus (SaCas9), and Campylobacter jejuni (CjCas9).
  • the second expression cassette includes a polynucleotide encoding the guide RNA that includes a DNA binding sequence.
  • the third expression cassette includes a polynucleotide encoding an effector domain, such as, but not limited to, VP64, p65, HSF1, Rta (activators) or KRAB (repressor).
  • the first, second, and third expression cassettes can be comprised on two AAV vectors.
  • the first expression cassette can be comprised on a first AAV vector.
  • the second expression cassette can be comprised on a second AAV vector.
  • the third expression cassette, which encodes for the effector domain, can be comprised on the first AAV vector such that the translated effector domain is fused to, directly bound to, or indirectly bound to the modified Cas9 protein.
  • the third expression cassette can be comprised on the second AAV vector such that the translated effector domain is bound to the guide RNA (for example, via an aptamer on the second expression cassette and an aptamer-binding protein on the third expression cassette).
  • Potential nucleotide sequences for the first vector include SEQ ID NO: 7, SEQ ID NO: 8 and SEQ ID NO: 9. In some embodiments, potential nucleotide sequences of the first vector are at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 7, SEQ ID NO: 8 and/or SEQ ID NO: 9.
  • Potential nucleotide sequences for the second vector include SEQ ID NO: 10 and SEQ ID NO: 11.
  • potential nucleotide sequences of the second vector are at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 10 and/or SEQ ID NO: 11.
  • a fourth expression cassette can be included in the AAV vector system.
  • the fourth expression cassette can include a polynucleotide that encodes for an additional effector (activator or repressor) domain. If the third expression cassette is comprised on either the first or second vector, the fourth expression cassette is comprised on the other of the first or second vector (such that both the first and second AAV vectors encode for effector domains).
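The two-vector arrangement described above can be modeled as a small data structure. The sketch below is purely illustrative: the class names, field names, and the `two_vector_layout` helper are hypothetical, and it encodes only the placement rules stated in the text (Cas9 cassette on the first vector, guide RNA cassette on the second, the effector cassette on either one, and an optional fourth cassette on the other).

```python
from dataclasses import dataclass, field


@dataclass
class Cassette:
    name: str      # "first", "second", "third" or "fourth"
    encodes: str   # what the polynucleotide in the cassette encodes


@dataclass
class AAVVector:
    name: str
    cassettes: list = field(default_factory=list)


def two_vector_layout(effector_on="first", add_fourth=False):
    """Build a two-vector layout following the placement rules in the text."""
    if effector_on not in ("first", "second"):
        raise ValueError("effector_on must be 'first' or 'second'")
    first = AAVVector("first AAV vector", [Cassette("first", "modified Cas9 protein")])
    second = AAVVector("second AAV vector", [Cassette("second", "guide RNA")])
    vectors = {"first": first, "second": second}
    # The third cassette (effector domain) goes on whichever vector is chosen.
    vectors[effector_on].cassettes.append(Cassette("third", "effector domain"))
    if add_fourth:
        # The fourth cassette goes on the other vector, so both carry an effector.
        other = "second" if effector_on == "first" else "first"
        vectors[other].cassettes.append(Cassette("fourth", "additional effector domain"))
    return first, second


for vector in two_vector_layout(effector_on="second", add_fourth=True):
    print(vector.name, [c.name for c in vector.cassettes])
```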
  • the first, second, and third expression cassettes are comprised on the same adeno-associated virus (AAV) vector.
  • potential nucleotide sequences for all-in-one vectors that include a third expression cassette encoding for an activator domain include SEQ ID NO: 16, SEQ ID NO: 17 and SEQ ID NO: 24.
  • potential nucleotide sequences for all-in-one vectors that include a third expression cassette encoding for an activator domain are at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 16, SEQ ID NO: 17 and/or SEQ ID NO: 24.
  • a potential nucleotide sequence for an all-in-one vector that includes a third expression cassette encoding for a repressor domain includes SEQ ID NO: 18.
  • potential nucleotide sequences for all-in-one vectors that include a third expression cassette encoding for a repressor domain are at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 18.
  • the methods include introducing to a cell a system for modulating gene expression.
  • the system includes a modified Cas9 protein including a HNH domain deletion.
  • the modified Cas9 protein is nuclease inactive, and can be derived from the Cas9 protein of a number of different species, including, but not limited to, Streptococcus pyogenes (SpCas9), Staphylococcus aureus (SaCas9), and Campylobacter jejuni (CjCas9).
  • the system further includes a guide RNA bound to the modified Cas9 protein and including a DNA binding sequence, and an effector domain bound to the modified Cas9 protein or the guide RNA.
  • the modified Cas9 protein can be fused to, directly bound to, or indirectly bound to the effector domain.
  • the guide RNA may be operably linked to one or more effector domains (for example, via one or more aptamers).
  • the effector domain can be an activator, such as, but not limited to, VP64, p65, HSF1, and/or Rta, or the effector domain can be a repressor, such as, but not limited to, KRAB.
  • the methods further include hybridizing the DNA binding sequence of the guide RNA to a target sequence and directing the effector domain to the target sequence.
  • directing means bringing the effector domain into the proximity of the target sequence so as to facilitate binding of the effector domain to the target sequence or to a region adjacent the target sequence and the target gene.
  • the methods further include modulating the transcription of a target gene that is at or adjacent to the target sequence. Modulating the transcription of the target gene can include increasing expression levels of the target gene or decreasing expression levels of the target gene.
  • the system is comprised on one or more AAV vectors, as described above. Introducing the system to the cell includes infecting the cell with the AAV comprising the AAV vectors.
  • the modified Cas9 protein is nuclease inactive, and can be derived from the Cas9 protein of a number of different species, including, but not limited to, Streptococcus pyogenes (SpCas9), Staphylococcus aureus (SaCas9), and Campylobacter jejuni (CjCas9).
  • Potential nucleic acid sequences of the modified Cas9 are SEQ ID NO: 3, SEQ ID NO: 14, SEQ ID NO: 22, or nucleic acid sequences that are at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 3, SEQ ID NO: 14, or SEQ ID NO: 22.
  • nucleic acid sequences of the deleted HNH domain are SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, or nucleic acid sequences that are at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21.
  • the CRISPR/Cas9 system has been harnessed for genome editing in the cells of various species.
  • the most commonly used Cas9 are from Streptococcus pyogenes (SpCas9) and
  • the RNA-guided nuclease Cas9 has also been reengineered as a programmable transcription factor to upregulate (activators) or down-regulate (repressors) the expression of a gene of interest.
  • the nuclease-inactive dSpCas9 or dSaCas9 can be directly fused with effector domains such as transcriptional activation domains (such as, but not limited to, VP64, p65, HSF1 or Rta) or transcriptional repression domains (such as, but not limited to, KRAB). Effector domains can also be fused with aptamer-binding proteins (e.g. MS2-binding protein or PP7 coat protein) that bind to aptamers (such as MS2 or PP7 aptamers) on the guide RNA scaffold, thereby bringing the effector domain to the target sequence.
  • HC-Cas9TF highly compact Cas9-based transcriptional factors
  • the traditional approach to engineering a catalytically inactive form of SpCas9 or SaCas9 is to mutate two critical residues to alanine (e.g. D10A/H840A for dSpCas9, and D10A/N580A for dSaCas9). It has been shown that deletion of the HNH domain from SpCas9 (dSpCas9ΔHNH) renders the enzyme inactive but still maintains the RNA-guided DNA binding activity.
  • dSpCas9ΔHNH can be used to activate or repress expression of a gene of interest when the gene-specific gRNA is provided.
  • Deletion of the HNH domain from SaCas9 (dSaCas9ΔHNH) or Campylobacter jejuni Cas9 (dCjCas9ΔHNH) can also be used for gene modulation. Deletion of the HNH domain significantly reduces the size of the fusion constructs, which can thus be packaged into AAV vectors for in vivo applications.
  • 1) a one-vector SaCas9-based transcription activation system, 2) a one-vector CjCas9-based transcription activation system, 3) a two-vector SaCas9-based transcription activation system, 4) a two-vector SpCas9-based transcription activation system, 5) a one-vector SaCas9-based transcription repression system, 6) a one-vector CjCas9-based transcription repression system, and 7) a two-vector SpCas9-based transcription repression system.
  • FIG. 1 shows schematics and data for engineering a smaller version of transcription activator based on SpCas9.
  • SAM three-component SpCas9-based transcription activator
  • the three components include the full-length dCas9 (D10A/H840A) fused with VP64 (Component-I), MS2 aptamer modified gRNA (Component-II), and MCP-VP64-p65-HSF1 (Component-III).
  • ΔHNH-VP64/control ΔHNH-VP64 fusion construct plus control gRNA
  • dCas9-VP64/ogRNA dCas9-VP64 fusion construct plus optimized MS2-modified gRNA for FST
  • ΔHNH-VP64/ogRNA ΔHNH-VP64 fusion construct plus optimized MS2-modified gRNA for FST
  • ΔHNH/ogRNA ΔHNH construct plus optimized MS2-modified gRNA for FST;
  • FIG. 2 shows schematics and data for a transcription repressor in which ΔHNH, fused with or without the KRAB domain, efficiently knocks down the expression of GFP.
  • FIG. 3 shows data from a qRT-PCR analysis of HBG1 expression in HEK293 cells transfected with different versions of transcription activators based on SpCas9 (SpΔHNH) or SaCas9 (SaΔHNH). Note that the SpΔHNH system has two vectors while the SaΔHNH system is in an all-in-one vector.
  • SEQ ID NO: 1 cDNA sequence of full-length SpCas9 (4104bp), HNH domain in lowercase italics
  • SEQ ID NO: 2 Amino-acid sequence of full-length SpCas9, HNH domain in bold italics
  • SEQ ID NO: 5 cDNA sequence of full-length CjCas9 (2955bp), HNH domain encoding sequence in bold italics:
  • SEQ ID NO: 6 Amino-acid sequence of full length dCjCas9ΔHNH (841 aa), linker replacing the HNH domain in bold italics
  • SEQ ID NO: 7 AAV Vector 1, example 1 (ITR-promoter-NLS-dSpCas9ΔHNH-NLS-minipA-ITR), ITR sequences in bold, NLS sequence in bold italics, dSpCas9ΔHNH sequence underlined with linker replacing HNH domain in underlined lowercase, miniPA sequence in grey text
  • SEQ ID NO: 8 AAV Vector 1, example 2 (ITR-promoter-NLS-dSpCas9ΔHNH-VP64-NLS-minipA-ITR), ITR sequences in bold, NLS sequence in bold italics, dSpCas9ΔHNH sequence underlined with linker replacing HNH domain in underlined lowercase, miniPA sequence in grey text, VP64 sequence in grey highlight
  • SEQ ID NO: 9 AAV Vector 1, example 3 (ITR-promoter-NLS-dSpCas9ΔHNH-KRAB-NLS-minipA-ITR), ITR sequences in bold, NLS sequence in bold italics, dSpCas9ΔHNH sequence underlined with linker replacing HNH domain in underlined lowercase, miniPA sequence in grey text, KRAB sequence in grey highlight
  • SEQ ID NO: 10 AAV Vector 2, example 1 (ITR-promoter-MCP-VP64-p65-HSF1-polyA-U6-MS2ogRNA), ITR sequences in bold, MCP in grey highlight, VP64-p65-HSF1 sequence in bold italics, polyA sequence lowercase underlined, MS2ogRNA sequence in grey text, green fluorescent protein sequence uppercase underlined, NLS in underlined italics cctgcaggcagctgcgcgctcgctcgctcactgaggccgcccgggcgtcgggcgacctttggtc gcccggcctcagtgagcgagcgcgcgcagagagggagtggccaactccatcactaggggttc ctgcggcctctagactcgaggcgttgacattgattattgactagttattaa
  • SEQ ID NO: 11 AAV Vector 2, example 2 (ITR-promoter-MCP-KRAB-polyA-U6-MS2ogRNA), ITR sequences in bold, MCP in grey highlight, KRAB sequence in bold italics, polyA sequence underlined, MS2ogRNA sequence in grey text, green fluorescent protein sequence uppercase underlined, NLS in underlined italics cctgcaggcagctgcgcgctcgctcgctcactgaggccgcccgggcgtcgggcgacctttggtc gccggcctcagtgagcgagcgcgcgcagagagggagtggccaactccatcactaggggttc ctgcggcctctagactcgaggcgttgacattgattattgactagttattaatagtaatcaatta cggggtcattagttttt
  • SEQ ID NO: 13 Amino-acid sequence of full length SaCas9, HNH domain sequence in bold
  • SEQ ID NO: 14 cDNA sequence of SaCas9ΔHNH (2697bp), GGSGGS linker replacing the HNH domain in lowercase bold
  • SEQ ID NO: 15 Amino-acid sequence of SaCas9ΔHNH (899aa), linker replacing the HNH domain underlined
  • SEQ ID NO: 16 All-in-one SaCas9ΔHNH-based transcription activator in AAV vector, example 1 (ITR-promoter-NLS-dSaCas9ΔHNH-NLS-VP64-p65-NLS-minipA-U6-SaogRNA-ITR), ITR sequences in bold, NLS sequences in grey highlight, dSaCas9ΔHNH sequence underlined with linker replacing HNH domain in underlined lowercase, VP64-p65 sequence in bold italics, miniPA sequence in grey text cctgcaggcagctgcgcgcgctcgctcactgaggccgcccgggcgtcgggcgacctttggtc gccggcctcagtgagcgagcgcgcgcagagagggagtggccaactccatcactaggggttc ctgcggcctctagactc
  • SEQ ID NO: 17 All-in-one SaCas9ΔHNH-based transcription activator in AAV vector, example 2 (ITR-promoter-NLS-dSaCas9ΔHNH-NLS-VP64-HSF1-NLS-minipA-U6-SaogRNA-ITR), ITR sequences in bold, NLS sequences in grey highlight, dSaCas9ΔHNH sequence underlined with linker replacing HNH domain in underlined lowercase, VP64-HSF1 sequence in bold italics, miniPA sequence in grey text cctgcaggcagctgcgcgcgctcgctcactgaggccgcccgggcgtcgggcgacctttggtc gccggcctcagtgagcgagcgcgcgcagagagggagtggccaactccatcactaggggttc ctgcggcctt
  • SEQ ID NO: 18 All-in-one SaCas9ΔHNH-based transcription repressor in AAV vector (ITR-promoter-NLS-dSaCas9ΔHNH-NLS-KRAB-NLS-minipA-U6-SaogRNA-ITR): ITR sequences in bold, NLS sequences in grey highlight, dSaCas9ΔHNH sequence underlined with linker replacing HNH domain in underlined lowercase, KRAB sequence in bold italics, miniPA sequence in grey text cctgcaggcagctgcgcgctcgctcgctcactgaggccgcccgggcgtcgggcgacctttggtc gccggcctcagtgagcgagcgcgcgcagagagggagtggccaactccatcactaggggttc ctgctgcggcct ct c
  • SEQ ID NO: 19 cDNA sequence of SpCas9 HNH domain actacccagaagggacagaagaacagtagggaaaggatgaagaggattgaagagggtataaaag aactggggtcccaaatccttaaggaacacccagttgaaaacacccagcttcagaatgagaagct ctacctgtactacctgcagaacggcagggacatgtacgtggatcaggaactggacatcaatcgg cgg cgg cgg ctccgactacgacgtggatcatatcgtgtggatcatatcggatcgg ctccgactacgacgtggatcatatcgtgtggatcatatcgtgtggatcatatcgg ctccgactacgacgtggatcatatcgtgtggatcatatcgtgtg
  • SEQ ID NO: 20 cDNA sequence of SaCas9 HNH domain
  • SEQ ID NO: 22 cDNA sequence of dCjCas9ΔHNH (2523bp), linker replacing the HNH domain in bold italics
  • SEQ ID NO: 23 Amino-acid sequence of dCjCas9ΔHNH (841 aa), linker replacing the HNH domain in bold italics
  • SEQ ID NO: 24 All-in-one dCjCas9ΔHNH-based transcription activator in AAV vector (ITR-promoter-HA-NLS-dCjCas9ΔHNH-NLS-VP64-p65-NLS-minipA-U6-SaogRNA-ITR), ITR in lowercase grey highlighting, HA in lowercase italics, NLS in lowercase bold underlining, dCjCas9ΔHNH in uppercase underlining, VP64-p65 in uppercase grey highlighting, minipA in uppercase bold italics, U6 in grey text, SaogRNA in uppercase underlined italics

Abstract

Systems and methods for modulating gene expression are disclosed herein. The systems include a modified Cas9 protein having an HNH domain deletion, a guide RNA bound to the modified Cas9 protein, and at least one effector domain bound to the modified Cas9 protein, the guide RNA, or both. An AAV vector system for delivery of the gene expression modulation system includes a first expression cassette including a polynucleotide that encodes the modified Cas9 protein, a second expression cassette including a polynucleotide that encodes the guide RNA, and a third expression cassette including a polynucleotide that encodes the effector domain. The first, second, and third expression cassettes are located on one or more AAV vectors. Methods of modulating gene expression include introducing a system for modulating gene expression into a cell, hybridizing the guide RNA to a target gene, directing the effector domain to the target gene, and modulating transcription.

Description

HIGHLY COMPACT CAS9-BASED TRANSCRIPTIONAL REGULATORS FOR IN VIVO GENE REGULATION
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of U.S. Provisional Application 62/580,189, filed November 1, 2017, which is incorporated by reference in its entirety for all purposes.
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
[0002] This invention was made with Government Support under Grant Nos. R01HL116546 and R01AR064241 awarded by the National Institutes of Health. The Government has certain rights in the invention.
FIELD
[0003] This invention relates to gene regulation, and more particularly to Cas9-based systems for gene regulation.
BACKGROUND
[0004] Clustered regularly interspaced short palindromic repeats (CRISPR)-Cas systems have drastically advanced the field of programmable DNA-binding tools. CRISPR is a family of DNA sequences in bacteria that contains snippets of DNA from viruses that have attacked the bacterium. In nature, these snippets are used by the bacterium to detect and destroy DNA from further attacks by similar viruses. Thus, the CRISPR/Cas system is a prokaryotic immune system that confers resistance to foreign genetic elements, such as those present within plasmids and phages, and thereby provides a form of acquired immunity.
[0005] A simple version of the CRISPR/Cas system, CRISPR/Cas9, has been engineered to edit genomes. Delivery of the Cas9 nuclease complexed with a synthetic guide RNA (gRNA) enables the cutting of a cell's genome at a desired location, allowing existing genes to be removed and/or new ones added. Cas9 has also been engineered as a nuclease-inactive, programmable transcription factor with the ability to up- or down-regulate a gene of interest. In these applications, point mutations that inactivate the nuclease functionality are introduced to Cas9. Thus, nuclease-inactive Cas9 is unable to cleave DNA, but retains the ability to target specific DNA sequences via the gRNA. Effector domains for transcriptional regulation can be bound to the nuclease-inactive Cas9 protein, to the gRNA, or to an RNA aptamer that binds the gRNA. These constructs are highly efficient in turning on or off the gene of interest in cultured cells. However, clinical translation of nuclease-inactive Cas9 transcription factor technology has stalled due to the large sizes of the Cas9 fusions, which are beyond the packaging capacity of conventional adeno-associated virus (AAV) vectors used for gene therapy.
[0006] AAV is a naturally occurring, replication-defective, non-pathogenic virus with a single-stranded DNA genome. AAV vectors have a favorable safety profile and are capable of achieving persistent transgene expression. Long-term expression is predominantly mediated by episomally retained AAV genomes. More than 90% of the stably transduced vector genomes are extra-chromosomal, mostly organized as high-molecular-weight concatemers. Therefore, the risk of insertional oncogenesis is minimal. The major limitation of AAV vectors is the limited packaging capacity of the vector particles, which constrains the size of the transgene expression cassette that can be used to obtain functional vectors.
SUMMARY
[0007] Systems and methods for modulating gene expression are disclosed herein. The systems include a modified Cas9 protein having an HNH domain deletion, a guide RNA bound to the modified Cas9 protein and including a target sequence, and at least one effector domain. The at least one effector domain is bound to the modified Cas9 protein, the guide RNA, or both. The systems are highly compact and can be packaged into AAV for in vivo applications. An adeno-associated virus (AAV) vector system for in vivo delivery of a Cas9-based gene expression modulation system includes a first expression cassette including a polynucleotide that encodes the modified Cas9 protein, a second expression cassette including a polynucleotide that encodes the guide RNA, and a third expression cassette including a polynucleotide that encodes the effector domain. The first, second, and third expression cassettes are located on one or more adeno-associated virus (AAV) vectors. Methods of modulating gene expression include introducing a system for modulating gene expression into a cell, hybridizing the target sequence to a target gene, directing the effector domain to the target gene, and modulating the transcription of the target gene.
[0008] The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.
DESCRIPTION OF DRAWINGS
[0009] Figures 1A-1F. Engineering a smaller version of transcription activator based on SpCas9. A) Schematic of a three-component SpCas9-based transcription activator, termed SAM. The three components include the full-length dCas9 (D10A/H840A) fused with VP64 (Component-I), MS2 aptamer modified gRNA (Component-II), and MCP-VP64-p65-HSF1 (Component-III). B) Schematic map showing the construct to encode both Component-II and Component-III, which converts the three-component system to a two-component system. C) Domain structures of SpCas9. Full-length SpCas9 has 1368 amino acids, while the HNH deleted SpCas9 has 1224 amino acids. D) Comparison of the transcription activation efficiency of different versions of the SAM systems for activating follistatin (FST) gene in HEK293 cells. 3-SAM/Control = the 3-component SAM system reported by Konermann et al., 2015 (doi: 10.1038/nature14136) with control gRNA; 3-SAM/MS2gRNA = the 3-component SAM system reported by Konermann et al., 2015 with MS2-modified FST-specific gRNA; 2-SAM/MS2gRNA = the 2-component SAM system disclosed herein with MS2-modified FST-specific gRNA; 2-SAM/MS2ogRNA = the 2-component SAM system disclosed herein with MS2-modified FST-specific gRNA carrying an optimized gRNA scaffold, which was optimized by replacing a thymine residue with a cytosine residue. E) Comparison of the transcription activation efficiency of HNH deletion SpCas9-based 2-component transcription activators. ΔHNH-VP64/control = ΔHNH-VP64 fusion construct plus control gRNA; dCas9-VP64/ogRNA = dCas9-VP64 fusion construct plus optimized MS2-modified gRNA for FST; ΔHNH-VP64/ogRNA = ΔHNH-VP64 fusion construct plus optimized MS2-modified gRNA for FST; ΔHNH/ogRNA = ΔHNH construct plus optimized MS2-modified gRNA for FST; ΔHNHΔREC2/ogRNA = ΔHNH was further truncated to delete the REC2 domain. F) The transcription activator with the 2-component ΔHNH fused with or without VP64 is also highly active for ASCL1 gene in HEK293 cells.
[0010] Figures 2A-2C. The transcription repressor with ΔHNH fused with or without KRAB domain efficiently knocks down the expression of GFP. A) Schematic of different transcription repressors using dCas9 or ΔHNH fused with or without KRAB domain. B) Western blotting analysis of GFP expression using dCas9-KRAB, ΔHNH-KRAB or ΔHNH alone with control gRNA or two gRNAs targeting GFP. C) Quantitative analysis of GFP expression.
[0011] Figure 3. qRT-PCR analysis of HBG1 expression in HEK293 cells transfected with different versions of transcription activators based on SpCas9 (SpΔHNH) or SaCas9 (SaΔHNH). Note that the SpΔHNH system has two vectors while the SaΔHNH system is in an all-in-one vector.
DETAILED DESCRIPTION
Definitions
[0012] Terms used throughout this application are to be construed with ordinary and typical meaning to those of ordinary skill in the art. However, Applicant desires that the following terms be given the particular definition as defined below.
[0013] As used in the specification and claims, the singular forms "a," "an," and "the" include plural references unless the context clearly dictates otherwise. For example, the term "a cell" includes a plurality of cells, including mixtures thereof.
[0014] As used herein, the term "comprising" is intended to mean that the compositions and methods include the recited elements, but not excluding others. "Consisting essentially of," when used to define compositions and methods, shall mean excluding other elements of any essential significance to the combination. Thus, a composition consisting essentially of the elements as defined herein would not exclude trace contaminants from the isolation and purification method and pharmaceutically acceptable carriers, such as phosphate buffered saline, preservatives, and the like. "Consisting of" shall mean excluding more than trace elements of other ingredients and substantial method steps for administering the compositions of this invention. Embodiments defined by each of these transition terms are within the scope of this invention.
[0015] The term "polynucleotide" refers to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. Polynucleotides may have any three-dimensional structure, and may perform any function, known or unknown. Polynucleotide can refer to a sequence of genomic, synthetic, or recombinant origin and may be double-stranded or single-stranded, whether representing the sense or anti-sense strand. The following are non-limiting examples of polynucleotides: a gene or gene fragment, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, genomic DNA, synthetic DNA, nucleic acid probes, and primers. The polynucleotide may contain chemical modifications. A polynucleotide may comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs. If present, modifications to the nucleotide structure may be imparted before or after assembly of the polymer. The sequence of nucleotides may be interrupted by non-nucleotide components. A polynucleotide may be further modified after polymerization, such as by conjugation with a labeling component. The term also refers to both double- and single-stranded molecules. Unless otherwise specified or required, any embodiment of this invention that is a polynucleotide encompasses both the double-stranded form and each of two complementary single-stranded forms known or predicted to make up the double-stranded form.
[0016] A polynucleotide is composed of a specific sequence of four nucleotide bases: adenine (A); cytosine (C); guanine (G); thymine (T); and uracil (U) for thymine (T) when the polynucleotide is RNA. Thus, the term "polynucleotide sequence" is the alphabetical representation of a polynucleotide molecule. This alphabetical representation can be input into databases in a computer having a central processing unit and used for bioinformatics applications such as functional genomics and homology searching.
[0017] The terms "identical" or percent "identity," in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same (i.e., about 60% identity, preferably 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or higher identity over a specified region when compared and aligned for maximum correspondence over a comparison window or designated region) as measured using the BLAST or BLAST 2.0 sequence comparison algorithms with default parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI web site or the like). Such sequences are then said to be "substantially identical." This definition also refers to, or may be applied to, the complement of a test sequence. The definition also includes sequences that have deletions and/or additions, as well as those that have substitutions. As described below, the preferred algorithms can account for gaps and the like. Preferably, identity exists over a region that is at least about 10 amino acids or 20 nucleotides in length, or more preferably over a region that is 10-50 amino acids or 20-50 nucleotides in length. As used herein, percent (%) amino acid sequence identity is defined as the percentage of amino acids in a candidate sequence that are identical to the amino acids in a reference sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity. Alignment for purposes of determining percent sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN, ALIGN-2 or Megalign (DNASTAR) software. Appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared, can be determined by known methods.
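As an illustration only, the percent identity definition above can be sketched in a few lines of Python. The sketch assumes the two sequences have already been aligned by some external tool (for example BLAST) and that gaps are written as "-"; the function name `percent_identity`, the gap character, and the choice of the candidate's residue count as the denominator are assumptions made for this example, not part of the disclosure.

```python
def percent_identity(candidate: str, reference: str, gap: str = "-") -> float:
    """Percent of candidate residues identical to the reference.

    Both inputs must be the rows of an existing pairwise alignment
    (equal length, gaps marked with `gap`). This function does not align.
    """
    candidate = candidate.upper()
    reference = reference.upper()
    if len(candidate) != len(reference):
        raise ValueError("aligned sequences must have the same length")

    # Denominator: residues actually present in the candidate (gaps excluded).
    residues = sum(1 for c in candidate if c != gap)
    if residues == 0:
        raise ValueError("candidate contains no residues")

    # Numerator: aligned columns where the candidate residue equals the reference.
    matches = sum(
        1 for c, r in zip(candidate, reference) if c != gap and c == r
    )
    return 100.0 * matches / residues


def meets_threshold(identity: float, threshold: float) -> bool:
    """True if an identity value satisfies an 'at least X%' limitation."""
    return identity >= threshold
```

For a hypothetical aligned pair such as `percent_identity("MKT-LV", "MKTALI")`, four of the five candidate residues match, so the function returns 80.0, which would satisfy an "at least 75%" limitation but not "at least 85%". Other conventions divide by the alignment length or by the reference length, and those give different values for the same alignment.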
[0018] The term "expression level", as used herein, refers to the value of a parameter which measures the extent of expression of a given gene. Said value can be determined by measuring the level of mRNA encoded by the gene of interest or by measuring the amount of protein encoded by said gene. The expression level of the gene can be determined by any method known to the skilled person including determining the level of the mRNA or of the protein encoded by the gene. The art is familiar with quantification methods for nucleotides (e.g., genes, cDNA, mRNA, etc.) as well as proteins, polypeptides, enzymes, etc. In one embodiment, the determination of the expression level of the gene can be carried out by measuring the expression level of the mRNA encoded by the gene. For this purpose, the biological sample may be treated to physically, chemically or mechanically disrupt tissue or cell structure, to release intracellular components into an aqueous or organic solution to prepare nucleic acids for further analysis. The nucleic acids are extracted from the sample by procedures known to the skilled person and commercially available. RNA is then extracted from frozen or fresh samples by any of the methods typical in the art, for example, Sambrook, Fritsch and Maniatis, Molecular Cloning, a laboratory manual, (2nd ed.), Cold Spring Harbor Laboratory Press, New York, (1989). Care is taken to avoid degradation of the RNA during the extraction process. It is understood that the amount or level of a molecule in a sample or specimen need not be determined in absolute terms, but can be determined in relative terms (e.g., when compared to a control or a sham or an untreated sample or specimen).
[0001] "Modulating" refers to changing the expression, activity, or amount of a polypeptide or polynucleotide. By "modulate" is meant to alter, by increasing or decreasing. As used herein, a "modulator" can mean, for example, a composition that can either increase or decrease the expression, activity, or amount of a gene or gene product such as a polypeptide or polynucleotide. The expression, activity, or amount of the polypeptide or polynucleotide can be considered to be modulated if it is different from a "reference value". The "reference value" is the level of the same polypeptide or polynucleotide in a control cell or specimen wherein the expression, activity, or amount of a gene or gene product has not been modulated (for example, no treatment with a modified Cas9 system). Modulation does not have to be complete. For example, the level can be modulated by about a 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, 100% increase as compared to a reference value, or by about a 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, 100% decrease as compared to a reference value (or any amount above, below, or in between). These exemplary values notwithstanding, it is expected that a skilled practitioner can determine cut-off points, etc. that represent a statistically significant difference to determine whether the polypeptide or polynucleotide is modulated.
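The comparison against a reference value described above can be illustrated with a minimal Python sketch. It assumes that the measured level and the reference value are in the same units (for example, relative mRNA abundance) and that the reference is positive; the function names are hypothetical, and the sketch deliberately does not address statistical significance, which the text leaves to the skilled practitioner.

```python
def percent_change(level: float, reference: float) -> float:
    """Signed percent change of `level` relative to `reference`.

    Positive values are increases, negative values are decreases.
    """
    if reference <= 0:
        raise ValueError("reference value must be positive")
    return 100.0 * (level - reference) / reference


def describe_modulation(level: float, reference: float) -> str:
    """Summarize modulation relative to the unmodulated reference value."""
    change = percent_change(level, reference)
    if change > 0:
        return f"{change:.0f}% increase"
    if change < 0:
        return f"{abs(change):.0f}% decrease"
    return "no change"
```

With hypothetical inputs, `describe_modulation(1.5, 1.0)` reports a 50% increase and `describe_modulation(0.25, 1.0)` reports a 75% decrease. A 100% decrease corresponds to a level of zero, whereas an increase is not capped at 100%.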
[0019] The term "vector" means a DNA construct containing a DNA sequence which is operably linked to a suitable control sequence capable of effecting the expression of the DNA in a suitable host. Such control sequences include a promoter to effect transcription, an optional operator sequence to control such transcription, a sequence encoding suitable mRNA ribosome binding sites, and sequences which control the termination of transcription and translation. The vector may be a plasmid, a phage particle, or simply a potential genomic insert. Once transformed into a suitable host, the vector may replicate and function independently of the host genome, or may in some instances, integrate into the genome itself. A plasmid is the most commonly used form of vector, however, the invention is intended to include such other forms of vectors which serve equivalent function as and which are, or become, known in the art.
[0020] The term "expression cassette" refers to a nucleic acid construct, which when introduced into a host cell, results in transcription and/or translation of a RNA or polypeptide, respectively. In embodiments, an expression cassette comprising a promoter operably linked to a second nucleic acid (e.g. polynucleotide) may include a promoter that is heterologous to the second nucleic acid (e.g. polynucleotide) as the result of human manipulation (e.g., by methods described in Sambrook et al., Molecular Cloning— A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., (1989) or Current Protocols in Molecular Biology Volumes 1-3, John Wiley & Sons, Inc. (1994-1998)). In some embodiments, an expression cassette comprising a terminator (or termination sequence) operably linked to a second nucleic acid (e.g. polynucleotide) may include a terminator that is heterologous to the second nucleic acid (e.g. polynucleotide) as the result of human manipulation. In some embodiments, the expression cassette comprises a promoter operably linked to a second nucleic acid (e.g.
polynucleotide) and a terminator operably linked to the second nucleic acid (e.g. polynucleotide) as the result of human manipulation. In some embodiments, the expression cassette comprises an endogenous promoter. In some embodiments, the expression cassette comprises an endogenous terminator. In some embodiments, the expression cassette comprises a synthetic (or non-natural) promoter. In some embodiments, the expression cassette comprises a synthetic (or non-natural) terminator.
[0021] As used herein, "effector domain" refers to a polypeptide that acts as a transcription activator or a transcription repressor. For example, VP64, p65, HSF1, and Rta are effector domains that act as transcription activators. KRAB, for example, is an effector domain that acts as a transcription repressor.
[0022] As used herein, "aptamer" refers to a nucleic acid that has a specific binding affinity for a target molecule, such as a protein. Like all nucleic acids, a particular nucleic acid ligand may be described by a linear sequence of nucleotides (A, U, T, C and G). Aptamers can be engineered to encode for the complementary sequence of a target protein known to associate with the presence or absence of a specific disease. For example, the gRNA sequence that binds to the modified Cas9 protein can include an aptamer that binds to a particular aptamer-binding protein. The aptamer-binding protein can be fused to an effector domain, such that binding of the aptamer to the aptamer-binding protein brings the effector domain into close proximity with the DNA sequence targeted by the gRNA.
[0023] The term "hybridization" refers to a process of establishing a non-covalent, sequence-specific interaction between two or more complementary strands of nucleic acids into a single hybrid, which in the case of two strands is referred to as a duplex.
[0024] The term "target" refers to a molecule that has an affinity for a given probe. Targets may be naturally-occurring or man-made molecules. Also, they can be employed in their unaltered state or as aggregates with other species.
[0025] Systems and methods for modulating gene expression are disclosed herein. The systems include a modified Cas9 protein having an HNH domain deletion, a guide RNA bound to the modified Cas9 protein and including a target sequence, and at least one effector domain. The at least one effector domain is bound to the modified Cas9 protein, the guide RNA, or both. The systems are highly compact and can be packaged into AAV for in vivo applications. An adeno-associated virus (AAV) vector system for in vivo delivery of a Cas9-based gene expression modulation system includes a first expression cassette including a polynucleotide that encodes the modified Cas9 protein, a second expression cassette including a polynucleotide that encodes the guide RNA, and a third expression cassette including a polynucleotide that encodes the effector domain. The first, second, and third expression cassettes are located on one or more adeno-associated virus (AAV) vectors. Methods of modulating gene expression include introducing a system for modulating gene expression into a cell, hybridizing the target sequence to a target gene, directing the effector domain to the target gene, and modulating the transcription of the target gene.
[0026] The modified Cas9 proteins described herein are nuclease-inactive and include a deletion of the HNH domain (for example, the domain encoded by SEQ ID NO: 19, SEQ ID NO: 20, or SEQ ID NO: 21). The removal of the HNH domain enables the production of a more compact Cas9 that can be packaged into the AAV vector systems for in vivo delivery.
[0027] The modified Cas9 protein can be derived from the Cas9 protein of a number of different species, including, but not limited to, Streptococcus pyogenes (SpCas9), Staphylococcus aureus (SaCas9), and Campylobacter jejuni (CjCas9). An example full length cDNA sequence of SpCas9 is given as SEQ ID NO: 1. An example full length amino acid sequence of SpCas9 is given as SEQ ID NO: 2. An example full length cDNA sequence of SaCas9 is given as SEQ ID NO: 12. An example full length amino acid sequence of SaCas9 is given as SEQ ID NO: 13. An example full length cDNA sequence of CjCas9 is given as SEQ ID NO: 5. An example full length amino acid sequence of CjCas9 is given as SEQ ID NO: 13.
[0028] In some embodiments, the nucleic acid sequence of the modified SpCas9 is SEQ ID NO: 3 (which includes the HNH deletion). In some embodiments, nucleic acid sequences encoding the modified SpCas9 are at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 3. In some embodiments, the amino acid sequence of the modified SpCas9 is SEQ ID NO: 4. In some embodiments, the amino acid sequence of the modified SpCas9 is at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 4.
[0029] In some embodiments, the nucleic acid sequence of modified SaCas9 is SEQ ID NO: 14. In some embodiments, the nucleic acid sequence of modified SaCas9 is at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 14. In some embodiments, the amino acid sequence of modified SaCas9 is SEQ ID NO: 15. In some embodiments, the amino acid sequence of the modified SaCas9 is at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 15.
[0030] In some embodiments, the nucleic acid sequence of modified CjCas9 is SEQ ID NO: 22. In some embodiments, the nucleic acid sequence of modified CjCas9 is at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 22. In some embodiments, the amino acid sequence of modified CjCas9 is SEQ ID NO: 23. In some embodiments, the amino acid sequence of the modified CjCas9 is at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 23.
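The percent-identity thresholds recited above can be illustrated with a minimal sketch. The specification does not state how identity is to be computed, so the following assumes two sequences that have already been aligned to equal length (gaps written as "-") and counts matches over ungapped columns only; the function names and the 20-nucleotide example are illustrative assumptions, not part of the disclosure.

```python
# Thresholds as recited in the paragraphs above.
THRESHOLDS = (50, 60, 75, 80, 85, 90, 95, 97.5, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: the inputs are pre-aligned; this is one of several conventions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are ignored under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def thresholds_met(identity: float) -> list:
    """Return every recited threshold that the given identity satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


# Hypothetical example: a 20-nt reference and a variant with one mismatch.
reference = "ATGGACAAGAAGTACTCCAT"
variant = "ATGGACAAGAAGTACTCCAA"
identity = percent_identity(reference, variant)
```

In practice the alignment itself would come from a dedicated alignment tool; the sketch only shows how an identity value maps onto the recited ranges.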
[0031] The systems further include a guide RNA, which is bound to the modified Cas9 protein. The guide RNA includes a DNA binding sequence. The DNA binding sequence is complementary to a target sequence. The target sequence is at or adjacent to the gene being targeted for modulation (the target gene).
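As a rough illustration of what "complementary" means for the DNA binding sequence, the sketch below derives the RNA that would pair antiparallel with a DNA strand and checks a candidate guide against it. The 7-nucleotide target is a made-up example and the function names are assumptions; the sketch ignores PAM requirements, mismatch tolerance, and the guide scaffold.

```python
# Watson-Crick pairing of a DNA base with its RNA partner.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def complementary_rna(dna_target: str) -> str:
    """Return the RNA (written 5'->3') that pairs antiparallel with a DNA strand (5'->3')."""
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in reversed(dna_target.upper()))


def is_complementary(guide_rna: str, dna_target: str) -> bool:
    """True if the guide is the exact antiparallel complement of the target strand."""
    return guide_rna.upper() == complementary_rna(dna_target)


# Hypothetical target strand, chosen only for illustration.
target = "GATTACA"
guide = complementary_rna(target)  # "UGUAAUC"
```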
[0032] At least one effector domain is fused, directly bound to, or indirectly bound to the modified Cas9 protein, the guide RNA, or both. The effector domain can be an activator domain for increasing transcription of the target gene or a repressor domain for repressing transcription of the target gene. Example activator domains include, but are not limited to, VP64, p65, HSF1, and Rta. Example repressor domains include, but are not limited to, KRAB. In some embodiments, the guide RNA is operably linked to one or more effector domains via one or more aptamers. The aptamer can be, for example, MS2 or PP7. The effector domain can be fused to an aptamer-binding protein, such as, for example, an MS2-binding protein (MCP) or a PP7 coat protein.
[0033] In some embodiments, one or more effector domains can be fused to, directly bound to, or indirectly bound to the modified Cas9 protein. For example, any of SpCas9, SaCas9, and CjCas9 can be fused to, directly bound to, or indirectly bound to one or more transcription activator domains, such as, but not limited to, VP64, p65, HSF1, and Rta. Alternatively, any of SpCas9, SaCas9, and CjCas9 can be fused to, directly bound to, or indirectly bound to one or more transcription repressor domains, such as, but not limited to, KRAB.
[0034] AAV vector systems for in vivo delivery of the Cas9-based gene expression modulation system include first, second, and third expression cassettes located on one or more adeno-associated virus (AAV) vectors. The first expression cassette includes a polynucleotide encoding the modified Cas9 protein having the HNH domain deletion. As described above, the modified Cas9 protein is nuclease inactive, and can be derived from the Cas9 protein of many different species, including, but not limited to, Streptococcus pyogenes (SpCas9), Staphylococcus aureus (SaCas9), and Campylobacter jejuni (CjCas9). The second expression cassette includes a polynucleotide encoding the guide RNA that includes a DNA binding sequence. The third expression cassette includes a polynucleotide encoding an effector domain, such as, but not limited to, VP64, p65, HSF1, Rta (activators) or KRAB (repressor).
[0035] In some embodiments, the first, second, and third expression cassettes can be comprised on two AAV vectors. For example, the first expression cassette can be comprised on a first AAV vector, and the second expression cassette can be comprised on a second AAV vector. The third expression cassette, which encodes for the effector domain, can be comprised on the first AAV vector such that the translated effector domain is fused to, directly bound to, or indirectly bound to the modified Cas9 protein. Alternatively, the third expression cassette can be comprised on the second AAV vector such that the translated effector domain is bound to the guide RNA (for example, via an aptamer on the second expression cassette and an aptamer-binding protein on the third expression cassette). Potential nucleotide sequences for the first vector include SEQ ID NO: 7, SEQ ID NO: 8 and SEQ ID NO: 9. In some embodiments, potential nucleotide sequences of the first vector are at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 7, SEQ ID NO: 8 and/or SEQ ID NO: 9. Potential nucleotide sequences for the second vector include SEQ ID NO: 10 and SEQ ID NO: 11. In some embodiments, potential nucleotide sequences of the second vector are at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 10 and/or SEQ ID NO: 11.
[0036] In some embodiments, a fourth expression cassette can be included in the AAV vector system. The fourth expression cassette can include a polynucleotide that encodes for an additional effector (activator or repressor) domain. If the third expression cassette is comprised on either the first or second vector, the fourth expression cassette is comprised on the other of the first or second vector (such that both the first and second AAV vectors encode for effector domains).
[0037] In some embodiments, the first, second, and third expression cassettes are comprised on the same adeno-associated virus (AAV) vector. For example, potential nucleotide sequences for all-in-one vectors that include a third expression cassette encoding for an activator domain include SEQ ID NO: 16, SEQ ID NO: 17 and SEQ ID NO: 24. In some embodiments, potential nucleotide sequences for all-in-one vectors that include a third expression cassette encoding for an activator domain are at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 16, SEQ ID NO: 17 and/or SEQ ID NO: 24. A potential nucleotide sequence for an all-in-one vector that includes a third expression cassette encoding for a repressor domain includes SEQ ID NO: 18. In some embodiments, potential nucleotide sequences encoding for all-in-one vectors that include a third expression cassette encoding for a repressor domain are at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 18.
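The cassette arrangements described in the preceding paragraphs (three cassettes on one vector, or split across two vectors with an optional fourth cassette on the vector that does not carry the third) can be summarized in a small data-structure sketch. The class name, cassette labels, and function below are hypothetical names chosen for illustration; they do not correspond to any sequence or construct in the disclosure.

```python
from dataclasses import dataclass, field

# Hypothetical labels for the first, second, and third expression cassettes.
REQUIRED = {"modified_cas9", "guide_rna", "effector"}
# Hypothetical label for the optional fourth cassette (additional effector domain).
FOURTH = "effector_2"


@dataclass
class AAVVector:
    name: str
    cassettes: list = field(default_factory=list)


def check_system(vectors: list) -> str:
    """Classify a layout as 'all-in-one' or 'two-vector', raising on invalid layouts."""
    placed = {c for v in vectors for c in v.cassettes}
    missing = REQUIRED - placed
    if missing:
        raise ValueError(f"missing cassettes: {sorted(missing)}")
    if len(vectors) == 1:
        return "all-in-one"
    if len(vectors) != 2:
        raise ValueError("this sketch only models one- and two-vector layouts")
    if FOURTH in placed:
        # The fourth cassette sits on the vector that does not carry the third.
        for v in vectors:
            if "effector" in v.cassettes and FOURTH in v.cassettes:
                raise ValueError("third and fourth cassettes must be on different vectors")
    return "two-vector"


one_vector = [AAVVector("v1", ["modified_cas9", "guide_rna", "effector"])]
two_vector = [
    AAVVector("v1", ["modified_cas9", "effector"]),
    AAVVector("v2", ["guide_rna", FOURTH]),
]
```

The sketch deliberately does not require Cas9 and the guide RNA to be on different vectors, since the text presents that split only as an example.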
[0038] Further disclosed herein are methods of modulating gene expression. The methods include introducing to a cell a system for modulating gene expression. The system includes a modified Cas9 protein including an HNH domain deletion. As described above, the modified Cas9 protein is nuclease inactive, and can be derived from the Cas9 protein of a number of different species, including, but not limited to, Streptococcus pyogenes (SpCas9), Staphylococcus aureus (SaCas9), and Campylobacter jejuni (CjCas9). The system further includes a guide RNA bound to the modified Cas9 protein and including a DNA binding sequence, and an effector domain bound to the modified Cas9 protein or the guide RNA. The modified Cas9 protein can be fused to, directly bound to, or indirectly bound to the effector domain. The guide RNA may be operably linked to one or more effector domains (for example, via one or more aptamers). The effector domain can be an activator, such as, but not limited to, VP64, p65, HSF1, and/or Rta, or the effector domain can be a repressor, such as, but not limited to, KRAB.
[0039] The methods further include hybridizing the DNA binding sequence of the guide RNA to a target sequence and directing the effector domain to the target sequence. As used herein, "directing" means bringing the effector domain into the proximity of the target sequence so as to facilitate binding of the effector domain to the target sequence or to a region adjacent the target sequence and the target gene. The methods further include modulating the transcription of a target gene that is at or adjacent to the target sequence. Modulating the transcription of the target gene can include increasing expression levels of the target gene or decreasing expression levels of the target gene. In some embodiments, the system is comprised on one or more AAV vectors, as described above. Introducing the system to the cell includes infecting the cell with the AAV comprising the AAV vectors.
[0040] Isolated nucleic acids encoding for modified Cas9 proteins including an HNH domain deletion are also disclosed herein. As described above, the modified Cas9 protein is nuclease inactive, and can be derived from the Cas9 protein of a number of different species, including, but not limited to, Streptococcus pyogenes (SpCas9), Staphylococcus aureus (SaCas9), and Campylobacter jejuni (CjCas9). Potential nucleic acid sequences of the modified Cas9 are SEQ ID NO: 3, SEQ ID NO: 14, SEQ ID NO: 22, or nucleic acid sequences that are at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 3, SEQ ID NO: 14, or SEQ ID NO: 22. Potential nucleic acid sequences of the deleted HNH domain are SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, or nucleic acid sequences that are at least 50%, at least 60%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97.5%, or at least 99% identical to SEQ ID NO: 19, SEQ ID NO: 20, or SEQ ID NO: 21.
EXAMPLE
[0041] The CRISPR/Cas9 system has been harnessed for genome editing in the cells of various species. The most commonly used Cas9 proteins are from Streptococcus pyogenes (SpCas9) and Staphylococcus aureus (SaCas9). The RNA-guided nuclease Cas9 has also been reengineered as a programmable transcription factor to upregulate (activators) or down-regulate (repressors) the expression of a gene of interest. In these applications, the nuclease-inactive dSpCas9 or dSaCas9 can be directly fused with effector domains such as transcriptional activation domains (such as, but not limited to, VP64, p65, HSF1 or Rta) or transcriptional repression domains (such as, but not limited to, KRAB). Effector domains can also be fused with aptamer-binding proteins (e.g. MS2-binding protein or PP7 coat protein) that bind to aptamers (such as MS2 or PP7 aptamers) on the guide RNA scaffold, thereby bringing the effector domain to the target sequence. These constructs are highly efficient in turning on or off the target gene in cultured cells. However, their use in postnatal animals or potential clinical translation is impractical because of the large size of the dCas9 fusions, which exceed the packaging capacity of the adeno-associated virus (AAV) vectors used for such in vivo applications.
[0042] Disclosed herein are highly compact Cas9-based transcriptional factors (HC-Cas9TF), which can be packaged into AAV for in vivo applications in animals and clinical translation. The traditional approach to engineering a catalytically inactive form of SpCas9 or SaCas9 is to mutate two critical residues to alanine (e.g. D10A/H840A for dSpCas9, and D10A/N580A for dSaCas9). It has been shown that the deletion of the HNH domain from SpCas9 (dSpCas9ΔHNH) renders the enzyme inactive but still maintains the RNA-guided DNA binding activity. Herein, it is shown that dSpCas9ΔHNH can be used to activate or repress expression of a gene of interest when the gene-specific gRNA is provided. Deletion of the HNH domain from SaCas9 (dSaCas9ΔHNH) or Campylobacter jejuni Cas9 (dCjCas9ΔHNH) can also be used for gene modulation. Deletion of the HNH domain significantly reduces the size of the fusion constructs, which can thus be packaged into AAV vectors for in vivo applications.
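The size reduction for SpCas9 can be checked with simple arithmetic from figures stated elsewhere in this document: the sequence listing gives 4104 bp for full-length SpCas9 (SEQ ID NO: 1) and 3672 bp for dSpCas9ΔHNH (SEQ ID NO: 3), and the Figure 1 description gives 1368 and 1224 amino acids, respectively. The sketch below only converts coding length to residue count (three nucleotides per codon, no stop codon counted); the constant and function names are illustrative assumptions.

```python
FULL_LENGTH_SPCAS9_BP = 4104  # length stated for SEQ ID NO: 1
DELTA_HNH_SPCAS9_BP = 3672    # length stated for SEQ ID NO: 3


def coding_length_to_residues(bp: int) -> int:
    """Number of amino acids encoded by an in-frame coding sequence of `bp` nucleotides."""
    if bp % 3 != 0:
        raise ValueError("coding length must be a multiple of 3")
    return bp // 3


full_aa = coding_length_to_residues(FULL_LENGTH_SPCAS9_BP)   # 1368
delta_aa = coding_length_to_residues(DELTA_HNH_SPCAS9_BP)    # 1224
saved_bp = FULL_LENGTH_SPCAS9_BP - DELTA_HNH_SPCAS9_BP       # 432
saved_aa = full_aa - delta_aa                                # 144
```

The 432 bp (144 residue) net saving applies to the Cas9 coding sequence alone; whether a complete construct fits in an AAV vector also depends on the promoters, effector domains, and other elements it carries.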
[0043] Disclosed herein are several example systems that can be packaged into AAV for in vivo gene regulation: 1) a one-vector SaCas9-based transcription activation system, 2) a one-vector CjCas9-based transcription activation system, 3) a two-vector SaCas9-based transcription activation system, 4) a two-vector SpCas9-based transcription activation system, 5) a one-vector SaCas9-based transcription repression system, 6) a one-vector CjCas9-based transcription repression system, and 7) a two-vector SpCas9-based transcription repression system.
[0044] Figure 1 shows schematics and data for engineering a smaller version of a transcription activator based on SpCas9. A) Schematic of a three-component SpCas9-based transcription activator, termed SAM. The three components include the full-length dCas9 (D10A/H840A) fused with VP64 (Component-I), MS2 aptamer modified gRNA (Component-II), and MCP-VP64-p65-HSF1 (Component-III). B) Schematic map showing the construct to encode both Component-II and Component-III, which converts the three-component system to a two-component system. C) Domain structures of SpCas9. Full-length SpCas9 has 1368 amino acids, while the HNH-deleted SpCas9 has 1224 amino acids. D) Comparison of the transcription activation efficiency of different versions of the SAM systems for activating the follistatin (FST) gene in HEK293 cells. 3-SAM/Control = the 3-component SAM system reported by Konermann et al., 2015 (doi: 10.1038/nature14136) with control gRNA; 3-SAM/MS2gRNA = the 3-component SAM system reported by Konermann et al., 2015 with MS2-modified FST-specific gRNA; 2-SAM/MS2gRNA = the 2-component SAM system disclosed herein with MS2-modified FST-specific gRNA; 2-SAM/MS2ogRNA = the 2-component SAM system disclosed herein with MS2-modified FST-specific gRNA carrying an optimized gRNA scaffold, which was optimized by replacing a thymine residue with a cytosine residue. E) Comparison of the transcription activation efficiency of HNH deletion SpCas9-based 2-component transcription activators. ΔHNH-VP64/control = ΔHNH-VP64 fusion construct plus control gRNA; dCas9-VP64/ogRNA = dCas9-VP64 fusion construct plus optimized MS2-modified gRNA for FST; ΔHNH-VP64/ogRNA = ΔHNH-VP64 fusion construct plus optimized MS2-modified gRNA for FST; ΔHNH/ogRNA = ΔHNH construct plus optimized MS2-modified gRNA for FST; ΔHNHΔREC2/ogRNA = ΔHNH was further truncated to delete the REC2 domain. F) The transcription activator with the 2-component ΔHNH fused with or without VP64 is also highly active for the ASCL1 gene in HEK293 cells.
[0045] Figure 2 shows schematics and data for engineering a transcription repressor; ΔHNH fused with or without the KRAB domain efficiently knocks down the expression of GFP. A) Schematic of different transcription repressors using dCas9 or ΔHNH fused with or without the KRAB domain. B) Western blotting analysis of GFP expression using dCas9-KRAB, ΔHNH-KRAB or ΔHNH alone with control gRNA or two gRNAs targeting GFP. C) Quantitative analysis of GFP expression.
[0046] Figure 3 shows data from a qRT-PCR analysis of HBG1 expression in HEK293 cells transfected with different versions of transcription activators based on SpCas9 (SpΔHNH) or SaCas9 (SaΔHNH). Note that the SpΔHNH system has two vectors while the SaΔHNH system is in an all-in-one vector.
[0047] Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of skill in the art to which the disclosed invention belongs. Publications cited herein and the materials for which they are cited are specifically incorporated by reference.
[0048] Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. While the invention has been described with reference to particular embodiments and implementations, it will be understood that various changes and additional variations may be made and equivalents may be substituted for elements thereof without departing from the scope of the invention or the inventive concept thereof. In addition, many modifications may be made to adapt a particular situation or device to the teachings of the invention without departing from the essential scope thereof. Such equivalents are intended to be encompassed by the following claims. It is intended that the invention not be limited to the particular implementations disclosed herein, but that the invention will include all implementations falling within the scope of the appended claims.
SEQUENCES
SEQ ID NO: 1: cDNA sequence of full-length SpCas9 (4104bp), HNH domain in lowercase italics
ATGGACAAGAAGTACTCCATTGGGCTCGATATCGGCACAAACAGCGTCGGCTGGGCCGTCATTA CGGACGAGTACAAGGTGCCGAGCAAAAAATTCAAAGTTCTGGGCAATACCGATCGCCACAGCAT AAAGAAGAACCTCATTGGCGCCCTCCTGTTCGACTCCGGGGAGACaGCCGAAGCCACGCGGCTC AAAAGAACAGCACGGCGCAGATATACCCGCAGAAAGAATCGGATCTGCTACCTGCAGGAGATCT TTAGTAATGAGATGGCTAAGGTGGATGACTCTTTCTTCCATAGGCTGGAGGAGTCCTTTTTGGT GGAGGAGGATAAAAAGCACGAGCGCCACCCAATCTTTGGCAATATCGTGGACGAGGTGGCGTAC CATGAAAAGTACCCAACCATATATCATCTGAGGAAGAAGCTTGTAGACAGTACTGATAAGGCTG ACTTGCGGTTGATCTATCTCGCGCTGGCGCATATGATCAAATTTCGGGGACACTTCCTCATCGA GGGGGACCTGAACCCAGACAACAGCGATGTCGACAAACTCTTTATCCAACTGGTTCAGACTTAC AATCAGCTTTTCGAAGAGAACCCGATCAACGCATCCGGAGTTGACGCCAAAGCAATCCTGAGCG CTAGGCTGTCCAAATCCCGGCGGCTCGAAAACCTCATCGCACAGCTCCCTGGGGAGAAGAAGAA CGGCCTGTTTGGTAATCTTATCGCCCTGTCACTCGGGCTGACCCCCAACTTTAAATCTAACTTC GACCTGGCCGAAGATGCCAAGCTTCAACTGAGCAAAGACACCTACGATGATGATCTCGACAATC TGCTGGCCCAGATCGGCGACCAGTACGCAGACCTTTTTTTGGCGGCAAAGAACCTGTCAGACGC CATTCTGCTGAGTGATATTCTGCGAGTGAACACGGAGATCACCAAAGCTCCGCTGAGCGCTAGT ATGATCAAGCGCTATGATGAGCACCACCAAGACTTGACTTTGCTGAAGGCCCTTGTCAGACAGC AACTGCCTGAGAAGTACAAGGAAATTTTCTTCGATCAGTCTAAAAATGGCTACGCCGGATACAT TGACGGCGGAGCAAGCCAGGAGGAATTTTACAAATTTATTAAGCCCATCTTGGAAAAAATGGAC GGCACCGAGGAGCTGCTGGTAAAGCTTAACAGAGAAGATCTGTTGCGCAAACAGCGCACTTTCG ACAATGGAAGCATCCCCCACCAGATTCACCTGGGCGAACTGCACGCTATCCTCAGGCGGCAAGA GGATTTCTACCCCTTTTTGAAAGATAACAGGGAAAAGATTGAGAAAATCCTCACATTTCGGATA CCCTACTATGTAGGCCCCCTCGCCCGGGGAAATTCCAGATTCGCGTGGATGACTCGCAAATCAG AAGAGACCATCACTCCCTGGAACTTCGAGGAAGTCGTGGATAAGGGGGCCTCTGCCCAGTCCTT CATCGAAAGGATGACTAACTTTGATAAAAATCTGCCTAACGAAAAGGTGCTTCCTAAACACTCT CTGCTGTACGAGTACTTCACAGTTTATAACGAGCTCACCAAGGTCAAATACGTCACAGAAGGGA TGAGAAAGCCAGCATTCCTGTCTGGAGAGCAGAAGAAAGCTATCGTGGACCTCCTCTTCAAGAC GAACCGGAAAGTTACCGTGAAACAGCTCAAAGAAGACTATTTCAAAAAGATTGAATGTTTCGAC TCTGTTGAAATCAGCGGAGTGGAGGATCGCTTCAACGCATCCCTGGGAACGTATCACGATCTCC T GAAAAT CAT T AAAGAC AAG GAC T T CC TGGACAAT GAGGAGAACGAGGACAT T C T T GAGGACAT
TGTCCTCACCCTTACGTTGTTTGAAGATAGGGAGATGATTGAAGAACGCTTGAAAACTTACGCT CAT C T C T T C GAC GAC AAAG T CAT GAAACAG C T C AAGAG G C G C C GAT AT AC AGGAT GGGGGCGGC T G T C AAGAAAAC T GAT C AAT G G GAT C C GAGAC AAG C AGAG T G GAAAGAC AAT C C T G GAT T T T C T TAAGTCCGATGGATTTGCCAACCGGAACTTCATGCAGTTGATCCATGATGACTCTCTCACCTTT AAG GAG GAC AT C C AGAAAG C AC AAG TTTCTGGC C AG G G G GAC AG T C T T C AC GAG C AC AT C G C T A ATCTTGCAGGTAGCCCAGCTATCAAAAAGGGAATACTGCAGACCGTTAAGGTCGTGGATGAACT C G T C AAAG T AAT G G GAAG G CAT AAG C C C GAGAAT AT C G T T AT C GAGAT G G C C C GAGAGAAC C AA
actacccagaa ggga ca gaagaa ca gta ggga a a gga tgaaga gga ttgaaga gggta taaaag aactggggtcccaaatccttaaggaacacccagttgaaaacacccagcttcagaatgagaagct c ta cctgta c ta cctgca gaacggca ggga ca t gta cgtgga tea ggaactgga catcaa tegg ctctccgactacgacgtggatca tatcgtgccccagtcttttctcaaagatgattctattgata ataaagtgttgacaagatccgataaaaatagagggaagagtgataacgtcccctcagaagaagt tgtcaagaaaatgaaaaattattggcggcagctgctgaacgccaaactgatcacacaacggaag ttcgataatctgactaaggctgaacgaggtggcctgtctgagttggataaagccggcttcatca a a AG G C AG C T T G T T GAG AC AC G C C AG AT C AC C AAG CACGTGGCC C AAAT T C T C GAT TCACGCAT GAAC AC C AAG T AC GAT GAAAAT GAC AAAC T GAT T C GAGAG G T GAAAG T T AT T AC T C T GAAG T C T AAG C T G G T C T C AGAT T T C AGAAAG GAC T T T C AG T T T TAT AAG G T GAGAGAGAT C AAC AAT T AC C ACCATGCGCATGATGCCTACCTGAATGCAGTGGTAGGCACTGCACTTATCAAAAAATATCCCAA G C T T GAAT C T GAAT TTGTTTACG GAGAC TAT AAAG T G T AC GAT G T T AG GAAAAT GAT C G C AAAG T C T GAG C AG GAAAT AG G C AAG G C C AC C GC T AAG T AC T T C T T T T AC AG C AAT AT TAT GAAT T T T T T C AAGAC C GAGAT T AC AC T G G C C AAT G GAGAGAT T C G GAAG C GAC C AC T T AT C GAAAC AAAC G G AGAAACAGGAGAAATCGTGTGGGACAAGGGTAGGGATTTCGCGACAGTCCGGAAGGTCCTGTCC AT G C C G C AG G T GAAC AT C G T T AAAAAGAC C GAAG T AC AGAC C G GAG G C T T C T C C AAG GAAAG T A T C C T C C C GAAAAG GAAC AG C GAC AAG C T GAT C G C AC G C AAAAAAGAT T G G GAC C C C AAGAAAT A CGGCGGATTCGATTCTCCTACAGTCGCTTACAGTGTACTGGTTGTGGCCAAAGTGGAGAAAGGG AAGTCTAAAAAACTCAAAAGCGTCAAGGAACTGCTGGGCATCACAATCATGGAGCGATCAAGCT T C GAAAAAAAC C C C AT C GAC T T T C T C GAG G C GAAAG GAT AT AAAGAG G T C AAAAAAGAC C T C AT CATTAAGCTTCCCAAGTACTCTCTCTTTGAGCTTGAAAACGGCCGGAAACGAATGCTCGCTAGT GCGGGCGAGCTGCAGAAAGGTAACGAGCTGGCACTGCCCTCTAAATACGTTAATTTCTTGTATC TGGCCAGCCACTATGAAAAGCTCAAAGGGTCTCCCGAAGATAATGAGCAGAAGCAGCTGTTCGT G GAACAAC AC AAAC AC T AC C T T GAT GAGAT C AT C GAG C AAAT AAG C GAAT T C T C C AAAAGAG T G ATCCTCGCCGACGCTAACCTCGATAAGGTGCTTTCTGCTTACAATAAGCACAGGGATAAGCCCA 
TCAGGGAGCAGGCAGAAAACATTATCCACTTGTTTACTCTGACCAACTTGGGCGCGCCTGCAGC CTTCAAGTACTTCGACACCACCATAGACAGAAAGCGGTACACCTCTACAAAGGAGGTCCTGGAC GCCACACTGATTCATCAGTCAATTACGGGGCTCTATGAAACAAGAATCGACCTCTCTCAGCTCG GTGGAGAC
SEQ ID NO: 2: Amino-acid sequence of full-length SpCas9, HNH domain in bold italics
MDKKYS IGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHS IKKNLIGALLFDSGETAEATRL KRTARRRYTRRKNRICYLQEI FSNEMAKVDDSFFHRLEESFLVEEDKKHERHPI FGNIVDEVAY HEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTY NQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNF DLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRWTEITKAPLSAS MIKRYDEHHQDLTLLKALVRQQLPEKYKEI FFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMD GTEELLVKLNREDLLRKQRTFDNGS IPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRI PYYVGPLARGNSRFAWMTRKSEETITPWNFEEWDKGASAQSFIERMTNFDKNLPNEKVLPKHS LLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFD SVEISGVEDRFNASLGTYHDLLKI IKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYA HLFDDK\/MKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTF KEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKWDELVK\/MGRHKPENIVIEMARENQ TTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINR LSDYDVDHIVPQSFLKDDSIDNKVLTRSDKNRGKSD^
FDNL KAERGGLSELDKAGFIKRQLVE RQI KHVAQILDSRMNTKYDENDKLIREVKVITLKS KLVSDFRKDFQFYKVREINNYHHAHDAYLNAWGTALIKKYPKLESEFVYGDYKVYDVRKMIAK SEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLS MPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLWAKVEKG KSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLI IKLPKYSLFELENGRKRMLAS AGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRV ILADANLDKVLSAYNKHRDKPIREQAENI IHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLD ATLIHQS ITGLYETRIDLSQLGGD
SEQ ID NO: 3: cDNA sequence of dSpCas9ΔHNH (3672bp), linker replacing the HNH in bold lowercase italics
ATGGACAAGAAGTACTCCATTGGGCTCGATATCGGCACAAACAGCGTCGGCTGGGCCGTCATTA C G GAC GAG T AC AAG G T G C C GAG C AAAAAAT T C AAAG T T C T G G G C AAT AC C GAT C G C C AC AG CAT AAAGAAGAACCTCATTGGCGCCCTCCTGTTCGACTCCGGGGAGACaGCCGAAGCCACGCGGCTC AAAAGAAC AG C AC G G C G C AGAT AT AC C C GC AGAAAGAAT C G GAT C T G C T AC C T G C AG GAGAT C T TTAGTAATGAGATGGCTAAGGTGGATGACTCTTTCTTCCATAGGCTGGAGGAGTCCTTTTTGGT GGAGGAGGATAAAAAGCACGAGCGCCACCCAATCTTTGGCAATATCGTGGACGAGGTGGCGTAC CAT GAAAAG T AC C C AAC CAT AT AT CAT C T GAG GAAGAAG C T T G T AGAC AG T AC T GAT AAG G C T G ACTTGCGGTTGATCTATCTCGCGCTGGCGCATATGATCAAATTTCGGGGACACTTCCTCATCGA G G G G GAC C T GAAC C C AGAC AAC AG C GAT G T C GAC AAAC TCTTTATC C AAC T GG T T C AGAC T T AC AATCAGCTTTTCGAAGAGAACCCGATCAACGCATCCGGAGTTGACGCCAAAGCAATCCTGAGCG CTAGGCTGTCCAAATCCCGGCGGCTCGAAAACCTCATCGCACAGCTCCCTGGGGAGAAGAAGAA CGGCCTGTTTGGTAATCTTATCGCCCTGTCACTCGGGCTGACCCCCAACTTTAAATCTAACTTC GAC C T G G C C GAAGAT G C C AAG C T T C AAC T GAG C AAAGAC AC C T AC GAT GAT GAT C T C GAC AAT C TGCTGGCCCAGATCGGCGACCAGTACGCAGACCTTTTTTTGGCGGCAAAGAACCTGTCAGACGC CATTCTGCTGAGTGATATTCTGCGAGTGAACACGGAGATCACCAAAGCTCCGCTGAGCGCTAGT AT GAT C AAG C G C T AT GAT GAG C AC C AC C AAGAC T T GAC T T T G C T GAAG GCCCTTGT C AGAC AG C AAC T GC C T GAGAAG T AC AAG GAAAT T T T C T T C GAT C AG T C T AAAAAT G G C T AC G C C G GAT AC AT T GAC GG C G GAG C AAG C C AG GAG GAAT T T T AC AAAT T T AT T AAG C C CAT C T T GGAAAAAAT G GAC GGCACCGAGGAGCTGCTGGTAAAGCTTAACAGAGAAGATCTGTTGCGCAAACAGCGCACTTTCG ACAATGGAAGCATCCCCCACCAGATTCACCTGGGCGAACTGCACGCTATCCTCAGGCGGCAAGA GGATTTCTACCCCTTTTT GAAAGAT AAC AG G GAAAAGAT T GAGAAAAT C C T CAC AT T T C G GAT A CCCTACTATGTAGGCCCCCTCGCCCGGGGAAATTCCAGATTCGCGTGGATGACTCGCAAATCAG AAGAGACCATCACTCCCTGGAACTTCGAGGAAGTCGTGGATAAGGGGGCCTCTGCCCAGTCCTT C AT C GAAAG GAT GAC T AAC T T T GAT AAAAAT C T G C C T AAC GAAAAG G T G C T T C C T AAAC AC T C T C T G C T G T AC GAG T AC T T CAC AG T T TAT AAC GAG C T CAC C AAG G T C AAAT AC G T C AC AGAAG G GA 
TGAGAAAGCCAGCATTCCTGTCTGGAGAGCAGAAGAAAGCTATCGTGGACCTCCTCTTCAAGAC GAAC C G GAAAG T T AC C G T GAAAC AG C T C AAAGAAGAC T AT T T C AAAAAGAT T GAAT G T T T C GAC TCTGTTGAAATCAGCGGAGTGGAGGATCGCTTCAACGCATCCCTGGGAACGTATCACGATCTCC T GAAAAT CAT T AAAGAC AAG GAC T T CC T GGACAAT GAGGAGAACGAGGACAT T C T T GAGGACAT TGTCCTCACCCTTACGTTGTTTGAAGATAGGGAGATGATTGAAGAACGCTTGAAAACTTACGCT CATCTCTTCGACGACAAAGTCATGAAACAGCTCAAGAGGCGCCGATATACAGGATGGGGGCGGC TGTCAAGAAAACTGATCAATGGGATCCGAGACAAGCAGAGTGGAAAGACAATCCTGGATTTTCT TAAGTCCGATGGATTTGCCAACCGGAACTTCATGCAGTTGATCCATGATGACTCTCTCACCTTT AAGGAGGACATCCAGAAAGCACAAGTTTCTGGCCAGGGGGACAGTCTTCACGAGCACATCGCTA ATCTTGCAGGTAGCCCAGCTATCAAAAAGGGAATACTGCAGACCGTTAAGGTCGTGGATGAACT CGTCAAAGTAATGGGAAGGCATAAGCCCGAGAATATCGTTATCGAGATGGCCCGAGAGAACCAA
ggaggtagcggtggctctAGGCAGCTTGTTGAGACACGCCAGATCACCAAGCACGTGGCCCAAA TTCTCGATTCACGCATGAACACCAAGTACGATGAAAATGACAAACTGATTCGAGAGGTGAAAGT TATTACTCTGAAGTCTAAGCTGGTCTCAGATTTCAGAAAGGACTTTCAGTTTTATAAGGTGAGA GAGATCAACAATTACCACCATGCGCATGATGCCTACCTGAATGCAGTGGTAGGCACTGCACTTA TCAAAAAATATCCCAAGCTTGAATCTGAATTTGTTTACGGAGACTATAAAGTGTACGATGTTAG GAAAATGATCGCAAAGTCTGAGCAGGAAATAGGCAAGGCCACCGCTAAGTACTTCTTTTACAGC AATATTATGAATTTTTTCAAGACCGAGATTACACTGGCCAATGGAGAGATTCGGAAGCGACCAC TTATCGAAACAAACGGAGAAACAGGAGAAATCGTGTGGGACAAGGGTAGGGATTTCGCGACAGT CCGGAAGGTCCTGTCCATGCCGCAGGTGAACATCGTTAAAAAGACCGAAGTACAGACCGGAGGC TTCTCCAAGGAAAGTATCCTCCCGAAAAGGAACAGCGACAAGCTGATCGCACGCAAAAAAGATT GGGACCCCAAGAAATACGGCGGATTCGATTCTCCTACAGTCGCTTACAGTGTACTGGTTGTGGC CAAAGTGGAGAAAGGGAAGTCTAAAAAACTCAAAAGCGTCAAGGAACTGCTGGGCATCACAATC ATGGAGCGATCAAGCTTCGAAAAAAACCCCATCGACTTTCTCGAGGCGAAAGGATATAAAGAGG TCAAAAAAGACCTCATCATTAAGCTTCCCAAGTACTCTCTCTTTGAGCTTGAAAACGGCCGGAA ACGAATGCTCGCTAGTGCGGGCGAGCTGCAGAAAGGTAACGAGCTGGCACTGCCCTCTAAATAC GTTAATTTCTTGTATCTGGCCAGCCACTATGAAAAGCTCAAAGGGTCTCCCGAAGATAATGAGC AGAAGCAGCTGTTCGTGGAACAACACAAACACTACCTTGATGAGATCATCGAGCAAATAAGCGA ATTCTCCAAAAGAGTGATCCTCGCCGACGCTAACCTCGATAAGGTGCTTTCTGCTTACAATAAG CACAGGGATAAGCCCATCAGGGAGCAGGCAGAAAACATTATCCACTTGTTTACTCTGACCAACT TGGGCGCGCCTGCAGCCTTCAAGTACTTCGACACCACCATAGACAGAAAGCGGTACACCTCTAC AAAGGAGGTCCTGGACGCCACACTGATTCATCAGTCAATTACGGGGCTCTATGAAACAAGAATC GACCTCTCTCAGCTCGGTGGAGAC
SEQ ID NO: 4: Amino-acid sequence of dSpCas9ΔHNH (1224 aa), linker replacing the HNH in bold italics
MDKKYS IGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHS IKKNLIGALLFDSGETAEATRL KRTARRRYTRRKNRICYLQEI FSNEMAKVDDSFFHRLEESFLVEEDKKHERHPI FGNIVDEVAY HEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTY NQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNF DLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRWTEITKAPLSAS MIKRYDEHHQDLTLLKALVRQQLPEKYKEI FFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMD GTEELLVKLNREDLLRKQRTFDNGS IPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRI PYYVGPLARGNSRFAWMTRKSEETITPWNFEEWDKGASAQSFIERMTNFDKNLPNEKVLPKHS LLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFD SVEISGVEDRFNASLGTYHDLLKI IKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYA HLFDDK\/MKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTF KEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKWDELVK\/MGRHKPENIVIEMARENQ GGSGGSRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVR EINNYHHAHDAYLNAWGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYS NIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQWIVKKTEVQTGG FSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLWAKVEKGKSKKLKSVKELLGITI MERSSFEKNPIDFLEAKGYKEVKKDLI IKLPKYSLFELENGRKRMLASAGELQKGNELALPSKY VNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEI IEQISEFSKRVILADANLDKVLSAYNK HRDKPIREQAENI IHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQS ITGLYETRI DLSQLGGD
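As a consistency check on the two dSpCas9ΔHNH listings above, the 18-nucleotide lowercase linker in SEQ ID NO: 3 (ggaggtagcggtggctct) can be translated and compared with the GGSGGS linker visible in SEQ ID NO: 4. The sketch below uses only the five standard-code codons that occur in the linker; the function names and the toy splice example are hypothetical, and the size of the removed region is inferred from the stated lengths rather than read from the garbled listing.

```python
# Subset of the standard genetic code covering the linker codons only.
CODONS = {"GGA": "G", "GGT": "G", "GGC": "G", "AGC": "S", "TCT": "S"}

LINKER_DNA = "ggaggtagcggtggctct"  # lowercase linker at the HNH position in SEQ ID NO: 3


def translate(dna: str, table: dict) -> str:
    """Translate an in-frame DNA string using the supplied codon table."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("length must be a multiple of 3")
    return "".join(table[dna[i:i + 3]] for i in range(0, len(dna), 3))


def replace_segment(full: str, start: int, end: int, linker: str) -> str:
    """Replace full[start:end] with the linker (0-based, end-exclusive coordinates)."""
    return full[:start] + linker + full[end:]


linker_peptide = translate(LINKER_DNA, CODONS)  # "GGSGGS"

# Size of the replaced region implied by the stated lengths (4104 bp and 3672 bp).
removed_bp = 4104 - 3672 + len(LINKER_DNA)  # 450
removed_aa = removed_bp // 3                # 150

# Toy splice on a made-up 12-nt string, for illustration only.
toy = replace_segment("AAACCCGGGTTT", 3, 9, "gga")  # "AAAggaTTT"
```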
SEQ ID NO: 5: cDNA sequence of full-length CjCas9 (2955bp), HNH domain encoding sequence in bold italics:
ATGGCCCGCATCCTCGCTTTCGACATCGGAATCTCTAGTATCGGATGGGCCTTCTCTGAAAACG ACGAACTGAAAGACTGCGGCGTGAGAATCTTCACAAAGGTTGAAAACCCTAAAACAGGCGAGTC TTTAGCTCTGCCACGTAGGTTGGCCCGCTCCGCCCGAAAAAGGCTGGCTCGGCGGAAGGCTCGC CTCAACCACTTGAAGCATTTGATAGCTAATGAGTTCAAACTGAACTACGAAGATTACCAGTCCT TCGACGAGTCATTGGCAAAAGCCTACAAAGGCAGCCTTATCAGTCCTTATGAGTTGAGATTTCG CGCACTCAACGAACTGCTTTCTAAGCAAGACTTTGCTAGGGTCATTCTGCACATCGCAAAACGG C GAG G T T AT GAC GAT AT C AAGAAC T C C GAC GAT AAAGAAAAG G GAG C CAT T C T C AAG G C GAT C A AAC AGAAT GAG GAAAAAT T G G C AAAC T AC C AGAG T G T G G G C GAG T AT C T G T AT AAAGAG T AT T T CCAGAAGT T T AAG GAAAAC AG C AAG GAG T T T ACAAAC G T CAGAAAT AAAAAGGAG T C T T AC GAG AGAT GC AT C G C G C AG T CAT T C C T C AAAGAT GAG C T GAAG C T GAT AT T T AAGAAG C AAC G C GAAT T TGGTT TCTCAT TCTCTAAGAAGT TCGAAGAGGAGGT TCT T TCCGTGGCGT TT TACAAGAGGGC GCTCAAAGACT TCTCCCACCTGGT TGGTAACTGTAGT T TCT TCACGGATGAGAAGCGAGCTCCC AAAAAT TCTCCCCTGGCT T TCATGT T TGT TGCCCTGACTCGGATCAT TAACCTGCTGAACAACC T GAAAAAT AC T GAAG G GAT C T T G TAT AC GAAG GAC GAC C T AAAT G C AC T C C T GAAT GAAG T G C T C AAAAAC G GAAC T C T AAC C TAT AAAC AGAC C AAGAAAT T AC TGGGGCTCTCT GAC GAC T AC GAG T T C AAG G G C GAGAAG G G T AC T T AT T T TAT C GAAT T C AAAAAG TAT AAG GAG T T CAT T AAAG CAT T G G G GGAAC AC AAC C T C AG C C AG GAC GAT C T C AAT GAAAT T G C C AAG GAC AT C AC G C T GAT T AA AGAC GAGAT AAAAC T G AAAAAG G C AC T GG C C AAG TAT GAC C T C AAC C AGAAC C AGAT C GAC T C T CTGTCCAAGCTGGAGT TCAAAGACCACCTAAACATATCCT TCAAAGCCCTGAAACTGGTCACCC CTCTAATGCTCGAAGGAAAAAAATACGACGAGGCGTGTAATGAACTGAATCTTAAGGTGGCCAT C AAT GAG GAT AAGAAG GAC T T T C T T C C AG C C T T T AAC GAGAC AT AT T AC AAAGAC GAG G T C AC A AACCCGGT TGTGCTGAGGGCCATAAAAGAGTATCGGAAGGT TCTGAATGCCCTCCTGAAGAAGT AC G G CAAAG T G C AC AAAAT AAAT AT C GAAT T G G C TAG G GAG G T G GGGAAGAACCA TTCTCAGCG AGCAAAGATCGAGAAAGAGCAGAATGAGAACTACAAAGCCAAGAAAGACGCCGAACTGGAGTGC GAAAAGCTGGGGCTTAAAA TAAACAGTAAAAACA TCCTGAAA TTAAGA TTGTTCAAAGAGCAAA 
AGGAGTTTTGCGCCTACTCAGGGGAAAAAA TCAAAA TA TCAGACCTGCAGGACGAGAAAA TGCT GGAGATCGACCACA TCTA TCCGTA TAGCAGGTCA TTTGACGA TTCCTACA TGAACAAAGTGCTT GTGTTTACCAAACAGAACCAAGAAAAGCTGAACCAAACCCCCTTTGAGGCTTTCGGAAACGACT CAGCCAAGTGGCAGAAAATCGAAGTCCTAGCCAAGAATCTGCCTACAAAAAAACAAAAGAGGAT TCTTGATAAGAACTATAAGGACAAGGAACAGAAAAACTTTAAAGACAGGAACCT! GAAT GACACG AGGTACAT TGCGCGACTGGT TCTAAACTATACCAAAGACTACCTGGAT T TCCTCCCTCTGAGCG AC GAC GAGAAT AC T AAAC T GAAT GAT AC C C AGAAAG G C T CAAAG G T C C AC G T T GAG G C T AAG T C CGGGATGCTGACTAGCGCCCTCCGCCACACGTGGGGCT TCAGCGCCAAAGATCGGAATAATCAT C T T CAT C AC G C T AT T GAT G C AG T AAT CAT AG C C T AC G C T AAC AAC AG CAT C G T GAAAG C C T T C T C C GAT T T C AAGAAAGAAC AG GAG T C T AAT AG C G C C GAG T T G T AC G C C AAGAAAAT T T C C GAAT T GGACTATAAAAATAAGAGAAAAT TCT TCGAACCCT TCTCCGGGT T TCGCCAAAAGGTCT TAGAT AAG AT C GAC GAGAT T T TCGT T TC C AAG C C C GAAAG AAAAAAG CCT TCAGGGGCACTGCAC GAAG AGAC AT T C C G C AAG GAAGAG GAAT T T T AC C AAT C T T AC G G T G G T AAAGAG G GAG T T C T GAAG G C TCTGGAGCT TGGGAAGATCCGCAAGGTAAACGGGAAAATCGTGAAAAACGGGGACATGT TCAGG G T G GAT AT C T T CAAG C AC AAAAAGAC C AAC AAG T T C T AC G C AG T AC C C AT C T AC AC T AT G GAT T TCGCTTTAAAGGTTCTCCCAAATAAGGCGGTGGCTCGATCGAAGAAAGGAGAGATCAAGGACTG GAT C T TAAT GGAT GAAAAT TACGAGT T T T GC T T C T CGC T C TACAAAGATAGCC T GAT T C T GAT C C AGACAAAAGAC AT G C AG GAAC CAGAAT TTGTTTAT TAT AAC G C C T T C AC GAG C AG T AC AG T G T C C C T GAT T G T GAG CAAG CAT GAT AAC AAG T T C GAGAC T C T G T C TAAGAAT CAGAAAAT C C T T T T CAAGAACGCCAACGAGAAGGAGGTCATCGCAAAGTCAATTGGCATCCAAAACCTGAAGGTGTTC GAGAAAT AC AT AG T G T C C G C AC T C G G T GAAG T AAC T AAAG C C GAAT T T C GACAG CGC GAG GAT T TTAAGAAAAGC
SEQ ID NO: 6: Amino-acid sequence of full-length dCjCas9ΔHNH (841 aa), linker replacing the HNH domain in bold italics
MARILAFAIGISSIGWAFSENDELKDCGVRIFTKVENPKTGESLALPRRLARSARKRLARRKAR LNHLKHLIANEFKLNYEDYQSFDESLAKAYKGSLISPYELRFRALNELLSKQDFARVILHIAKR RGYDDIKNSDDKEKGAILKAIKQNEEKLANYQSVGEYLYKEYFQKFKENSKEFTNVRNKKESYE RCIAQSFLKDELKLIFKKQREFGFSFSKKFEEEVLSVAFYKRALKDFSHLVGNCSFFTDEKRAP KNSPLAFMFVALTRIINLLNNLKNTEGILYTKDDLNALLNEVLKNGTLTYKQTKKLLGLSDDYE FKGEKGTYFIEFKKYKEFIKALGEHNLSQDDLNEIAKDITLIKDEIKLKKALAKYDLNQNQIDS LSKLEFKDHLNISFKALKLVTPLMLEGKKYDEACNELNLKVAINEDKKDFLPAFNETYYKDEVT NPVVLRAIKEYRKVLNALLKKYGKVHKINIELAREVGGSGGSRNLNDTRYIARLVLNYTKDYLD FLPLSDDENTKLNDTQKGSKVHVEAKSGMLTSALRHTWGFSAKDRNNHLHHAIDAVIIAYANNS IVKAFSDFKKEQESNSAELYAKKISELDYKNKRKFFEPFSGFRQKVLDKIDEIFVSKPERKKPS GALHEETFRKEEEFYQSYGGKEGVLKALELGKIRKVNGKIVKNGDMFRVDIFKHKKTNKFYAVP IYTMDFALKVLPNKAVARSKKGEIKDWILMDENYEFCFSLYKDSLILIQTKDMQEPEFVYYNAF TSSTVSLIVSKHDNKFETLSKNQKILFKNANEKEVIAKSIGIQNLKVFEKYIVSALGEVTKAEF RQREDFKKS
SEQ ID NO: 7: AAV Vector 1, example 1 (ITR-promoter-NLS-dSpCas9ΔHNH-NLS-minipA-ITR), ITR sequences in bold, NLS sequence in bold italics, dSpCas9ΔHNH sequence underlined with linker replacing HNH domain in underlined lowercase, minipA sequence in grey text
CCTGCAGGCAGCTGCGCGCTCGCTCGCTCACTGAGGCCGCCCGGGCAAAGCCCGGGCGTCGGGC GACCTTTGGTCGCCCGGCCTCAGTGAGCGAGCGAGCGCGCAGAGAGGGAGTGGCCAACTCCATC
ACTAGGGGTTCCTACGCGTGATGAGAGCAGCCACTACGGGTCTAGGCTGCCCATGTAAGGAGGC
AAGGCCTGGGGACACCCGAGATGCCTGGTTATAATTAACCCAGACATGTGGCTGCCCCCCCCCC CCCAACACCTGCTGCCTGCTAAAAATAACCCTGTCCCTGGTGGccctgcatgcccACTCACGGG GATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGA CTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTACGGTGG GAGGTCTATATAAGCAGAGCTGGTTTAGTGAACCGTCAGATCCGCTAGCCACCA TGga cA TCGA TTACAAGGA TGACGA TGACAAGA TGGCCCCCAAGAAGAAGAGGAAGGTGGGCA TTCACGGGNYG GACAAGAAGTACTCCATTGGGCTCGATATCGGCACAAACAGCGTCGGCTGGGCCGTCATTACGG ACGAGTACAAGGTGCCGAGCAAAAAATTCAAAGTTCTGGGCAATACCGATCGCCACAGCATAAA GAAGAACCTCATTGGCGCCCTCCTGTTCGACTCCGGGGAGACaGCCGAAGCCACGCGGCTCAAA AGAACAGCACGGCGCAGATATACCCGCAGAAAGAATCGGATCTGCTACCTGCAGGAGATCTTTA GTAATGAGATGGCTAAGGTGGATGACTCTTTCTTCCATAGGCTGGAGGAGTCCTTTTTGGTGGA GGAGGATAAAAAGCACGAGCGCCACCCAATCTTTGGCAATATCGTGGACGAGGTGGCGTACCAT GAAAAGTACCCAACCATATATCATCTGAGGAAGAAGCTTGTAGACAGTACTGATAAGGCTGACT TGCGGTTGATCTATCTCGCGCTGGCGCATATGATCAAATTTCGGGGACACTTCCTCATCGAGGG GGACCTGAACCCAGACAACAGCGATGTCGACAAACTCTTTATCCAACTGGTTCAGACTTACAAT CAGCTTTTCGAAGAGAACCCGATCAACGCATCCGGAGTTGACGCCAAAGCAATCCTGAGCGCTA GGCTGTCCAAATCCCGGCGGCTCGAAAACCTCATCGCACAGCTCCCTGGGGAGAAGAAGAACGG CCTGTTTGGTAATCTTATCGCCCTGTCACTCGGGCTGACCCCCAACTTTAAATCTAACTTCGAC CTGGCCGAAGATGCCAAGCTTCAACTGAGCAAAGACACCTACGATGATGATCTCGACAATCTGC TGGCCCAGATCGGCGACCAGTACGCAGACCTTTTTTTGGCGGCAAAGAACCTGTCAGACGCCAT TCTGCTGAGTGATATTCTGCGAGTGAACACGGAGATCACCAAAGCTCCGCTGAGCGCTAGTATG ATCAAGCGCTATGATGAGCACCACCAAGACTTGACTTTGCTGAAGGCCCTTGTCAGACAGCAAC TGCCTGAGAAGTACAAGGAAATTTTCTTCGATCAGTCTAAAAATGGCTACGCCGGATACATTGA CGGCGGAGCAAGCCAGGAGGAATTTTACAAATTTATTAAGCCCATCTTGGAAAAAATGGACGGC ACCGAGGAGCTGCTGGTAAAGCTTAACAGAGAAGATCTGTTGCGCAAACAGCGCACTTTCGACA ATGGAAGCATCCCCCACCAGATTCACCTGGGCGAACTGCACGCTATCCTCAGGCGGCAAGAGGA
TTTCTACCCCTTTTTGAAAGATAACAGGGAAAAGATTGAGAAAATCCTCACATTTCGGATACCC TACTATGTAGGCCCCCTCGCCCGGGGAAATTCCAGATTCGCGTGGATGACTCGCAAATCAGAAG AGACCATCACTCCCTGGAACTTCGAGGAAGTCGTGGATAAGGGGGCCTCTGCCCAGTCCTTCAT CGAAAGGATGACTAACTTTGATAAAAATCTGCCTAACGAAAAGGTGCTTCCTAAACACTCTCTG CTGTACGAGTACTTCACAGTTTATAACGAGCTCACCAAGGTCAAATACGTCACAGAAGGGATGA GAAAGCCAGCATTCCTGTCTGGAGAGCAGAAGAAAGCTATCGTGGACCTCCTCTTCAAGACGAA CCGGAAAGTTACCGTGAAACAGCTCAAAGAAGACTATTTCAAAAAGATTGAATGTTTCGACTCT GTTGAAATCAGCGGAGTGGAGGATCGCTTCAACGCATCCCTGGGAACGTATCACGATCTCCTGA AAATCATTAAAGACAAGGACTTCCTGGACAATGAGGAGAACGAGGACATTCTTGAGGACATTGT CCTCACCCTTACGTTGTTTGAAGATAGGGAGATGATTGAAGAACGCTTGAAAACTTACGCTCAT CTCTTCGACGACAAAGTCATGAAACAGCTCAAGAGGCGCCGATATACAGGATGGGGGCGGCTGT CAAGAAAACTGATCAATGGGATCCGAGACAAGCAGAGTGGAAAGACAATCCTGGATTTTCTTAA GTCCGATGGATTTGCCAACCGGAACTTCATGCAGTTGATCCATGATGACTCTCTCACCTTTAAG GAGGACATCCAGAAAGCACAAGTTTCTGGCCAGGGGGACAGTCTTCACGAGCACATCGCTAATC TTGCAGGTAGCCCAGCTATCAAAAAGGGAATACTGCAGACCGTTAAGGTCGTGGATGAACTCGT CAAAGTAATGGGAAGGCATAAGCCCGAGAATATCGTTATCGAGATGGCCCGAGAGAACCAAgga ggtagcggtggctctAGGCAGCTTGTTGAGACACGCCAGATCACCAAGCACGTGGCCCAAATTC TCGATTCACGCATGAACACCAAGTACGATGAAAATGACAAACTGATTCGAGAGGTGAAAGTTAT TACTCTGAAGTCTAAGCTGGTCTCAGATTTCAGAAAGGACTTTCAGTTTTATAAGGTGAGAGAG ATCAACAATTACCACCATGCGCATGATGCCTACCTGAATGCAGTGGTAGGCACTGCACTTATCA AAAAATATCCCAAGCTTGAATCTGAATTTGTTTACGGAGACTATAAAGTGTACGATGTTAGGAA AATGATCGCAAAGTCTGAGCAGGAAATAGGCAAGGCCACCGCTAAGTACTTCTTTTACAGCAAT ATTATGAATTTTTTCAAGACCGAGATTACACTGGCCAATGGAGAGATTCGGAAGCGACCACTTA TCGAAACAAACGGAGAAACAGGAGAAATCGTGTGGGACAAGGGTAGGGATTTCGCGACAGTCCG GAAGGTCCTGTCCATGCCGCAGGTGAACATCGTTAAAAAGACCGAAGTACAGACCGGAGGCTTC TCCAAGGAAAGTATCCTCCCGAAAAGGAACAGCGACAAGCTGATCGCACGCAAAAAAGATTGGG ACCCCAAGAAATACGGCGGATTCGATTCTCCTACAGTCGCTTACAGTGTACTGGTTGTGGCCAA AGTGGAGAAAGGGAAGTCTAAAAAACTCAAAAGCGTCAAGGAACTGCTGGGCATCACAATCATG GAGCGATCAAGCTTCGAAAAAAACCCCATCGACTTTCTCGAGGCGAAAGGATATAAAGAGGTCA AAAAAGACCTCATCATTAAGCTTCCCAAGTACTCTCTCTTTGAGCTTGAAAACGGCCGGAAACG 
AATGCTCGCTAGTGCGGGCGAGCTGCAGAAAGGTAACGAGCTGGCACTGCCCTCTAAATACGTT AATTTCTTGTATCTGGCCAGCCACTATGAAAAGCTCAAAGGGTCTCCCGAAGATAATGAGCAGA AGCAGCTGTTCGTGGAACAACACAAACACTACCTTGATGAGATCATCGAGCAAATAAGCGAATT
CTCCAAAAGAGTGATCCTCGCCGACGCTAACCTCGATAAGGTGCTTTCTGCTTACAATAAGCAC AGGGATAAGCCCATCAGGGAGCAGGCAGAAAACATTATCCACTTGTTTACTCTGACCAACTTGG GCGCGCCTGCAGCCTTCAAGTACTTCGACACCACCATAGACAGAAAGCGGTACACCTCTACAAA GGAGGTCCTGGACGCCACACTGATTCATCAGTCAATTACGGGGCTCTATGAAACAAGAATCGAC CTCTCTCAGCTCGGTGGAGACagcagggctgaccccaagaagaagaggaaggtgrtgaAAGGGTT CGATCCCTAccggTAATAAAaGATCtTTATTTTCATTaGATCtGTGTGTTGGTTTTTTGTGTGc GGCCGCAGGAACCCCTAGTGATGGAGTTGGCCACTCCCTCTCTGCGCGCTCGCTCGCTCACTGA GGCCGGGCGACCAAAGGTCGCCCGACGCCCGGGCTTTGCCCGGGCGGCCTCAGTGAGCGAGCGA GCGCGCAGCTGCCTGCAGG
SEQ ID NO: 8: AAV Vector 1, example 2 (ITR-promoter-NLS-dSpCas9ΔHNH-VP64-NLS-minipA-ITR), ITR sequences in bold, NLS sequence in bold italics, dSpCas9ΔHNH sequence underlined with linker replacing HNH domain in underlined lowercase, minipA sequence in grey text, VP64 sequence in grey highlight
CCTGCAGGCAGCTGCGCGCTCGCTCGCTCACTGAGGCCGCCCGGGCAAAGCCCGGGCGTCGGGC GACC T T TGGTCGCCCGGCC TCAGTGAGCGAGCGAGCGCGCAGAGAGGGAGTGGCCAAC TCCATC
ACTAGGGGTTCCTACGCGTGATGAGAGCAGCCACTACGGGTCTAGGCTGCCCATGTAAGGAGGC AAGGCCTGGGGACACCCGAGATGCCTGGTTATAATTAACCCAGACATGTGGCTGCCCCCCCCCC CCCAACACCTGCTGCCTGCTAAAAATAACCCTGTCCCTGGTGGccctgcatgcccACTCACGGG GATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGA CTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTACGGTGG GAG G T C TAT AT AAG C AGAG C T G G T T TAG T GAAC C G T C AGAT C C G C TAG C C AC C A TGga cA TCGA TTACAAGGA TGACGA TGACAAGA TGGCCCCCAAGAAGAAGAGGAAGGTGGGCA TTCACGGGNY G GACAAGAAGTACTCCATTGGGCTCGATATCGGCACAAACAGCGTCGGCTGGGCCGTCATTACGG AC GAG T AC AAG G T G C C GAG C AAAAAAT T C AAAG T T C T G G G C AAT AC C GAT C G C C AC AG CAT AAA GAAGAACCTCATTGGCGCCCTCCTGTTCGACTCCGGGGAGACaGCCGAAGCCACGCGGCTCAAA AGAAC AG C AC G G C G C AGAT AT AC C C G C AGAAAGAAT C G GAT C T G C T AC C T G C AG GAGAT C T T T A GTAATGAGATGGCTAAGGTGGATGACTCTTTCTTCCATAGGCTGGAGGAGTCCTTTTTGGTGGA GGAGGATAAAAAGCACGAGCGCCACCCAATCTTTGGCAATATCGTGGACGAGGTGGCGTACCAT GAAAAG T AC C C AAC CAT AT AT CAT C T GAG GAAGAAG C T T G T AGAC AG T AC T GAT AAG G C T GAC T TGCGGTTGATCTATCTCGCGCTGGCGCATATGATCAAATTTCGGGGACACTTCCTCATCGAGGG GGACCTGAACCCAGACAACAGCGATGTCGACAAACTCTTTATCCAACTGGTTCAGACTTACAAT
CAGCTTTTCGAAGAGAACCCGATCAACGCATCCGGAGTTGACGCCAAAGCAATCCTGAGCGCTA GGCTGTCCAAATCCCGGCGGCTCGAAAACCTCATCGCACAGCTCCCTGGGGAGAAGAAGAACGG CCTGTTTGGTAATCTTATCGCCCTGTCACTCGGGCTGACCCCCAACTTTAAATCTAACTTCGAC CTGGCCGAAGATGCCAAGCTTCAACTGAGCAAAGACACCTACGATGATGATCTCGACAATCTGC TGGCCCAGATCGGCGACCAGTACGCAGACCTTTTTTTGGCGGCAAAGAACCTGTCAGACGCCAT TCTGCTGAGTGATATTCTGCGAGTGAACACGGAGATCACCAAAGCTCCGCTGAGCGCTAGTATG ATCAAGCGCTATGATGAGCACCACCAAGACTTGACTTTGCTGAAGGCCCTTGTCAGACAGCAAC TGCCTGAGAAGTACAAGGAAATTTTCTTCGATCAGTCTAAAAATGGCTACGCCGGATACATTGA CGGCGGAGCAAGCCAGGAGGAATTTTACAAATTTATTAAGCCCATCTTGGAAAAAATGGACGGC ACCGAGGAGCTGCTGGTAAAGCTTAACAGAGAAGATCTGTTGCGCAAACAGCGCACTTTCGACA ATGGAAGCATCCCCCACCAGATTCACCTGGGCGAACTGCACGCTATCCTCAGGCGGCAAGAGGA TTTCTACCCCTTTTTGAAAGATAACAGGGAAAAGATTGAGAAAATCCTCACATTTCGGATACCC TACTATGTAGGCCCCCTCGCCCGGGGAAATTCCAGATTCGCGTGGATGACTCGCAAATCAGAAG AGACCATCACTCCCTGGAACTTCGAGGAAGTCGTGGATAAGGGGGCCTCTGCCCAGTCCTTCAT CGAAAGGATGACTAACTTTGATAAAAATCTGCCTAACGAAAAGGTGCTTCCTAAACACTCTCTG CTGTACGAGTACTTCACAGTTTATAACGAGCTCACCAAGGTCAAATACGTCACAGAAGGGATGA GAAAGCCAGCATTCCTGTCTGGAGAGCAGAAGAAAGCTATCGTGGACCTCCTCTTCAAGACGAA CCGGAAAGTTACCGTGAAACAGCTCAAAGAAGACTATTTCAAAAAGATTGAATGTTTCGACTCT GTTGAAATCAGCGGAGTGGAGGATCGCTTCAACGCATCCCTGGGAACGTATCACGATCTCCTGA AAATCATTAAAGACAAGGACTTCCTGGACAATGAGGAGAACGAGGACATTCTTGAGGACATTGT CCTCACCCTTACGTTGTTTGAAGATAGGGAGATGATTGAAGAACGCTTGAAAACTTACGCTCAT CTCTTCGACGACAAAGTCATGAAACAGCTCAAGAGGCGCCGATATACAGGATGGGGGCGGCTGT CAAGAAAACTGATCAATGGGATCCGAGACAAGCAGAGTGGAAAGACAATCCTGGATTTTCTTAA GTCCGATGGATTTGCCAACCGGAACTTCATGCAGTTGATCCATGATGACTCTCTCACCTTTAAG GAGGACATCCAGAAAGCACAAGTTTCTGGCCAGGGGGACAGTCTTCACGAGCACATCGCTAATC TTGCAGGTAGCCCAGCTATCAAAAAGGGAATACTGCAGACCGTTAAGGTCGTGGATGAACTCGT CAAAGTAATGGGAAGGCATAAGCCCGAGAATATCGTTATCGAGATGGCCCGAGAGAACCAAgga ggtagcggtggctctAGGCAGCTTGTTGAGACACGCCAGATCACCAAGCACGTGGCCCAAATTC TCGATTCACGCATGAACACCAAGTACGATGAAAATGACAAACTGATTCGAGAGGTGAAAGTTAT TACTCTGAAGTCTAAGCTGGTCTCAGATTTCAGAAAGGACTTTCAGTTTTATAAGGTGAGAGAG 
ATCAACAATTACCACCATGCGCATGATGCCTACCTGAATGCAGTGGTAGGCACTGCACTTATCA AAAAATATCCCAAGCTTGAATCTGAATTTGTTTACGGAGACTATAAAGTGTACGATGTTAGGAA AAT GAT C G C AAAG T C T GAG C AG GAAAT AG G C AAG G C C AC C G C T AAG T AC T T C T T T T AC AG C AAT
AT TATGAAT T T T T TCAAGACCGAGAT TACACTGGCCAATGGAGAGAT TCGGAAGCGACCACT TA T C GAAAC AAAC G GAGAAAC AG GAGAAAT C G T G T G G GAC AAG G G TAG G GAT T T C G C GAC AG T C C G GAAGGTCCTGTCCATGCCGCAGGTGAACATCGT TAAAAAGACCGAAGTACAGACCGGAGGCT TC TCCAAGGAAAGTATCCTCCCGAAAAGGAACAGCGACAAGCTGATCGCACGCAAAAAAGAT TGGG ACCCCAAGAAATACGGCGGAT TCGAT TCTCCTACAGTCGCT TACAGTGTACTGGT TGTGGCCAA AGTGGAGAAAGGGAAGTCTAAAAAACTCAAAAGCGTCAAGGAACTGCTGGGCATCACAATCATG GAG C GAT C AAG C T T C GAAAAAAAC C C CAT C GAC T T T C T C GAG G C GAAAG GAT AT AAAGAG G T C A AAAAAGACCTCATCAT TAAGCT TCCCAAGTACTCTCTCT T TGAGCT TGAAAACGGCCGGAAACG AATGCTCGCTAGTGCGGGCGAGCTGCAGAAAGGTAACGAGCTGGCACTGCCCTCTAAATACGT T AAT T TCT TGTATCTGGCCAGCCACTATGAAAAGCTCAAAGGGTCTCCCGAAGATAATGAGCAGA AG C AG CTGT TCGTG GAAC AAC AC AAAC AC T AC C T T GAT GAGAT CAT C GAG C AAAT AAG C GAAT T CTCCAAAAGAGTGATCCTCGCCGACGCTAACCTCGATAAGGTGCT T TCTGCT TACAATAAGCAC AG G GAT AAG C C CAT C AG G GAG C AG G C AGAAAAC AT T AT C C AC T T G T T T AC T C T GAC C AAC T T G G GCGCGCCTG C AG C C T T C AAG T AC T T C GAC AC C AC C AT AGAC AGAAAG C G G T AC AC C T C T AC AAA G GAG G T C C T G GAC G C C AC AC T GAT T CAT C AG T C AAT T AC GGGGCTCTAT GAAAC AAGAAT C GAC CTCTCTCAGCTCGGTGGAGACa g c g c t GGAGGAGGTGGAAGCGGAGGAGGAGGAAGCGGAGGAG GAGGTAGCGGACCTAAGAAAAAGAGGAAGGTGGCGGCCGCTq g a t c c GGAC GGGC TGAC GC AT T
t a a Ac C g g T AAT AAAa GATCt T TAT T T TCAT T aGATCtGTGTGT TGGT T T T T TGTGTGcGGCCGCAGGAACCCCTAGTGATGGAGTTGGCCACTCCCTCTCTGCGCGC TCGCTCGCTCACTGAGGCCGGGCGACCAAAGGTCGCCCGACGCCCGGGCTTTGCCCGGGCGGCC TCAGTGAGCGAGCGAGCGCGCAGC TGCC TGCAGG
SEQ ID NO: 9: AAV Vector 1, example 3 (ITR-promoter-NLS-dSpCas9ΔHNH-KRAB-NLS-minipA-ITR), ITR sequences in bold, NLS sequence in bold italics, dSpCas9ΔHNH sequence underlined with linker replacing HNH domain in underlined lowercase, minipA sequence in grey text, KRAB sequence in grey highlight
CCTGCAGGCAGCTGCGCGCTCGCTCGCTCACTGAGGCCGCCCGGGCAAAGCCCGGGCGTCGGGC GACC T T TGGTCGCCCGGCC TCAGTGAGCGAGCGAGCGCGCAGAGAGGGAGTGGCCAAC TCCATC
ACTAGGGGTTCCTACGCGTGATGAGAGCAGCCACTACGGGTCTAGGCTGCCCATGTAAGGAGGC AAGGCCTGGGGACACCCGAGATGCCTGGTTATAATTAACCCAGACATGTGGCTGCCCCCCCCCC CCCAACACCTGCTGCCTGCTAAAAATAACCCTGTCCCTGGTGGccctgcatgcccACTCACGGG GATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGA CTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTACGGTGG GAGGTCTATATAAGCAGAGCTGGTTTAGTGAACCGTCAGATCCGCTAGCCACC ATGgacATCGA TTACAAGGATGACGATGACAAGATGGCCCCCAAGAAGAAGAGGAAGGTGGGCATTCACGGGNYG GACAAGAAGTACTCCATTGGGCTCGATATCGGCACAAACAGCGTCGGCTGGGCCGTCATTACGG ACGAGTACAAGGTGCCGAGCAAAAAATTCAAAGTTCTGGGCAATACCGATCGCCACAGCATAAA GAAGAACCTCATTGGCGCCCTCCTGTTCGACTCCGGGGAGACaGCCGAAGCCACGCGGCTCAAA AGAACAGCACGGCGCAGATATACCCGCAGAAAGAATCGGATCTGCTACCTGCAGGAGATCTTTA GTAATGAGATGGCTAAGGTGGATGACTCTTTCTTCCATAGGCTGGAGGAGTCCTTTTTGGTGGA GGAGGATAAAAAGCACGAGCGCCACCCAATCTTTGGCAATATCGTGGACGAGGTGGCGTACCAT GAAAAGTACCCAACCATATATCATCTGAGGAAGAAGCTTGTAGACAGTACTGATAAGGCTGACT TGCGGTTGATCTATCTCGCGCTGGCGCATATGATCAAATTTCGGGGACACTTCCTCATCGAGGG GGACCTGAACCCAGACAACAGCGATGTCGACAAACTCTTTATCCAACTGGTTCAGACTTACAAT CAGCTTTTCGAAGAGAACCCGATCAACGCATCCGGAGTTGACGCCAAAGCAATCCTGAGCGCTA GGCTGTCCAAATCCCGGCGGCTCGAAAACCTCATCGCACAGCTCCCTGGGGAGAAGAAGAACGG CCTGTTTGGTAATCTTATCGCCCTGTCACTCGGGCTGACCCCCAACTTTAAATCTAACTTCGAC CTGGCCGAAGATGCCAAGCTTCAACTGAGCAAAGACACCTACGATGATGATCTCGACAATCTGC TGGCCCAGATCGGCGACCAGTACGCAGACCTTTTTTTGGCGGCAAAGAACCTGTCAGACGCCAT TCTGCTGAGTGATATTCTGCGAGTGAACACGGAGATCACCAAAGCTCCGCTGAGCGCTAGTATG ATCAAGCGCTATGATGAGCACCACCAAGACTTGACTTTGCTGAAGGCCCTTGTCAGACAGCAAC TGCCTGAGAAGTACAAGGAAATTTTCTTCGATCAGTCTAAAAATGGCTACGCCGGATACATTGA CGGCGGAGCAAGCCAGGAGGAATTTTACAAATTTATTAAGCCCATCTTGGAAAAAATGGACGGC ACCGAGGAGCTGCTGGTAAAGCTTAACAGAGAAGATCTGTTGCGCAAACAGCGCACTTTCGACA ATGGAAGCATCCCCCACCAGATTCACCTGGGCGAACTGCACGCTATCCTCAGGCGGCAAGAGGA TTTCTACCCCTTTTTGAAAGATAACAGGGAAAAGATTGAGAAAATCCTCACATTTCGGATACCC TACTATGTAGGCCCCCTCGCCCGGGGAAATTCCAGATTCGCGTGGATGACTCGCAAATCAGAAG AGACCATCACTCCCTGGAACTTCGAGGAAGTCGTGGATAAGGGGGCCTCTGCCCAGTCCTTCAT 
CGAAAGGATGACTAACTTTGATAAAAATCTGCCTAACGAAAAGGTGCTTCCTAAACACTCTCTG CTGTACGAGTACTTCACAGTTTATAACGAGCTCACCAAGGTCAAATACGTCACAGAAGGGATGA GAAAGCCAGCATTCCTGTCTGGAGAGCAGAAGAAAGCTATCGTGGACCTCCTCTTCAAGACGAA CCGGAAAGTTACCGTGAAACAGCTCAAAGAAGACTATTTCAAAAAGATTGAATGTTTCGACTCT GTTGAAATCAGCGGAGTGGAGGATCGCTTCAACGCATCCCTGGGAACGTATCACGATCTCCTGA
AAATCATTAAAGACAAGGACTTCCTGGACAATGAGGAGAACGAGGACATTCTTGAGGACATTGT CCTCACCCTTACGTTGTTTGAAGATAGGGAGATGATTGAAGAACGCTTGAAAACTTACGCTCAT CTCTTCGACGACAAAGTCATGAAACAGCTCAAGAGGCGCCGATATACAGGATGGGGGCGGCTGT CAAGAAAACTGATCAATGGGATCCGAGACAAGCAGAGTGGAAAGACAATCCTGGATTTTCTTAA GTCCGATGGATTTGCCAACCGGAACTTCATGCAGTTGATCCATGATGACTCTCTCACCTTTAAG GAGGACATCCAGAAAGCACAAGTTTCTGGCCAGGGGGACAGTCTTCACGAGCACATCGCTAATC TTGCAGGTAGCCCAGCTATCAAAAAGGGAATACTGCAGACCGTTAAGGTCGTGGATGAACTCGT CAAAGTAATGGGAAGGCATAAGCCCGAGAATATCGTTATCGAGATGGCCCGAGAGAACCAAgga ggtagcggtggctctAGGCAGCTTGTTGAGACACGCCAGATCACCAAGCACGTGGCCCAAATTC TCGATTCACGCATGAACACCAAGTACGATGAAAATGACAAACTGATTCGAGAGGTGAAAGTTAT TACTCTGAAGTCTAAGCTGGTCTCAGATTTCAGAAAGGACTTTCAGTTTTATAAGGTGAGAGAG ATCAACAATTACCACCATGCGCATGATGCCTACCTGAATGCAGTGGTAGGCACTGCACTTATCA AAAAATATCCCAAGCTTGAATCTGAATTTGTTTACGGAGACTATAAAGTGTACGATGTTAGGAA AATGATCGCAAAGTCTGAGCAGGAAATAGGCAAGGCCACCGCTAAGTACTTCTTTTACAGCAAT ATTATGAATTTTTTCAAGACCGAGATTACACTGGCCAATGGAGAGATTCGGAAGCGACCACTTA TCGAAACAAACGGAGAAACAGGAGAAATCGTGTGGGACAAGGGTAGGGATTTCGCGACAGTCCG GAAGGTCCTGTCCATGCCGCAGGTGAACATCGTTAAAAAGACCGAAGTACAGACCGGAGGCTTC TCCAAGGAAAGTATCCTCCCGAAAAGGAACAGCGACAAGCTGATCGCACGCAAAAAAGATTGGG ACCCCAAGAAATACGGCGGATTCGATTCTCCTACAGTCGCTTACAGTGTACTGGTTGTGGCCAA AGTGGAGAAAGGGAAGTCTAAAAAACTCAAAAGCGTCAAGGAACTGCTGGGCATCACAATCATG GAGCGATCAAGCTTCGAAAAAAACCCCATCGACTTTCTCGAGGCGAAAGGATATAAAGAGGTCA AAAAAGACCTCATCATTAAGCTTCCCAAGTACTCTCTCTTTGAGCTTGAAAACGGCCGGAAACG AATGCTCGCTAGTGCGGGCGAGCTGCAGAAAGGTAACGAGCTGGCACTGCCCTCTAAATACGTT AATTTCTTGTATCTGGCCAGCCACTATGAAAAGCTCAAAGGGTCTCCCGAAGATAATGAGCAGA AGCAGCTGTTCGTGGAACAACACAAACACTACCTTGATGAGATCATCGAGCAAATAAGCGAATT CTCCAAAAGAGTGATCCTCGCCGACGCTAACCTCGATAAGGTGCTTTCTGCTTACAATAAGCAC AGGGATAAGCCCATCAGGGAGCAGGCAGAAAACATTATCCACTTGTTTACTCTGACCAACTTGG GCGCGCCTGCAGCCTTCAAGTACTTCGACACCACCATAGACAGAAAGCGGTACACCTCTACAAA GGAGGTCCTGGACGCCACACTGATTCATCAGTCAATTACGGGGCTCTATGAAACAAGAATCGAC CTCTCTCAGCTCGGTGGAGACagcgctGGAGGAGGTGGAAGCGGAGGAGGAGGAAGCGGAGGAG 
GAGGTAGCGGACCTAAGAAAAAGAGGAAGGTGGCGGCCGCTqgatecGATGCTAAGTCACTGAC TTGTGTGcGGCCGCAGGAACCCCTAGTGATGGAGTTGGCCACTCCCTCTCTGCGCGCTCGCTCG CTCACTGAGGCCGGGCGACCAAAGGTCGCCCGACGCCCGGGCTTTGCCCGGGCGGCCTCAGTGA GCGAGCGAGCGCGCAGCTGCCTGCAGG
SEQ ID NO: 10: AAV Vector 2, example 1 (ITR-promoter-MCP-VP64-p65-HSF1-polyA-U6-MS2ogRNA), ITR sequences in bold, MCP in grey highlight, VP64-p65-HSF1 sequence in bold italics, polyA sequence lowercase underlined, MS2ogRNA sequence in grey text, green fluorescent protein sequence uppercase underlined, NLS in underlined italics
cctgcaggcagctgcgcgctcgctcgctcactgaggccgcccgggcgtcgggcgacctttggtc gcccggcctcagtgagcgagcgagcgcgcagagagggagtggccaactccatcactaggggttc ctgcggcctctagactcgaggcgttgacattgattattgactagttattaatagtaatcaatta cggggtcattagttcatagcccatatatggagttccgcgttacataacttacggtaaatggccc gcctggctgaccgcccaacgacccccgcccattgacgtcaataatgacgtatgttcccatagta acgccaatagggactttccattgacgtcaatgggtggagtatttacggtaaactgcccacttgg cagtacatcaagtgtatcatatgccaagtacgccccctattgacgtcaatgacggtaaatggcc cgcctggcattatgcccagtacatgaccttatgggactttcctacttggcagtacatctacgta ttagtcatcgctattaccatggtgatgcggttttggcagtacatcaatgggcgtggatagcggt ttgactcacggggatttccaagtctccaccccattgacgtcaatgggagtttgttttggcacca aaatcaacgggactttccaaaatgtcgtaacaactccgccccattgacgcaaatgggcggtagg cgtgtacggtgggaggtctatataagcagagctctctggctaactaccggtcgccaccATGGTG AGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAA ACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCT GAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACC TACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCG CCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGAC CCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGAC TTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCT ATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGA GGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTG CTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGC
GCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCT GTACAAGcaatgtactaactacgctttgttgaaactcgctggcgatgttgaaagtaaccccggt cctgctagc [MCP sequence (grey highlight) not legible in source]
||||l|||il||||l||ii;agcgctGGAGGAGGTGGAAGCGGAGGAGGAGGAAGCGGAGGAGGAGGTA GCggacctaagaaaaagaggaaggtggccgctgga tecGGACGGGCTGACGCATTGGACGATTT TGA CTGGA A GCTGGGAAGTGACGCCCTCGATGATTTTGACCTTGACATGCTTGGTTCGGAT GCCCTTGATGACTTTGACCTCGACATGCTCGGCAGTGACGCCCTTGA GA TTCGACCTGGACA TGCTGATTAACGGCTCTGGCAGCGGCAGCCCTTCAGGGCAGA CAGCAACCAGGCCCTGGCTCT GGCCCCTAGCTCCGCTCCAGTGCTGGCCCAGACTATGGTGCCCTCTAGTGCTATGGTGCCTCTG GCCCAGCCACCTGCTCCAGCCCCTGTGCTGACCCCAGGACCACCCCAGTCACTGAGCGCTCCAG TGCCCAAGTCTACACAGGCCGGCGAGGGGACTCTGAGTGAAGCTCTGCTGCACCTGCAGTTCGA CGCTGATGAGGACCTGGGAGCTCTGCTGGGGAACAGCACCGATCCCGGAGTGTTCACAGATCTG GCCTCCGTGGACAACTCTGAGTTTCAGCAGCTGCTGAATCAGGGCGTGTCCATGTCTCATAGTA CAGCCGAACCAATGCTGATGGAGTACCCCGAAGCCATTACCCGGCTGGTGACCGGCAGCCAGCG GCCCCCCGACCCCGCTCCAACTCCCCTGGGAACCAGCGGCCTGCCTAATGGGCTGTCCGGAGAT GAAGACTTCTCAAGCATCGCTGATATGGACTTTAGTGCCCTGCTGTCACAGATTTCCTCTAGTG GGCAGGGAGGAGG GGAAGCGGCTTCAGCGTGGACACCAGTGCCCTGCTGGACCTGTTCAGCCC CTCGGTGACCGTGCCCGACATGAGCCTGCCTGACCTTGACAGCAGCCTGGCCAGTATCCAAGAG CTCCTGTCTCCCCAGGAGCCCCCCAGGCCTCCCGAGGCAGAGAACAGCAGCCCGGATTCAGGGA AGCAGCTGGTGCACTACACAGCGCAGCCGCTGTTCCTGCTGGACCCCGGCTCCGTGGACACCGG GAGCAACGACCTGCCGGTGCTGTTTGAGCTGGGAGAGGGCTCCTACTTCTCCGAAGGGGACGGC TTCGCCGAGGACCCCACCATCTCCCTGCTGACAGGCTCGGAGCCTCCCAAAGCCAAGGACCCCA CTGTCTCCAAqaa11cctagagctcgctgatcagcctcgactgtgccttctagttgccagcca tctgttgtttgcccctcccccgtgccttccttgaccctggaaggtgccactcccactgtccttt cctaataaaatgaggaaattgcatcgcattgtctgagtaggtgtcattctattctggggggtgg ggtggggcaggacagcaagggggaggattgggaagagaatagcaggcatgctggggaggtaccG AGGGCCTATTTCCCATGATTCCTTCATATTTGCATATACGATACAAGGCTGTTAGAGAGATAAT TGGAATTAATTTGACTGTAAACACAAAGATATTAGTACAAAATACGTGACGTAGAAAGTAATAA
TTTCTTGGGTAGTTTGCAGTTTTAAAATTATGTTTTAAAATGGACTATCATATGCTTACCGTAA CTTGAAAGTATTTCGATTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACCGtgagacc gtcgacggtctcaGTTTcAGAGCTAtgctgAACATGAGGATCACCCATGTCTGCAGcagcaTAG CAAGTTgAAATAAGGCTAGTCCGTTATCAACTTggccAACATGAGGATCACCCATGTCTGCAGg gccAAGTGGCACCGAGTCGGTGCTTTTTTTcgtacgaagaagcggccgcaggaacccctagtga tggagttggccactccctctctgcgcgctcgctcgctcactgaggccgggcgaccaaaggtcgc ccgacgcccgggctttgcccgggcggcctcagtgagcgagcgagcgcgcagctgcctgcagg
SEQ ID NO: 11: AAV Vector 2, example 2 (ITR-promoter-MCP-KRAB-polyA-U6-MS2ogRNA), ITR sequences in bold, MCP in grey highlight, KRAB sequence in bold italics, polyA sequence underlined, MS2ogRNA sequence in grey text, green fluorescent protein sequence uppercase underlined, NLS in underlined italics
cctgcaggcagctgcgcgctcgctcgctcactgaggccgcccgggcgtcgggcgacctttggtc gcccggcctcagtgagcgagcgagcgcgcagagagggagtggccaactccatcactaggggttc ctgcggcctctagactcgaggcgttgacattgattattgactagttattaatagtaatcaatta cggggtcattagttcatagcccatatatggagttccgcgttacataacttacggtaaatggccc gcctggctgaccgcccaacgacccccgcccattgacgtcaataatgacgtatgttcccatagta acgccaatagggactttccattgacgtcaatgggtggagtatttacggtaaactgcccacttgg cagtacatcaagtgtatcatatgccaagtacgccccctattgacgtcaatgacggtaaatggcc cgcctggcattatgcccagtacatgaccttatgggactttcctacttggcagtacatctacgta ttagtcatcgctattaccatggtgatgcggttttggcagtacatcaatgggcgtggatagcggt ttgactcacggggatttccaagtctccaccccattgacgtcaatgggagtttgttttggcacca aaatcaacgggactttccaaaatgtcgtaacaactccgccccattgacgcaaatgggcggtagg cgtgtacggtgggaggtctatataagcagagctctctggctaactaccggtcgccaccATGGTG AGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAA ACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCT GAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACC TACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCG CCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGAC CCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGAC TTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCT ATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGA
GGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTG CTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGC GCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCT GTACAAGcaatgtactaactacgctttgttgaaactcgctggcgatgttgaaagtaaccccggt cctgctagc [MCP sequence (grey highlight) not legible in source]
agcgctGGAGGAGGTGGAAGCGGAGGAGGAGGAAGCGGAGGAGGAGGTA GCggacctaagaaaaagaggaaggtggccgctggatccGATGCTAAGTCACTGACTGCCTGGTC CCGGACACTGGTGACCTTCAAGGATGTGTTTGTGGACTTCACCAGGGAGGAGTGGAAGCTGCTG GACACTGCTCAGCAGATCCTGTACAGAAATGTGATGCTGGAGAACTATAAGAACCTGGTTTCCT TGGGTTATCAGCTTACTAAGCCAGATGTGATCCTCCGGTTGGAGAAGGGAGAAGAGCCCTGGCT GGTGggacgtacgtccaaacgcacggctgatggttccgaatttgagtctccgaagaagaaacgc aaagttgagTAAgaattcctagagctcgctgatcagcctcgactgtgccttctagttgccagcc atctgttgtttgcccctcccccgtgccttccttgaccctggaaggtgccactcccactgtcctt tcctaataaaatgaggaaattgcatcgcattgtctgagtaggtgtcattctattctggggggtg gggtggggcaggacagcaagggggaggattgggaagagaatagcaggcatgctggggaggtacc GAGGGCCTATTTCCCATGATTCCTTCATATTTGCATATACGATACAAGGCTGTTAGAGAGATAA TTGGAATTAATTTGACTGTAAACACAAAGATATTAGTACAAAATACGTGACGTAGAAAGTAATA ATTTCTTGGGTAGTTTGCAGTTTTAAAATTATGTTTTAAAATGGACTATCATATGCTTACCGTA ACTTGAAAGTATTTCGATTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACCGtgagac cgtcgacggtctcaGTTTcAGAGCTAtgctgAACATGAGGATCACCCATGTCTGCAGcagcaTA GCAAGTTgAAATAAGGCTAGTCCGTTATCAACTTggccAACATGAGGATCACCCATGTCTGCAG ggccAAGTGGCACCGAGTCGGTGCTTTTTTTcgtacgaagaagcggccgcaggaacccctagtg atggagttggccactccctctctgcgcgctcgctcgctcactgaggccgggcgaccaaaggtcg cccgacgcccgggctttgcccgggcggcctcagtgagcgagcgagcgcgcagctgcctgcagg
SEQ ID NO: 12: cDNA sequence of full-length SaCas9 (3159bp), HNH domain encoding sequence in bold
ATGAAGCGGAACTACATCCTGGGCCTGGACATCGGCATCACCAGCGTGGGCTACGGCATCATCG ACTACGAGACACGGGACGTGATCGATGCCGGCGTGCGGCTGTTCAAAGAGGCCAACGTGGAAAA CAACGAGGGCAGGCGGAGCAAGAGAGGCGCCAGAAGGCTGAAGCGGCGGAGGCGGCATAGAATC CAGAGAGTGAAGAAGCTGCTGTTCGACTACAACCTGCTGACCGACCACAGCGAGCTGAGCGGCA TCAACCCCTACGAGGCCAGAGTGAAGGGCCTGAGCCAGAAGCTGAGCGAGGAAGAGTTCTCTGC CGCCCTGCTGCACCTGGCCAAGAGAAGAGGCGTGCACAACGTGAACGAGGTGGAAGAGGACACC GGCAACGAGCTGTCCACCAAAGAGCAGATCAGCCGGAACAGCAAGGCCCTGGAAGAGAAATACG TGGCCGAACTGCAGCTGGAACGGCTGAAGAAAGACGGCGAAGTGCGGGGCAGCATCAACAGATT CAAGACCAGCGACTACGTGAAAGAAGCCAAACAGCTGCTGAAGGTGCAGAAGGCCTACCACCAG CTGGACCAGAGCTTCATCGACACCTACATCGACCTGCTGGAAACCCGGCGGACCTACTATGAGG GACCTGGCGAGGGCAGCCCCTTCGGCTGGAAGGACATCAAAGAATGGTACGAGATGCTGATGGG CCACTGCACCTACTTCCCCGAGGAACTGCGGAGCGTGAAGTACGCCTACAACGCCGACCTGTAC AACGCCCTGAACGACCTGAACAATCTCGTGATCACCAGGGACGAGAACGAGAAGCTGGAATATT ACGAGAAGTTCCAGATCATCGAGAACGTGTTCAAGCAGAAGAAGAAGCCCACCCTGAAGCAGAT CGCCAAAGAAATCCTCGTGAACGAAGAGGATATTAAGGGCTACAGAGTGACCAGCACCGGCAAG CCCGAGTTCACCAACCTGAAGGTGTACCACGACATCAAGGACATTACCGCCCGGAAAGAGATTA TTGAGAACGCCGAGCTGCTGGATCAGATTGCCAAGATCCTGACCATCTACCAGAGCAGCGAGGA CATCCAGGAAGAACTGACCAATCTGAACTCCGAGCTGACCCAGGAAGAGATCGAGCAGATCTCT AATCTGAAGGGCTATACCGGCACCCACAACCTGAGCCTGAAGGCCATCAACCTGATCCTGGACG AGCTGTGGCACACCAACGACAACCAGATCGCTATCTTCAACCGGCTGAAGCTGGTGCCCAAGAA GGTGGACCTGTCCCAGCAGAAAGAGATCCCCACCACCCTGGTGGACGACTTCATCCTGAGCCCC GTCGTGAAGAGAAGCTTCATCCAGAGCATCAAAGTGATCAACGCCATCATCAAGAAGTACGGCC TGCCCAACGACATCATTATCGAGCTGGCCCGCGAGAAGAACTCCAAGGACGCCCAGAAAATGAT CAACGAGATGCAGAAGCGGAACgcaaaGACCAACGAGCGGATCGAGGAAATCATCCGGACCACC GGCAAAGAGAACGCCAAGTACC TGATCGAGAAGATCAAGC TGCACGACATGCAGGAAGGCAAGT GCC TGTACAGCC TGGAAGCCATCCC TC TGGAAGATC TGC TGAACAACCCC T TCAAC TATGAGGT GGACCACATCATCCCCAGAAGCGTGTCCT TCGACAACAGC T TCAACAACAAGGTGC TCGTGAAG CAGGAAGAAAACAGCAAGAAGGGCAACCGGACCCCATTCCAGTACCTGAGCAGCAGCGACAGCA AGATCAGCTACGAAACCTTCAAGAAGCACATCCTGAATCTGGCCAAGGGCAAGGGCAGAATCAG CAAGACCAAGAAAGAGTATCTGCTGGAAGAACGGGACATCAACAGGTTCTCCGTGCAGAAAGAC 
TTCATCAACCGGAACCTGGTGGATACCAGATACGCCACCAGAGGCCTGATGAACCTGCTGCGGA
GCTACTTCAGAGTGAACAACCTGGACGTGAAAGTGAAGTCCATCAATGGCGGCTTCACCAGCTT TCTGCGGCGGAAGTGGAAGTTTAAGAAAGAGCGGAACAAGGGGTACAAGCACCACGCCGAGGAC GCCCTGATCATTGCCAACGCCGATTTCATCTTCAAAGAGTGGAAGAAACTGGACAAGGCCAAAA AAGTGATGGAAAACCAGATGTTCGAGGAAAAGCAGGCCGAGAGCATGCCCGAGATCGAAACCGA GCAGGAGTACAAAGAGATCTTCATCACCCCCCACCAGATCAAGCACATTAAGGACTTCAAGGAC TACAAGTACAGCCACCGGGTGGACAAGAAGCCTAATAGAGAGCTGATTAACGACACCCTGTACT CCACCCGGAAGGACGACAAGGGCAACACCCTGATCGTGAACAATCTGAACGGCCTGTACGACAA GGACAATGACAAGCTGAAAAAGCTGATCAACAAGAGCCCCGAAAAGCTGCTGATGTACCACCAC GACCCCCAGACCTACCAGAAACTGAAGCTGATTATGGAACAGTACGGCGACGAGAAGAATCCCC TGTACAAGTACTACGAGGAAACCGGGAACTACCTGACCAAGTACTCCAAAAAGGACAACGGCCC CGTGATCAAGAAGATTAAGTATTACGGCAACAAACTGAACGCCCATCTGGACATCACCGACGAC TACCCCAACAGCAGAAACAAGGTCGTGAAGCTGTCCCTGAAGCCCTACAGATTCGACGTGTACC TGGACAATGGCGTGTACAAGTTCGTGACCGTGAAGAATCTGGATGTGATCAAAAAAGAAAACTA CTACGAAGTGAATAGCAAGTGCTATGAGGAAGCTAAGAAGCTGAAGAAGATCAGCAACCAGGCC GAGTTTATCGCCTCCTTCTACAACAACGATCTGATCAAGATCAACGGCGAGCTGTATAGAGTGA TCGGCGTGAACAACGACCTGCTGAACCGGATCGAAGTGAACATGATCGACATCACCTACCGCGA GTACCTGGAAAACATGAACGACAAGAGGCCCCCCAGGATCATTAAGACAATCGCCTCCAAGACC CAGAGCATTAAGAAGTACAGCACAGACATTCTGGGCAACCTGTATGAAGTGAAATCTAAGAAGC ACCCTCAGATCATCAAAAAGGGC
SEQ ID NO: 13: Amino-acid sequence of full-length SaCas9, HNH domain sequence in bold
MKRNYILGLDIGITSVGYGIIDYETRDVIDAGVRLFKEANVENNEGRRSKRGARRLKRRRRHRI QRVKKLLFDYNLLTDHSELSGINPYEARVKGLSQKLSEEEFSAALLHLAKRRGVHNVNEVEEDT GNELSTKEQISRNSKALEEKYVAELQLERLKKDGEVRGSINRFKTSDYVKEAKQLLKVQKAYHQ LDQSFIDTYIDLLETRRTYYEGPGEGSPFGWKDIKEWYEMLMGHCTYFPEELRSVKYAYNADLY NALNDLNNLVITRDENEKLEYYEKFQIIENVFKQKKKPTLKQIAKEILVNEEDIKGYRVTSTGK PEFTNLKVYHDIKDITARKEIIENAELLDQIAKILTIYQSSEDIQEELTNLNSELTQEEIEQIS NLKGYTGTHNLSLKAINLILDELWHTNDNQIAIFNRLKLVPKKVDLSQQKEIPTTLVDDFILSP VVKRSFIQSIKVINAIIKKYGLPNDIIIELAREKNSKDAQKMINEMQKRNAKTNERIEEIIRTT GKENAKYLIEKIKLHDMQEGKCLYSLEAIPLEDLLNNPFNYEVDHIIPRSVSFDNSFNNKVLVK QEENSKKGNRTPFQYLSSSDSKISYETFKKHILNLAKGKGRISKTKKEYLLEERDINRFSVQKD FINRNLVDTRYATRGLMNLLRSYFRVNNLDVKVKSINGGFTSFLRRKWKFKKERNKGYKHHAED ALIIANADFIFKEWKKLDKAKKVMENQMFEEKQAESMPEIETEQEYKEIFITPHQIKHIKDFKD YKYSHRVDKKPNRELINDTLYSTRKDDKGNTLIVNNLNGLYDKDNDKLKKLINKSPEKLLMYHH DPQTYQKLKLIMEQYGDEKNPLYKYYEETGNYLTKYSKKDNGPVIKKIKYYGNKLNAHLDITDD YPNSRNKVVKLSLKPYRFDVYLDNGVYKFVTVKNLDVIKKENYYEVNSKCYEEAKKLKKISNQA EFIASFYNNDLIKINGELYRVIGVNNDLLNRIEVNMIDITYREYLENMNDKRPPRIIKTIASKT QSIKKYSTDILGNLYEVKSKKHPQIIKKG
SEQ ID NO: 14: cDNA sequence of SaCas9ΔHNH (2697bp), GGSGGS linker replacing the HNH domain in lowercase bold
ATGAAGCGGAACTACATCCTGGGCCTGGACATCGGCATCACCAGCGTGGGCTACGGCATCATCG ACTACGAGACACGGGACGTGATCGATGCCGGCGTGCGGCTGTTCAAAGAGGCCAACGTGGAAAA CAACGAGGGCAGGCGGAGCAAGAGAGGCGCCAGAAGGCTGAAGCGGCGGAGGCGGCATAGAATC C AG AGAG T G AAG AAG CTGCTGTTC G AC T AC AAC C T G C T G AC C G AC C AC AG C GAG C T GAG C G G C A TCAACCCCTACGAGGCCAGAGTGAAGGGCCTGAGCCAGAAGCTGAGCGAGGAAGAGTTCTCTGC CGCCCTGCTGCACCTGGCCAAGAGAAGAGGCGTGCACAACGTGAACGAGGTGGAAGAGGACACC G G C AAC GAG C T G T C C AC C AAAGAG C AGAT C AG C C G GAAC AG C AAG G C C C T G GAAGAGAAAT AC G TGGCCGAACTGCAGCTGGAACGGCTGAAGAAAGACGGCGAAGTGCGGGGCAGCATCAACAGATT C AAGAC C AG C GAC T AC G T GAAAGAAG C CAAAC AG C T G C T GAAG G T G C AGAAGG C C T AC C AC C AG CTGGACCAGAGCTTCATCGACACCTACATCGACCTGCTGGAAACCCGGCGGACCTACTATGAGG GACCTGGCGAGGGCAGCCCCTTCGGCTGGAAGGACATCAAAGAATGGTACGAGATGCTGATGGG CCACTGCACCTACTTCCCCGAGGAACTGCGGAGCGTGAAGTACGCCTACAACGCCGACCTGTAC AAC G C C C T GAAC GAC C T GAAC AAT C T C G T GAT C AC C AG G GAC GAGAAC GAGAAG C T G GAAT AT T AC GAGAAG T T C C AGAT C AT C GAGAAC G T G T T C AAG C AGAAGAAGAAG C C C AC C C T GAAG C AGAT C G C C AAAGAAAT C C T C G T GAAC GAAGAGGAT AT T AAG G G C T AC AGAG T GAC CAG C AC C G G C AAG C C C GAG T T C AC C AAC C T GAAG G T G T AC CAC GAC AT C AAG GAC AT T AC C G C C C G GAAAGAGAT T A T T GAGAAC G C C GAG C T G C T G GAT C AGAT T G C C AAGAT C C T GAC CAT C T AC C AGAG CAG C GAG GA CAT C CAG GAAGAAC T GAC C AAT C T GAAC T C C GAG C T GAC C CAG GAAGAGAT C GAG C AGAT C T C T AATCTGAAGGGCTATACCGGCACCCACAACCTGAGCCTGAAGGCCATCAACCTGATCCTGGACG AGCTGTGGCACACCAACGACAACCAGATCGCTATCTTCAACCGGCTGAAGCTGGTGCCCAAGAA GGTGGACCTGTCCCAGCAGAAAGAGATCCCCACCACCCTGGTGGACGACTTCATCCTGAGCCCC G T C G T GAAGAGAAG C T T CAT C C AGAG CAT C AAAG T GAT C AAC G C CAT CAT C AAGAAG T AC G G C C TGCCCAACGACATCATTATCGAGCTGGCCCGCGAGAAGAACggaggtagcggtggctctCGGAA
CCTGGTGGATACCAGATACGCCACCAGAGGCCTGATGAACCTGCTGCGGAGCTACTTCAGAGTG AACAACCTGGACGTGAAAGTGAAGTCCATCAATGGCGGCTTCACCAGCTTTCTGCGGCGGAAGT G GAAG T T T AAGAAAGAG C G GAAC AAG G GG T AC AAG C AC C AC G C C GAG GAC G C C C T GAT CAT T G C C AAC GC C GAT T T CAT C T T C AAAGAG T G GAAGAAAC T G GAC AAG G C C AAAAAAG T GAT G GAAAAC C AGAT G T T C GAG GAAAAG C AG G C C GAGAG CAT G C C C GAGAT C GAAAC C GAG CAG GAG T AC AAAG AGAT C T T C AT C AC C C C C C AC C AGAT C AAG C AC AT T AAG GAC T T C AAG GAC T AC AAG T AC AG C C A C C G G G T G GAC AAGAAG C C T AAT AGAGAGC T GAT T AAC GAC AC C C T G T AC T C CAC C C G GAAG GAC GAC AAG G G C AAC AC C C T GAT C G T GAAC AAT C T GAAC G G C C T G T AC GAC AAG GAC AAT GAC AAG C T GAAAAAG C T GAT C AAC AAGAG C C C C GAAAAG CTGCTGATGTAC CAC CAC GAC C C C CAGAC C T A C C AGAAAC T GAAG C T GAT T AT G GAAC AG T AC G G C GAC GAGAAGAAT C C C C T G T AC AAG T AC T AC GAG GAAAC C G G GAAC T AC C T GAC C AAG T AC T C C AAAAAG GAC AAC GGCCCCGT GAT C AAGAAGA T T AAG TAT T AC G G C AAC AAAC T GAAC G C C CAT C T G GAC AT CAC C GAC GAC T AC C C C AAC AG CAG AAACAAGGTCGTGAAGCTGTCCCTGAAGCCCTACAGATTCGACGTGTACCTGGACAATGGCGTG T AC AAG T T C G T GAC C G T GAAGAAT CTGGATGTGAT C AAAAAAGAAAAC T AC T AC GAAG T GAATA G C AAG T G C T AT GAG GAAG C T AAGAAG C T GAAGAAGAT CAG C AAC CAG G C C GAG TTTATCGCCTC C T T C T AC AAC AAC GAT C T GAT C AAGAT CAAC G G C GAG C T G T AT AGAG T GAT C G G C G T GAAC AAC GAC C T G C T GAAC C G GAT C GAAG T GAAC AT GAT C GAC AT CAC C T AC C G C GAG T AC C T G GAAAAC A T GAAC GAC AAGAG G C C C C C CAG GAT CAT T AAGAC AAT C G C C T C C AAGAC C C AGAG CAT T AAGAA G T AC AG C AC AGAC AT T C T G G G CAAC C T G T AT GAAG T GAAAT C T AAGAAG CAC C C T C AGAT CAT C AAAAAGGGC
SEQ ID NO: 15: Amino-acid sequence of SaCas9ΔHNH (899 aa), linker replacing the HNH domain underlined
MKRNYILGLDIGITSVGYGIIDYETRDVIDAGVRLFKEANVENNEGRRSKRGARRLKRRRRHRI QRVKKLLFDYNLLTDHSELSGINPYEARVKGLSQKLSEEEFSAALLHLAKRRGVHNVNEVEEDT GNELSTKEQISRNSKALEEKYVAELQLERLKKDGEVRGSINRFKTSDYVKEAKQLLKVQKAYHQ LDQSFIDTYIDLLETRRTYYEGPGEGSPFGWKDIKEWYEMLMGHCTYFPEELRSVKYAYNADLY NALNDLNNLVITRDENEKLEYYEKFQIIENVFKQKKKPTLKQIAKEILVNEEDIKGYRVTSTGK PEFTNLKVYHDIKDITARKEIIENAELLDQIAKILTIYQSSEDIQEELTNLNSELTQEEIEQIS NLKGYTGTHNLSLKAINLILDELWHTNDNQIAIFNRLKLVPKKVDLSQQKEIPTTLVDDFILSP VVKRSFIQSIKVINAIIKKYGLPNDIIIELAREKNGGSGGSRNLVDTRYATRGLMNLLRSYFRV NNLDVKVKSINGGFTSFLRRKWKFKKERNKGYKHHAEDALIIANADFIFKEWKKLDKAKKVMEN QMFEEKQAESMPEIETEQEYKEIFITPHQIKHIKDFKDYKYSHRVDKKPNRELINDTLYSTRKD DKGNTLIVNNLNGLYDKDNDKLKKLINKSPEKLLMYHHDPQTYQKLKLIMEQYGDEKNPLYKYY EETGNYLTKYSKKDNGPVIKKIKYYGNKLNAHLDITDDYPNSRNKVVKLSLKPYRFDVYLDNGV YKFVTVKNLDVIKKENYYEVNSKCYEEAKKLKKISNQAEFIASFYNNDLIKINGELYRVIGVNN DLLNRIEVNMIDITYREYLENMNDKRPPRIIKTIASKTQSIKKYSTDILGNLYEVKSKKHPQII KKG
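The lengths and linker stated for SEQ ID NOs: 14 and 15 can be cross-checked with a short script. The following is only a minimal sketch: the names `LINKER_DNA`, `CODONS`, and `translate` are illustrative and not part of the listing, and the codon table covers only the codons that appear in the lowercase linker printed in SEQ ID NO: 14.

```python
# Minimal sketch: check that the linker and lengths quoted for
# SEQ ID NOs: 14 and 15 are mutually consistent.

# Lowercase linker exactly as printed inside SEQ ID NO: 14.
LINKER_DNA = "ggaggtagcggtggctct"

# Partial codon table (assumption: only the codons present in the linker
# are needed here; this is not a full genetic-code table).
CODONS = {"GGA": "G", "GGT": "G", "GGC": "G", "AGC": "S", "TCT": "S"}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string using the partial table above."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("length is not a multiple of 3")
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


CDNA_LENGTH_BP = 2697     # stated in the SEQ ID NO: 14 heading
PROTEIN_LENGTH_AA = 899   # stated in the SEQ ID NO: 15 heading

linker_peptide = translate(LINKER_DNA)
codon_count = CDNA_LENGTH_BP // 3

print(linker_peptide)                    # GGSGGS
print(codon_count == PROTEIN_LENGTH_AA)  # True
```

The 2697 bp cDNA corresponds to exactly 899 codons, which matches the stated protein length, and the 18-nucleotide linker translates to the six-residue GGSGGS peptide named in the heading.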
SEQ ID NO: 16: All-in-one SaCas9ΔHNH-based transcription activator in AAV vector, example 1 (ITR-promoter-NLS-dSaCas9ΔHNH-NLS-VP64-p65-NLS-minipA-U6-SaogRNA-ITR), ITR sequences in bold, NLS sequences in grey highlight, dSaCas9ΔHNH sequence underlined with linker replacing HNH domain in underlined lowercase, VP64-p65 sequence in bold italics, minipA sequence in grey text
cctgcaggcagctgcgcgctcgctcgctcactgaggccgcccgggcgtcgggcgacctttggtc gcccggcctcagtgagcgagcgagcgcgcagagagggagtggccaactccatcactaggggttc ctgcggcctctagactcgagCGCGTGATGAGAGCAGCCACTACGGGTCTAGGCTGCCCATGTAA GGAGGCAAGGCCTGGGGACACCCGAGATGCCTGGTTATAATTAACCCAGACATGTGGCTGCCCC CCCCCCCCCAACACCTGCTGCCTGCTAAAAATAACCCTGTCCCTGGTGGccctgcatgcccACT CACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCA ACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTA CGGTGGGAGGTCTATATAAGCAGAGCTGGTTTAGTGAACCGTCAGATCcgaccggtgccaccili;
CTGGGCCTGGACATCGGCATCACCAGCGTGGGCTACGGCATCATCGACTACGAGACACGGGACG TGATCGATGCCGGCGTGCGGCTGTTCAAAGAGGCCAACGTGGAAAACAACGAGGGCAGGCGGAG C AAGAGAG G C G C C AGAAG G C T GAAG C G GC G GAG G C G G C AT AGAAT C C AGAGAG T GAAGAAG C T G CTGTTCGACTACAACCTGCTGACCGACCACAGCGAGCTGAGCGGCATCAACCCCTACGAGGCCA GAGTGAAGGGCCTGAGCCAGAAGCTGAGCGAGGAAGAGTTCTCTGCCGCCCTGCTGCACCTGGC C AAG AG AAG AG G C G T G C AC AAC G T G AAC GAG G T G GAAG AG G AC AC C G G C AAC GAG CTGTCCACC AAAG AG C AG AT C AG C C G G AAC AG C AAG G C C C T G G AAG AG AAAT AC G T G G C C GAAC TGCAGCTGG AAC G GC T GAAGAAAGAC G G C GAAG T G C GG G G C AG CAT C AAC AGAT T C AAGAC C AG C GAC T AC G T GAAAGAAG C C AAAC AG C T G C T GAAG G T GC AGAAG G C C T AC C AC C AG C T G GAC C AGAG C T T CAT C GACACCTACATCGACCTGCTGGAAACCCGGCGGACCTACTATGAGGGACCTGGCGAGGGCAGCC CCTTCGGCTGGAAGGACATCAAAGAATGGTACGAGATGCTGATGGGCCACTGCACCTACTTCCC
CGAGGAACTGCGGAGCGTGAAGTACGCCTACAACGCCGACCTGTACAACGCCCTGAACGACCTG AACAATCTCGTGATCACCAGGGACGAGAACGAGAAGCTGGAATATTACGAGAAGTTCCAGATCA TCGAGAACGTGTTCAAGCAGAAGAAGAAGCCCACCCTGAAGCAGATCGCCAAAGAAATCCTCGT GAACGAAGAGGATATTAAGGGCTACAGAGTGACCAGCACCGGCAAGCCCGAGTTCACCAACCTG AAGGTGTACCACGACATCAAGGACATTACCGCCCGGAAAGAGATTATTGAGAACGCCGAGCTGC TGGATCAGATTGCCAAGATCCTGACCATCTACCAGAGCAGCGAGGACATCCAGGAAGAACTGAC CAATCTGAACTCCGAGCTGACCCAGGAAGAGATCGAGCAGATCTCTAATCTGAAGGGCTATACC GGCACCCACAACCTGAGCCTGAAGGCCATCAACCTGATCCTGGACGAGCTGTGGCACACCAACG ACAACCAGATCGCTATCTTCAACCGGCTGAAGCTGGTGCCCAAGAAGGTGGACCTGTCCCAGCA GAAAGAGATCCCCACCACCCTGGTGGACGACTTCATCCTGAGCCCCGTCGTGAAGAGAAGCTTC ATCCAGAGCATCAAAGTGATCAACGCCATCATCAAGAAGTACGGCCTGCCCAACGACATCATTA TCGAGCTGGCCCGCGAGAAGAACggaggtagcggtggctctCGGAACCTGGTGGATACCAGATA CGCCACCAGAGGCCTGATGAACCTGCTGCGGAGCTACTTCAGAGTGAACAACCTGGACGTGAAA GTGAAGTCCATCAATGGCGGCTTCACCAGCTTTCTGCGGCGGAAGTGGAAGTTTAAGAAAGAGC GGAACAAGGGGTACAAGCACCACGCCGAGGACGCCCTGATCATTGCCAACGCCGATTTCATCTT CAAAGAGTGGAAGAAACTGGACAAGGCCAAAAAAGTGATGGAAAACCAGATGTTCGAGGAAAAG CAGGCCGAGAGCATGCCCGAGATCGAAACCGAGCAGGAGTACAAAGAGATCTTCATCACCCCCC ACCAGATCAAGCACATTAAGGACTTCAAGGACTACAAGTACAGCCACCGGGTGGACAAGAAGCC TAATAGAGAGCTGATTAACGACACCCTGTACTCCACCCGGAAGGACGACAAGGGCAACACCCTG ATCGTGAACAATCTGAACGGCCTGTACGACAAGGACAATGACAAGCTGAAAAAGCTGATCAACA AGAGCCCCGAAAAGCTGCTGATGTACCACCACGACCCCCAGACCTACCAGAAACTGAAGCTGAT TATGGAACAGTACGGCGACGAGAAGAATCCCCTGTACAAGTACTACGAGGAAACCGGGAACTAC CTGACCAAGTACTCCAAAAAGGACAACGGCCCCGTGATCAAGAAGATTAAGTATTACGGCAACA AACTGAACGCCCATCTGGACATCACCGACGACTACCCCAACAGCAGAAACAAGGTCGTGAAGCT GTCCCTGAAGCCCTACAGATTCGACGTGTACCTGGACAATGGCGTGTACAAGTTCGTGACCGTG AAGAATCTGGATGTGATCAAAAAAGAAAACTACTACGAAGTGAATAGCAAGTGCTATGAGGAAG CTAAGAAGCTGAAGAAGATCAGCAACCAGGCCGAGTTTATCGCCTCCTTCTACAACAACGATCT GATCAAGATCAACGGCGAGCTGTATAGAGTGATCGGCGTGAACAACGACCTGCTGAACCGGATC GAAGTGAACATGATCGACATCACCTACCGCGAGTACCTGGAAAACATGAACGACAAGAGGCCCC CCAGGATCATTAAGACAATCGCCTCCAAGACCCAGAGCATTAAGAAGTACAGCACAGACATTCT 
GGGCAACCTGTATGAAGTGAAATCTAAGAAGCACCCTCAGATCATCAAAAAGGGC||iiiiii|||i||l|||i; ^^^^^^^^^^^^^^^^^^^^^^^^^ ggatccggaggaggaggaagcggag gaggaggtagcGGACGGGCTGACGCATTGGACGATTTTGATCTGGATATGCTGGGAAGTGACGC
CCTCGATGATTTTGACCTTGACATGCTTGGTTCGGATGCCCTTGATGACTTTGACCTCGACATG
CTCGGCAGTGACGCCCTTGATGATTTCGACCTGGACATGCTGATTAACqgctctggcagcggca qcCCTTCAGGGCAGATCAGCAACCAGGCCCTGGCTCTGGCCCCTAGCTCCGCTCCAGTGCTGGC
CCAGACTATGGTGCCCTCTAGTGCTATGGTGCCTCTGGCCCAGCCACCTGCTCCAGCCCCTGTG
CTGACCCCAGGACCACCCCAGTCACTGAGCGCTCCAGTGCCCAAGTCTACACAGGCCGGCGAGG
GGACTCTGAGTGAAGCTCTGCTGCACCTGCAGTTCGACGCTGATGAGGACCTGGGAGCTCTGCT
GGGGAACAGCACCGATCCCGGAGTGTTCACAGATCTGGCCTCCGTGGACAACTCTGAGTTTCAG
CAGCTGCTGAATCAGGGCGTGTCCATGTCTCATAGTACAGCCGAACCAATGCTGATGGAGTACC
CCGAAGCCATTACCCGGCTGGTGACCGGCAGCCAGCGGCCCCCCGACCCCGCTCCAACTCCCCT
GGGAACCAGCGGCCTGCCTAATGGGCTGTCCGGAGATGAAGACTTCTCAAGCATCGCTGATATG
GACTTTAGTGCCCTGCTGTCACAGATTTCCTCTAGTGGGCAGqgacgtacgtccaaacgcacgg
W^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^aaACTAGTAATAAAaG ATCtTTATTTTCATTaGATCtGTGTGTTGGTTTTTTGTGTGggtaccgagggcctatttcccat gattccttcatatttgcatatacgatacaaggctgttagagagataattggaattaatttgact gtaaacacaaagatattagtacaaaatacgtgacgtagaaagtaataatttcttgggtagtttg cagttttaaaattatgttttaaaatggactatcatatgcttaccgtaacttgaaagtatttcga tttcttggctttatatatcttgtggaaaggacgaaacACCGGAGACCacggcaGGTCTCaGTTT cAGTACTCTGGAAACAGAATCTACTgAAACAAGGCAAAATGCCGTGTTTATCTCGTCAACTTGT TGGCGAGATTTTTgcggccgcaggaacccctagtgatggagttggccactccctctctgcgcgc tcgctcgctcactgaggccgggcgaccaaaggtcgcccgacgcccgggctttgcccgggcggcc tcagtgagcgagcgagcgcgcagctgcctgcagg
SEQ ID NO: 17: All-in-one SaCas9ΔHNH-based transcription activator in AAV vector, example 2 (ITR-promoter-NLS-dSaCas9ΔHNH-NLS-VP64-HSF1-NLS-minipA-U6-SaogRNA-ITR), ITR sequences in bold, NLS sequences in grey highlight, dSaCas9ΔHNH sequence underlined with linker replacing HNH domain in underlined lowercase, VP64-HSF1 sequence in bold italics, minipA sequence in grey text
cctgcaggcagctgcgcgctcgctcgctcactgaggccgcccgggcgtcgggcgacctttggtc gcccggcctcagtgagcgagcgagcgcgcagagagggagtggccaactccatcactaggggttc ctgcggcctctagactcgagCGCGTGATGAGAGCAGCCACTACGGGTCTAGGCTGCCCATGTAA GGAGGCAAGGCCTGGGGACACCCGAGATGCCTGGTTATAATTAACCCAGACATGTGGCTGCCCC CCCCCCCCCAACACCTGCTGCCTGCTAAAAATAACCCTGTCCCTGGTGGccctgcatgcccACT
CACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCA ACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTA CGGTGGGAGGTCTATATAAGCAGAGCTGGTTTAGTGAACCGTCAGATCcgaccggtgccacciii;
CTGGGCCTGGACATCGGCATCACCAGCGTGGGCTACGGCATCATCGACTACGAGACACGGGACG TGATCGATGCCGGCGTGCGGCTGTTCAAAGAGGCCAACGTGGAAAACAACGAGGGCAGGCGGAG CAAGAGAGGCGCCAGAAGGCTGAAGCGGCGGAGGCGGCATAGAATCCAGAGAGTGAAGAAGCTG CTGTTCGACTACAACCTGCTGACCGACCACAGCGAGCTGAGCGGCATCAACCCCTACGAGGCCA GAGTGAAGGGCCTGAGCCAGAAGCTGAGCGAGGAAGAGTTCTCTGCCGCCCTGCTGCACCTGGC CAAGAGAAGAGGCGTGCACAACGTGAACGAGGTGGAAGAGGACACCGGCAACGAGCTGTCCACC AAAGAGCAGATCAGCCGGAACAGCAAGGCCCTGGAAGAGAAATACGTGGCCGAACTGCAGCTGG AACGGCTGAAGAAAGACGGCGAAGTGCGGGGCAGCATCAACAGATTCAAGACCAGCGACTACGT GAAAGAAGCCAAACAGCTGCTGAAGGTGCAGAAGGCCTACCACCAGCTGGACCAGAGCTTCATC GACACCTACATCGACCTGCTGGAAACCCGGCGGACCTACTATGAGGGACCTGGCGAGGGCAGCC CCTTCGGCTGGAAGGACATCAAAGAATGGTACGAGATGCTGATGGGCCACTGCACCTACTTCCC CGAGGAACTGCGGAGCGTGAAGTACGCCTACAACGCCGACCTGTACAACGCCCTGAACGACCTG AACAATCTCGTGATCACCAGGGACGAGAACGAGAAGCTGGAATATTACGAGAAGTTCCAGATCA TCGAGAACGTGTTCAAGCAGAAGAAGAAGCCCACCCTGAAGCAGATCGCCAAAGAAATCCTCGT GAACGAAGAGGATATTAAGGGCTACAGAGTGACCAGCACCGGCAAGCCCGAGTTCACCAACCTG AAGGTGTACCACGACATCAAGGACATTACCGCCCGGAAAGAGATTATTGAGAACGCCGAGCTGC TGGATCAGATTGCCAAGATCCTGACCATCTACCAGAGCAGCGAGGACATCCAGGAAGAACTGAC CAATCTGAACTCCGAGCTGACCCAGGAAGAGATCGAGCAGATCTCTAATCTGAAGGGCTATACC GGCACCCACAACCTGAGCCTGAAGGCCATCAACCTGATCCTGGACGAGCTGTGGCACACCAACG ACAACCAGATCGCTATCTTCAACCGGCTGAAGCTGGTGCCCAAGAAGGTGGACCTGTCCCAGCA GAAAGAGATCCCCACCACCCTGGTGGACGACTTCATCCTGAGCCCCGTCGTGAAGAGAAGCTTC ATCCAGAGCATCAAAGTGATCAACGCCATCATCAAGAAGTACGGCCTGCCCAACGACATCATTA TCGAGCTGGCCCGCGAGAAGAACggaggtagcggtggctctCGGAACCTGGTGGATACCAGATA CGCCACCAGAGGCCTGATGAACCTGCTGCGGAGCTACTTCAGAGTGAACAACCTGGACGTGAAA GTGAAGTCCATCAATGGCGGCTTCACCAGCTTTCTGCGGCGGAAGTGGAAGTTTAAGAAAGAGC GGAACAAGGGGTACAAGCACCACGCCGAGGACGCCCTGATCATTGCCAACGCCGATTTCATCTT CAAAGAGTGGAAGAAACTGGACAAGGCCAAAAAAGTGATGGAAAACCAGATGTTCGAGGAAAAG CAGGCCGAGAGCATGCCCGAGATCGAAACCGAGCAGGAGTACAAAGAGATCTTCATCACCCCCC ACCAGATCAAGCACATTAAGGACTTCAAGGACTACAAGTACAGCCACCGGGTGGACAAGAAGCC
TAATAGAGAGCTGATTAACGACACCCTGTACTCCACCCGGAAGGACGACAAGGGCAACACCCTG ATCGTGAACAATCTGAACGGCCTGTACGACAAGGACAATGACAAGCTGAAAAAGCTGATCAACA AGAGCCCCGAAAAGCTGCTGATGTACCACCACGACCCCCAGACCTACCAGAAACTGAAGCTGAT TATGGAACAGTACGGCGACGAGAAGAATCCCCTGTACAAGTACTACGAGGAAACCGGGAACTAC CTGACCAAGTACTCCAAAAAGGACAACGGCCCCGTGATCAAGAAGATTAAGTATTACGGCAACA AACTGAACGCCCATCTGGACATCACCGACGACTACCCCAACAGCAGAAACAAGGTCGTGAAGCT GTCCCTGAAGCCCTACAGATTCGACGTGTACCTGGACAATGGCGTGTACAAGTTCGTGACCGTG AAGAATCTGGATGTGATCAAAAAAGAAAACTACTACGAAGTGAATAGCAAGTGCTATGAGGAAG CTAAGAAGCTGAAGAAGATCAGCAACCAGGCCGAGTTTATCGCCTCCTTCTACAACAACGATCT GATCAAGATCAACGGCGAGCTGTATAGAGTGATCGGCGTGAACAACGACCTGCTGAACCGGATC GAAGTGAACATGATCGACATCACCTACCGCGAGTACCTGGAAAACATGAACGACAAGAGGCCCC CCAGGATCATTAAGACAATCGCCTCCAAGACCCAGAGCATTAAGAAGTACAGCACAGACATTCT GGGCAACCTGTATGAAGTGAAATCTAAGAAGCACCCTCAGATCATCAAAAAGGGC^^^^^¾ ^^^^^^^^^^^^^^^^^^^^^^^^^^ggatccggaggaggaggaagcggag gaggaggtagcGGACGGGCTGACGCATTGGACGATTTTGATCTGGATATGCTGGGAAGTGACGC CCTCGATGATTTTGACCTTGACATGCTTGGTTCGGATGCCCTTGATGACTTTGACCTCGACATG CTCGGCAGTGACGCCCTTGATGATTTCGACCTGGACATGCTGATTAACqgctctggcagcggca gcGGCTTCAGCGTGGACACCAGTGCCCTGCTGGACCTGTTCAGCCCCTCGGTGACCGTGCCCGA CATGAGCCTGCCTGACCTTGACAGCAGCCTGGCCAGTATCCAAGAGCTCCTGTCTCCCCAGGAG CCCCCCAGGCCTCCCGAGGCAGAGAACAGCAGCCCGGATTCAGGGAAGCAGCTGGTGCACTACA CAGCGCAGCCGCTGTTCCTGCTGGACCCCGGCTCCGTGGACACCGGGAGCAACGACCTGCCGGT GCTGTTTGAGCTGGGAGAGGGCTCCTACTTCTCCGAAGGGGACGGCTTCGCCGAGGACCCCACC ATCTCCCTGCTGACAGGCTCGGAGCCTCCCAAAGCCAAGGACCCCACTGTCTCCqgacgtacgt
TAGTAATAAAaGATCtTTATTTTCATTaGATCtGTGTGTTGGTTTTTTGTGTGggtaccgaggg cctatttcccatgattccttcatatttgcatatacgatacaaggctgttagagagataattgga attaatttgactgtaaacacaaagatattagtacaaaatacgtgacgtagaaagtaataatttc ttgggtagtttgcagttttaaaattatgttttaaaatggactatcatatgcttaccgtaacttg aaagtatttcgatttcttggctttatatatcttgtggaaaggacgaaacACCGGAGACCacggc aGGTCTCaGTTTcAGTACTCTGGAAACAGAATCTACTgAAACAAGGCAAAATGCCGTGTTTATC TCGTCAACTTGTTGGCGAGATTTTTgcggccgcaggaacccctagtgatggagttggccactcc ctctctgcgcgctcgctcgctcactgaggccgggcgaccaaaggtcgcccgacgcccgggcttt gcccgggcggcctcagtgagcgagcgagcgcgcagctgcctgcagg
SEQ ID NO: 18: All-in-one SaCas9ΔHNH-based transcription repressor in AAV vector (ITR-promoter-NLS-dSaCas9ΔHNH-NLS-KRAB-NLS-minipA-U6-SaogRNA-ITR), ITR sequences in bold, NLS sequences in grey highlight, dSaCas9ΔHNH sequence underlined with linker replacing HNH domain in underlined lowercase, KRAB sequence in bold italics, minipA sequence in grey text
cctgcaggcagctgcgcgctcgctcgctcactgaggccgcccgggcgtcgggcgacctttggtc gcccggcctcagtgagcgagcgagcgcgcagagagggagtggccaactccatcactaggggttc ctgcggcctctagactcgagCGCGTGATGAGAGCAGCCACTACGGGTCTAGGCTGCCCATGTAA GGAGGCAAGGCCTGGGGACACCCGAGATGCCTGGTTATAATTAACCCAGACATGTGGCTGCCCC CCCCCCCCCAACACCTGCTGCCTGCTAAAAATAACCCTGTCCCTGGTGGccctgcatgcccACT CACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCA ACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTA CGGTGGGAGGTCTATATAAGCAGAGCTGGTTTAGTGAACCGTCAGATCcgaccggtgccaccB
CTGGGCCTGGACATCGGCATCACCAGCGTGGGCTACGGCATCATCGACTACGAGACACGGGACG TGATCGATGCCGGCGTGCGGCTGT TCAAAGAGGCCAACGTGGAAAACAACGAGGGCAGGCGGAG C AAGAGAG G C G C C AGAAG G C T GAAG C G GC G GAG G C G G C AT AGAAT C C AGAGAG T GAAGAAG C T G CTGT TCGACTACAACCTGCTGACCGACCACAGCGAGCTGAGCGGCATCAACCCCTACGAGGCCA GAGTGAAGGGCCTGAGCCAGAAGCTGAGCGAGGAAGAGT TCTCTGCCGCCCTGCTGCACCTGGC C AAG AG AAG AG G C G T G C AC AAC G T G AAC GAG G T G GAAG AG G AC AC C G G C AAC GAG CTGTCCACC AAAG AG C AG AT C AG C C G G AAC AG C AAG G C C C T G G AAG AG AAAT AC G T G G C C GAAC TGCAGCTGG AAC G GC T GAAGAAAGAC G G C GAAG T G C GG G G C AG CAT C AAC AGAT T C AAGAC C AG C GAC T AC G T GAAAGAAG C C AAAC AG C T G C T GAAG G T GC AGAAG G C C T AC C AC C AG C T G GAC C AGAG C T T CAT C GACACCTACATCGACCTGCTGGAAACCCGGCGGACCTACTATGAGGGACCTGGCGAGGGCAGCC CCT TCGGCTGGAAGGACATCAAAGAATGGTACGAGATGCTGATGGGCCACTGCACCTACT TCCC CGAGGAACTGCGGAGCGTGAAGTACGCCTACAACGCCGACCTGTACAACGCCCTGAACGACCTG AAC AAT C T C G T GAT C AC C AG G GAC GAGAAC GAGAAG C T G GAAT AT T AC GAGAAG T T C C AGAT C A T C GAGAAC G T G T T C AAG C AGAAGAAGAAG C C C AC C C T GAAG C AGAT C G C C AAAGAAAT C C T C G T GAAC GAAGAG GAT AT T AAG G G C T AC AGAG T GAC C AG C AC C G G C AAG C C C GAG T T C AC C AAC C T G AAGGTGTACCACGACATCAAGGACATTACCGCCCGGAAAGAGATTATTGAGAACGCCGAGCTGC
TGGATCAGATTGCCAAGATCCTGACCATCTACCAGAGCAGCGAGGACATCCAGGAAGAACTGAC CAATCTGAACTCCGAGCTGACCCAGGAAGAGATCGAGCAGATCTCTAATCTGAAGGGCTATACC GGCACCCACAACCTGAGCCTGAAGGCCATCAACCTGATCCTGGACGAGCTGTGGCACACCAACG ACAACCAGATCGCTATCTTCAACCGGCTGAAGCTGGTGCCCAAGAAGGTGGACCTGTCCCAGCA GAAAGAGATCCCCACCACCCTGGTGGACGACTTCATCCTGAGCCCCGTCGTGAAGAGAAGCTTC ATCCAGAGCATCAAAGTGATCAACGCCATCATCAAGAAGTACGGCCTGCCCAACGACATCATTA TCGAGCTGGCCCGCGAGAAGAACggaggtagcggtggctctCGGAACCTGGTGGATACCAGATA CGCCACCAGAGGCCTGATGAACCTGCTGCGGAGCTACTTCAGAGTGAACAACCTGGACGTGAAA GTGAAGTCCATCAATGGCGGCTTCACCAGCTTTCTGCGGCGGAAGTGGAAGTTTAAGAAAGAGC GGAACAAGGGGTACAAGCACCACGCCGAGGACGCCCTGATCATTGCCAACGCCGATTTCATCTT CAAAGAGTGGAAGAAACTGGACAAGGCCAAAAAAGTGATGGAAAACCAGATGTTCGAGGAAAAG CAGGCCGAGAGCATGCCCGAGATCGAAACCGAGCAGGAGTACAAAGAGATCTTCATCACCCCCC ACCAGATCAAGCACATTAAGGACTTCAAGGACTACAAGTACAGCCACCGGGTGGACAAGAAGCC TAATAGAGAGCTGATTAACGACACCCTGTACTCCACCCGGAAGGACGACAAGGGCAACACCCTG ATCGTGAACAATCTGAACGGCCTGTACGACAAGGACAATGACAAGCTGAAAAAGCTGATCAACA AGAGCCCCGAAAAGCTGCTGATGTACCACCACGACCCCCAGACCTACCAGAAACTGAAGCTGAT TATGGAACAGTACGGCGACGAGAAGAATCCCCTGTACAAGTACTACGAGGAAACCGGGAACTAC CTGACCAAGTACTCCAAAAAGGACAACGGCCCCGTGATCAAGAAGATTAAGTATTACGGCAACA AACTGAACGCCCATCTGGACATCACCGACGACTACCCCAACAGCAGAAACAAGGTCGTGAAGCT GTCCCTGAAGCCCTACAGATTCGACGTGTACCTGGACAATGGCGTGTACAAGTTCGTGACCGTG AAGAATCTGGATGTGATCAAAAAAGAAAACTACTACGAAGTGAATAGCAAGTGCTATGAGGAAG CTAAGAAGCTGAAGAAGATCAGCAACCAGGCCGAGTTTATCGCCTCCTTCTACAACAACGATCT GATCAAGATCAACGGCGAGCTGTATAGAGTGATCGGCGTGAACAACGACCTGCTGAACCGGATC GAAGTGAACATGATCGACATCACCTACCGCGAGTACCTGGAAAACATGAACGACAAGAGGCCCC CCAGGATCATTAAGACAATCGCCTCCAAGACCCAGAGCATTAAGAAGTACAGCACAGACATTCT GGGCAACCTGTATGAAGTGAAATCTAAGAAGCACCCTCAGATCATCAAAAAGGG^^^^^^ . 
^^^^^^^^^g^^g^^^^^^g^^^ ggatccggaggaggaggaagcggag gaggaggtagc A TGCTAAGTCACTGACTGCCTGGTCCCGGACACTGGTGACCTTCAAGGA TGT GTTTGTGGACTTCACCAGGGAGGAGTGGAAGCTGCTGGACACTGCTCAGCAGATCCTGTACAGA AA GTGA GCTGGAGAACTA AAGAACCTGGTTTCCTTGGGTTA TCAGCTTACTAAGCCAGA TG TGA TCCTCCGGTTGGAGAAGGGAGAAGAGCCCTGGCTGGTGqgacgtacgtccaaacgcacggc ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^¾taaACTAGTAA AAAaGA. TCtTTATTTTCATTaGATCtGTGTGTTGGTTTTTTGTGTGggtaccgagggcctatttcccatg attccttcatatttgcatatacgatacaaggctgttagagagataattggaattaatttgactg taaacacaaagatattagtacaaaatacgtgacgtagaaagtaataatttcttgggtagtttgc agttttaaaattatgttttaaaatggactatcatatgcttaccgtaacttgaaagtatttcgat ttcttggctttatatatcttgtggaaaggacgaaacACCGGAGACCacggcaGGTCTCaGTTTc AGTACTCTGGAAACAGAATCTACTgAAACAAGGCAAAATGCCGTGTTTATCTCGTCAACTTGTT GGCGAGATTTTTgcggccgcaggaacccctagtgatggagttggccactccctctctgcgcgct cgctcgctcactgaggccgggcgaccaaaggtcgcccgacgcccgggctttgcccgggcggcct cagtgagcgagcgagcgcgcagctgcctgcagg
SEQ ID NO: 19: cDNA sequence of SpCas9 HNH domain actacccagaagggacagaagaacagtagggaaaggatgaagaggattgaagagggtataaaag aactggggtcccaaatccttaaggaacacccagttgaaaacacccagcttcagaatgagaagct ctacctgtactacctgcagaacggcagggacatgtacgtggatcaggaactggacatcaatcgg ctctccgactacgacgtggatcatatcgtgccccagtcttttctcaaagatgattctattgata ataaagtgttgacaagatccgataaaaatagagggaagagtgataacgtcccctcagaagaagt tgtcaagaaaatgaaaaattattggcggcagctgctgaacgccaaactgatcacacaacggaag ttcgataatctgactaaggctgaacgaggtggcctgtctgagttggataaagccggcttcatca aa
SEQ ID NO: 20: cDNA sequence of SaCas9 HNH domain
TCCAAGGACGCCCAGAAAATGATCAACGAGATGCAGAAGCGGAACgcaaaGACCAACGAGCGGA TCGAGGAAATCATCCGGACCACCGGCAAAGAGAACGCCAAGTACCTGATCGAGAAGATCAAGCT GCACGACATGCAGGAAGGCAAGTGCCTGTACAGCCTGGAAGCCATCCCTCTGGAAGATCTGCTG AACAACCCCTTCAACTATGAGGTGGACCACATCATCCCCAGAAGCGTGTCCTTCGACAACAGCT TCAACAACAAGGTGCTCGTGAAGCAGGAAGAAAACAGCAAGAAGGGCAACCGGACCCCATTCCA GTACCTGAGCAGCAGCGACAGCAAGATCAGCTACGAAACCTTCAAGAAGCACATCCTGAATCTG GCCAAGGGCAAGGGCAGAATCAGCAAGACCAAGAAAGAGTATCTGCTGGAAGAACGGGACATCA ACAGGTTCTCCGTGCAGAAAGACTTCATCAAC
SEQ ID NO: 21: cDNA sequence of CjCas9 HNH domain
GGGAAGAACCATTCTCAGCGAGCAAAGATCGAGAAAGAGCAGAATGAGAACTACAAAGCCAAGA AAGACGCCGAACTGGAGTGCGAAAAGCTGGGGCTTAAAATAAACAGTAAAAACATCCTGAAATT AAGATTGTTCAAAGAGCAAAAGGAGTTTTGCGCCTACTCAGGGGAAAAAATCAAAATATCAGAC CTGCAGGACGAGAAAATGCTGGAGATCGACCACATCTATCCGTATAGCAGGTCATTTGACGATT CCTACATGAACAAAGTGCTTGTGTTTACCAAACAGAACCAAGAAAAGCTGAACCAAACCCCCTT TGAGGCTTTCGGAAACGACTCAGCCAAGTGGCAGAAAATCGAAGTCCTAGCCAAGAATCTGCCT ACAAAAAAACAAAAGAGGATTCTTGATAAGAACTATAAGGACAAGGAACAGAAAAACTTTAAAG AC
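The ΔHNH constructs in this listing are described as Cas9 sequences in which a linker replaces the HNH domain. A minimal sketch of that replacement on a protein string is given below. The function name `replace_domain` is hypothetical, and `toy_domain` is a shortened stand-in for the full HNH domain, used only for illustration; the flanking residues and the GGSGGS linker are taken from the junction visible in SEQ ID NO: 15.

```python
# Minimal sketch: swap one internal domain of a protein string for a linker.

def replace_domain(protein: str, domain: str, linker: str = "GGSGGS") -> str:
    """Return `protein` with its single occurrence of `domain` replaced by `linker`."""
    if protein.count(domain) != 1:
        raise ValueError("domain must occur exactly once in the protein")
    start = protein.index(domain)
    return protein[:start] + linker + protein[start + len(domain):]


# Flanking residues as they appear around the linker in SEQ ID NO: 15.
upstream = "IELAREKN"
downstream = "RNLVDTRY"

# Assumption: a truncated placeholder for the HNH domain, not the real domain.
toy_domain = "SKDAQKMINE"

toy_protein = upstream + toy_domain + downstream
result = replace_domain(toy_protein, toy_domain)
print(result)  # IELAREKNGGSGGSRNLVDTRY
```

The printed junction, IELAREKNGGSGGSRNLVDTRY, is the same junction that appears in the SaCas9ΔHNH amino-acid sequence. Requiring exactly one occurrence of the domain guards against silently editing the wrong site.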
SEQ ID NO: 22: cDNA sequence of dCjCas9ΔHNH (2523 bp), linker replacing the HNH domain in bold italics
ATGGCCCGCATCCTCGCTTTCGCCATCGGAATCTCTAGTATCGGATGGGCCTTCTCTGAAAACG ACGAACTGAAAGACTGCGGCGTGAGAATCTTCACAAAGGTTGAAAACCCTAAAACAGGCGAGTC TTTAGCTCTGCCACGTAGGTTGGCCCGCTCCGCCCGAAAAAGGCTGGCTCGGCGGAAGGCTCGC C T C AAC C AC T T GAAG C AT T T GAT AG C T AAT GAG T T CAAAC T GAAC T AC G AAGAT T AC C AG T C C T TCGACGAGTCATTGGCAAAAGCCTACAAAGGCAGCCTTATCAGTCCTTATGAGTTGAGATTTCG CGCACTCAACGAACTGCTTTCTAAGCAAGACTTTGCTAGGGTCATTCTGCACATCGCAAAACGG C GAG G T T AT GAC GAT AT C AAGAAC T C C GAC GAT AAAGAAAAG G GAG C CAT T C T C AAG G C GAT C A AAC AGAAT GAG GAAAAAT T G G CAAAC T AC C AGAG T G T G G G C GAG T AT C T G TAT AAAGAG T AT T T CCAGAAGT T T AAG GAAAAC AG C AAG GAG T T T ACAAAC G T CAGAAAT AAAAAGGAG T C T T AC GAG AGAT GC AT C G C G C AG T CAT T C C T C AAAGAT GAG C T GAAG C T GAT AT T T AAGAAG C AAC G C GAAT TTGGTTTCTCATTCTCTAAGAAGTTCGAAGAGGAGGTTCTTTCCGTGGCGTTTTACAAGAGGGC GCTCAAAGACTTCTCCCACCTGGTTGGTAACTGTAGTTTCTTCACGGATGAGAAGCGAGCTCCC AAAAATTCTCCCCTGGCTTTCATGTTTGTTGCCCTGACTCGGATCATTAACCTGCTGAACAACC T GAAAAAT AC T GAAG G GAT C T T G TAT AC GAAG GAC GAC C T AAAT G C AC T C C T GAAT GAAG T G C T C AAAAAC G GAAC T C T AAC C TAT AAAC AGAC C AAGAAAT T AC TGGGGCTCTCT GAC GAC T AC GAG T T C AAG G G C GAGAAG G G T AC T T AT T T TAT C GAAT T C AAAAAG TAT AAG GAG T T CAT TAAAG CAT T G G G GGAAC AC AAC C T C AG C C AG GAC GAT C T C AAT GAAAT T G C C AAG GAC AT C AC G C T GAT T AA AGAC GAGAT AAAAC T G AAAAAG G C AC T GG C C AAG TAT GAC C T C AAC C AGAAC C AGAT C GAC T C T CTGTCCAAGCTGGAGTTCAAAGACCACCTAAACATATCCTTCAAAGCCCTGAAACTGGTCACCC CTCTAATGCTCGAAGGAAAAAAATACGACGAGGCGTGTAATGAACTGAATCTTAAGGTGGCCAT
CAAT GAG GAT AAGAAG GAC T T T C T T C C AG C C T T T AAC GAGAC AT AT T AC AAAGAC GAG G T C AC A AACCCGGTTGTGCTGAGGGCCATAAAAGAGTATCGGAAGGTTCTGAATGCCCTCCTGAAGAAGT AC G G CAAAG T G C AC AAAAT AAAT AT C GAAT T G G C TAG G GAG G T G GGAGGTAGCGGTGGCTC AG GAAC C T GAAT GAC AC GAG G T AC AT T G C GC GAC T G G T T C TAAAC TAT AC C AAAGAC T AC C T G GAT T T C C T C C C T C T GAG C GAC GAC GAGAAT AC TAAAC T GAAT GAT AC C CAGAAAGG C T CAAAG G T C C ACGTTGAGGCTAAGTCCGGGATGCTGACTAGCGCCCTCCGCCACACGTGGGGCTTCAGCGCCAA AGAT C G GAAT AAT CAT C T T CAT C AC G C TAT T GAT G C AG T AAT CAT AG C C T AC G C T AAC AAC AG C ATCGTGAAAGCCTTCTCCGATTTCAAGAAAGAACAGGAGTCTAATAGCGCCGAGTTGTACGCCA AGAAAAT T T C C GAAT T G GAC TAT AAAAAT AAGAGAAAAT T C T T C GAAC C C T T C T C C G G G T T T C G C C AAAAG G T C T T AGAT AAGAT C GAC GAGAT T T T C G T T T C C AAG C C C GAAAGAAAAAAG C C T T C A G G G G CAC T G C AC GAAGAGAC AT T C C G C AAG GAAGAG GAAT T T T AC CAAT C T T AC G G T G G T AAAG AGGGAGTTCTGAAGGCTCTGGAGCTTGGGAAGATCCGCAAGGTAAACGGGAAAATCGTGAAAAA C G G G GAC AT G T T C AG G G T G GAT AT C T T CAAG C AC AAAAAGAC C AAC AAG T T C T AC G C AG T AC C C ATCTACACTATGGATTTCGCTTTAAAGGTTCTCCCAAATAAGGCGGTGGCTCGATCGAAGAAAG GAGAGAT CAAG GAC TGGAT C T T AAT G GAT G AAAAT TACGAGT T TTGCTTCTCGCTC TACAAAGA TAGCCTGATTCTGATC C AGAC AAAAGACAT G C AG GAAC CAGAAT TTGTTTAT TAT AAC G C C T T C AC GAGC AG T AC AG TGTCCCTGATTGT GAG CAAG CAT GAT AAC AAG T T C GAGAC T C T G T C T AAGA AT C AGAAAAT C C T T T T C AAGAAC G C C AAC GAGAAG GAG G T CAT C G CAAAG T CAAT T G G CAT C C A AAAC C T GAAG G T G T T C GAGAAAT AC AT AG T G T C C G C AC T C G G T GAAG T AAC T AAAG C C GAAT T T C GAC AG C G C GAG GAT T T T AAGAAAAG C
SEQ ID NO: 23: Amino-acid sequence of dCjCas9ΔHNH (841 aa), linker replacing the HNH domain in bold italics
MARILAFAIGISSIGWAFSENDELKDCGVRIFTKVENPKTGESLALPRRLARSARKRLARRKAR LNHLKHLIANEFKLNYEDYQSFDESLAKAYKGSLISPYELRFRALNELLSKQDFARVILHIAKR RGYDDIKNSDDKEKGAILKAIKQNEEKLANYQSVGEYLYKEYFQKFKENSKEFTNVRNKKESYE RCIAQSFLKDELKLIFKKQREFGFSFSKKFEEEVLSVAFYKRALKDFSHLVGNCSFFTDEKRAP KNSPLAFMFVALTRIINLLNNLKNTEGILYTKDDLNALLNEVLKNGTLTYKQTKKLLGLSDDYE FKGEKGTYFIEFKKYKEFIKALGEHNLSQDDLNEIAKDITLIKDEIKLKKALAKYDLNQNQIDS LSKLEFKDHLNISFKALKLVTPLMLEGKKYDEACNELNLKVAINEDKKDFLPAFNETYYKDEVT NPVVLRAIKEYRKVLNALLKKYGKVHKINIELAREVGGSGGSRNLNDTRYIARLVLNYTKDYLD FLPLSDDENTKLNDTQKGSKVHVEAKSGMLTSALRHTWGFSAKDRNNHLHHAIDAVIIAYANNS IVKAFSDFKKEQESNSAELYAKKISELDYKNKRKFFEPFSGFRQKVLDKIDEIFVSKPERKKPS GALHEETFRKEEEFYQSYGGKEGVLKALELGKIRKVNGKIVKNGDMFRVDIFKHKKTNKFYAVP IYTMDFALKVLPNKAVARSKKGEIKDWILMDENYEFCFSLYKDSLILIQTKDMQEPEFVYYNAF TSSTVSLIVSKHDNKFETLSKNQKILFKNANEKEVIAKSIGIQNLKVFEKYIVSALGEVTKAEF RQREDFKKS
SEQ ID NO: 24: All-in-one dCjCas9ΔHNH-based transcription activator in AAV vector (ITR-promoter-HA-NLS-dCjCas9ΔHNH-NLS-VP64-p65-NLS-minipA-U6-SaogRNA-ITR), ITR in lowercase grey highlighting, HA in lowercase italics, NLS in lowercase bold underlining, dCjCas9ΔHNH in uppercase underlining, VP64-p65 in uppercase grey highlighting, minipA in uppercase bold italics, U6 in grey text, SaogRNA in uppercase underlined italics
llgcggcctctagactcgagCGCGTGATGAGAGCAGCCACTACGGGTCTAGGCTGCCCATGTAA GGAGGCAAGGCCTGGGGACACCCGAGATGCCTGGTTATAATTAACCCAGACATGTGGCTGCCCC CCCCCCCCCAACACCTGCTGCCTGCTAAAAATAACCCTGTCCCTGGTGGccctgcatgcccACT CACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCA ACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTA CGGTGGGAGGTCTATATAAGCAGAGCTGGTTTAGTGAACCGTCAGATCcgaccggtgccaccat gtaccca tacga tgttccagattacgcttcgccgaagaaaaagcgcaaggtcgaagcgtccGCC CGCATCCTCGCTTTCGCCATCGGAATCTCTAGTATCGGATGGGCCTTCTCTGAAAACGACGAAC TGAAAGACTGCGGCGTGAGAATCTTCACAAAGGTTGAAAACCCTAAAACAGGCGAGTCTTTAGC TCTGCCACGTAGGTTGGCCCGCTCCGCCCGAAAAAGGCTGGCTCGGCGGAAGGCTCGCCTCAAC CACTTGAAGCATTTGATAGCTAATGAGTTCAAACTGAACTACGAAGATTACCAGTCCTTCGACG AGTCATTGGCAAAAGCCTACAAAGGCAGCCTTATCAGTCCTTATGAGTTGAGATTTCGCGCACT CAACGAACTGCTTTCTAAGCAAGACTTTGCTAGGGTCATTCTGCACATCGCAAAACGGCGAGGT TATGACGATATCAAGAACTCCGACGATAAAGAAAAGGGAGCCATTCTCAAGGCGATCAAACAGA ATGAGGAAAAATTGGCAAACTACCAGAGTGTGGGCGAGTATCTGTATAAAGAGTATTTCCAGAA GTTTAAGGAAAACAGCAAGGAGTTTACAAACGTCAGAAATAAAAAGGAGTCTTACGAGAGATGC ATCGCGCAGTCATTCCTCAAAGATGAGCTGAAGCTGATATTTAAGAAGCAACGCGAATTTGGTT TCTCATTCTCTAAGAAGTTCGAAGAGGAGGTTCTTTCCGTGGCGTTTTACAAGAGGGCGCTCAA AGACTTCTCCCACCTGGTTGGTAACTGTAGTTTCTTCACGGATGAGAAGCGAGCTCCCAAAAAT
TCTCCCCTGGCTTTCATGTTTGTTGCCCTGACTCGGATCATTAACCTGCTGAACAACCTGAAAA AT AC T GAAG G GAT C T T G TAT AC GAAG GAC GAC C T AAAT G C AC T C C T GAAT GAAG T G C T C AAAAA C G GAAC T C T AAC C T AT AAAC AGAC C AAGAAAT T AC TGGGGCTCTCT GAC GAC T AC GAG T T C AAG G G C GAGAAG G G T AC TTATTTTATC GAAT T C AAAAAG T AT AAG GAG T T CAT T AAAG CAT T G G G G G AAC ACAAC C T C AG C C AG GAC GAT C T C AAT GAAAT T G C C AAG GAC AT C AC G C T GAT T AAAGAC GA GATAAAACTGAAAAAGGCACTGGCCAAGTATGACCTCAACCAGAACCAGATCGACTCTCTGTCC AAGCTGGAGTTCAAAGACCACCTAAACATATCCTTCAAAGCCCTGAAACTGGTCACCCCTCTAA TGCTCGAAGGAAAAAAATACGACGAGGCGTGTAAT GAAC T GAAT CTTAAGGTGGCCATCAATGA G GAT AAGAAG GAC T T T C T T C C AG C C T T T AAC GAGAC AT AT T AC AAAGAC GAGG T C AC AAAC C C G GTTGTGCTGAGGGCCATAAAAGAGTATCGGAAGGTTCTGAATGCCCTCCTGAAGAAGTACGGCA AAGTGCACAAAATAAATATCGAATTGGCTAGGGAGGTGGGAGGTAGCGGTGGCTCTAGGAACCT GAAT GAC AC GAG G T AC AT T G C G C GAC T GG T T C T AAAC TAT AC C AAAGAC T AC C T G GAT T T C C T C C C T C T GAG C GAC GAC GAGAAT AC T AAAC T GAAT GAT AC C CAGAAAG G C T C AAAG G T C C AC G T T G AGGCTAAGTCCGGGATGCTGACTAGCGCCCTCCGCCACACGTGGGGCTTCAGCGCCAAAGATCG GAAT AAT CAT C T T CAT C AC G C T AT T GAT G C AG T AAT CAT AG C C T AC G C T AACAAC AG CAT C G T G AAAG CCTTCTCC GAT T T C AAG AAAG AAC AG GAG T C T AAT AG C G C C GAG TTGTACGC C AAG AAAA TTTCCGAATTGGACTATAAAAATAAGAGAAAATTCTTCGAACCCTTCTCCGGGTTTCGCCAAAA GGTCTTAGATAAGATCGACGAGATTTTCGTTTCCAAGCCCGAAAGAAAAAAGCCTTCAGGGGCA C T G C AC GAAGAGAC AT T C C G C AAG GAAGAG GAAT T T T AC C AAT C T T AC G G T GG T AAAGAG G GAG TTCTGAAGGCTCTGGAGCTTGGGAAGATCCGCAAGGTAAACGGGAAAATCGTGAAAAACGGGGA CAT G T T C AG G G T G GAT AT C T T C AAG C ACAAAAAGAC C AAC AAG T T C T AC G C AG T AC C CAT C T AC ACTATGGATTTCGCTTTAAAGGTTCTCCCAAATAAGGCGGTGGCTCGATCGAAGAAAGGAGAGA TCAAGGACTGGATCTTAATGGATGAAAATTACGAGTTTTGCTTCTCGCTCTACAAAGATAGCCT GAT T C T GAT C C AGAC AAAAGAC AT G C AGGAAC CAGAAT TTGTTTAT TAT AAC G C C T T C AC GAG C AG TACAG TGTCCCTGATTGT GAG C AAG CAT GAT AAC AAG T T C GAGAC T C T 
G T C TAAGAAT C AGA AAAT C C T T T T C AAG AAC G C C AAC GAGAAG GAG G T C AT C G C AAAG T C AAT T G G C AT C C AAAAC C T GAAG G T G T T C GAGAAAT AC AT AG T G T C C G C AC T C G G T GAAG T AAC T AAAG C C GAAT T T C GAC AG CGCGAGGATTTTAAGAAAAGCGGATCCGGAGGAGGAGGAAGCGGAGGAGGAGGTAGCli llll
^^^^^^^^^^^^^^^^^^^^GGC T C T GGCAGCGGCAGC||ll|i||l|||i|||!|i
CCCAAAGCCAAGGACCCCACTGTCTCCqqacgtacgtccaaacgcacggctgatggttccgaat ttgagtctccgaagaagaaacgcaaagttgagtaaACTAGTAATAAAaGATCt TATTTTCATT aGATCtGTGTGTTGGTTTTTTGTGTGqqtaccgagggcctatttcccatgattccttcatattt gcatatacgatacaaggctgttagagagataattggaattaatttgactgtaaacacaaagata ttagtacaaaatacgtgacgtagaaagtaataatttcttgggtagtttgcagttttaaaattat gttttaaaatggactatcatatgcttaccgtaacttgaaagtatttcgatttcttggctttata tatcttgtggaaaggacgaaacACCGcGAGACCacggcaGGTCTCaGTTrcAGTCCCTGAAAAG GGACTgAAATAAAGAGTTTGCGGGACTCTGCGGGGTTACAATCCCCTAAAACCGCTTTTTTTqc ggccgcii|||pe

Claims

WHAT IS CLAIMED IS:
1. A system for modulating gene expression, the system comprising:
a modified Cas9 protein, wherein the modified Cas9 protein comprises an HNH domain deletion;
a guide RNA bound to the modified Cas9 protein, wherein the guide RNA comprises a DNA binding sequence; and
at least one effector domain bound to the modified Cas9 protein, the guide RNA, or both.
2. The system of claim 1, wherein the guide RNA is operably linked to the effector domain via an aptamer.
3. The system of either of claims 1 or 2, wherein the effector domain is fused to an aptamer-binding protein.
4. The system of any one of claims 1-3, wherein the at least one effector domain is an activator.
5. The system of any one of claims 1-3, wherein the at least one effector domain is a repressor.
6. The system of any one of claims 1-5, wherein one or more polynucleotides encoding the modified Cas9 protein, the guide RNA, and the effector domain are packaged within an adeno-associated virus (AAV).
7. The system of any one of claims 1-6, wherein the modified Cas9 protein is a modified Streptococcus pyogenes Cas9 protein (SpCas9).
8. The system of claim 7, wherein the nucleic acid sequence of the modified SpCas9 is SEQ ID NO: 3.
9. The system of claim 7, wherein the amino acid sequence of the modified SpCas9 is SEQ ID NO: 4.
10. The system of any one of claims 1-6, wherein the modified Cas9 protein is a modified Staphylococcus aureus Cas9 protein (SaCas9).
11. The system of claim 10, wherein the nucleic acid sequence of the modified SaCas9 is SEQ ID NO: 14.
12. The system of claim 10, wherein the amino acid sequence of the modified SaCas9 is SEQ ID NO: 15.
13. The system of any one of claims 1-6, wherein the modified Cas9 protein is a modified Campylobacter jejuni Cas9 protein (CjCas9).
14. The system of claim 13, wherein the nucleic acid sequence of the modified CjCas9 is SEQ ID NO: 22.
15. The system of claim 13, wherein the amino acid sequence of the modified CjCas9 is SEQ ID NO: 23.
16. The system of any one of claims 1-15, wherein the modified Cas9 protein comprises at least one additional effector domain.
17. The system of any one of claims 1-16, wherein the nucleic acid sequence of the deleted HNH domain is selected from the group consisting of SEQ ID NO: 19, SEQ ID NO: 20, and SEQ ID NO: 21.
18. An adeno-associated virus (AAV) vector system for in vivo delivery of a Cas9-based gene expression modulation system, the AAV vector system comprising:
a first expression cassette comprising a polynucleotide encoding a modified Cas9 protein, the modified Cas9 protein comprising an HNH domain deletion;
a second expression cassette comprising a polynucleotide encoding a guide RNA; and
a third expression cassette comprising a polynucleotide encoding an effector domain;
wherein the first, second, and third expression cassettes are located on one or more adeno-associated virus (AAV) vectors.
19. The AAV vector system of claim 18, wherein the first, second, and third expression cassettes are comprised on two AAV vectors.
20. The AAV vector system of either of claims 18 or 19, wherein the first expression cassette is comprised on a first AAV vector.
21. The AAV vector system of claim 20, wherein the second expression cassette is comprised on a second AAV vector.
22. The AAV vector system of claim 21, wherein the third expression cassette is comprised on one of the first AAV vector or the second AAV vector.
23. The AAV vector system of claim 22, wherein a fourth expression cassette comprising a polynucleotide encoding an additional effector domain is comprised on the other of the first AAV vector or the second AAV vector.
24. The AAV vector system of any one of claims 20-23, wherein the nucleotide sequence of the first AAV vector comprises a sequence selected from the group consisting of SEQ ID NO: 7, SEQ ID NO: 8 and SEQ ID NO: 9.
25. The AAV vector system of any one of claims 21-24, wherein the nucleotide sequence of the second AAV vector comprises a sequence selected from the group consisting of SEQ ID NO: 10 and SEQ ID NO: 11.
26. The AAV vector system of any one of claims 18-25, wherein the effector domain of the third cassette is an activator domain.
27. The AAV vector system of any one of claims 18-25, wherein the effector domain of the third cassette is a repressor domain.
28. The AAV vector system of claim 18, wherein the first, second, and third expression cassettes are comprised on the same adeno-associated virus (AAV) vector.
29. The AAV vector system of either of claims 18 or 28, wherein the effector domain is an activator domain.
30. The AAV vector system of claim 29, wherein the nucleic acid sequence of the vector is selected from the group consisting of SEQ ID NO: 16, SEQ ID NO: 17, and SEQ ID NO: 24.
31. The AAV vector system of either of claims 18 or 28, wherein the effector domain is a repressor domain.
32. The AAV vector system of claim 31, wherein the nucleic acid sequence of the vector is SEQ ID NO: 18.
33. A method of modulating gene expression, the method comprising:
introducing a system for modulating gene expression into a cell, the system comprising:
a modified Cas9 protein, wherein the modified Cas9 protein comprises an HNH domain deletion;
a guide RNA bound to the modified Cas9 protein, wherein the guide RNA comprises a DNA binding sequence; and
an effector domain bound to the modified Cas9 protein or the guide RNA;
hybridizing the DNA binding sequence to a target sequence;
directing the effector domain to the target sequence; and
modulating the transcription of a target gene that is at or adjacent to the target sequence.
34. The method of claim 33, wherein the system is comprised on one or more adeno-associated virus (AAV) vectors.
35. The method of either one of claims 33 or 34, wherein modulating the transcription of the target gene comprises increasing expression levels of the target gene.
36. The method of either one of claims 33 or 34, wherein modulating the transcription of the target gene comprises decreasing expression levels of the target gene.
PCT/US2018/058685 2017-11-01 2018-11-01 Highly compact cas9-based transcriptional regulators for in vivo gene regulation Ceased WO2019089910A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201762580189P 2017-11-01 2017-11-01
US62/580,189 2017-11-01

Publications (1)

Publication Number Publication Date
WO2019089910A1 (en) 2019-05-09

Family

ID=66332325

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2018/058685 Ceased WO2019089910A1 (en) 2017-11-01 2018-11-01 Highly compact cas9-based transcriptional regulators for in vivo gene regulation

Country Status (1)

Country Link
WO (1) WO2019089910A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019235627A1 (en) * 2018-06-08 2019-12-12 株式会社モダリス MODIFIED Cas9 PROTEIN AND USE THEREOF
WO2020085441A1 (en) * 2018-10-24 2020-04-30 株式会社モダリス Modified cas9 protein, and use thereof
WO2021178432A1 (en) * 2020-03-03 2021-09-10 The Board Of Trustees Of The Leland Stanford Junior University Rna-guided genome recombineering at kilobase scale

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140356959A1 (en) * 2013-06-04 2014-12-04 President And Fellows Of Harvard College RNA-Guided Transcriptional Regulation
WO2016196655A1 (en) * 2015-06-03 2016-12-08 The Regents Of The University Of California Cas9 variants and methods of use thereof
US20160362667A1 (en) * 2015-06-10 2016-12-15 Caribou Biosciences, Inc. CRISPR-Cas Compositions and Methods
WO2016205759A1 (en) * 2015-06-18 2016-12-22 The Broad Institute Inc. Engineering and optimization of systems, methods, enzymes and guide scaffolds of cas9 orthologs and variants for sequence manipulation
WO2017049407A1 (en) * 2015-09-23 2017-03-30 Université Laval Modification of the dystrophin gene and uses thereof

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
XU ET AL.: "CRISPR-Mediated Genome Editing Restores Dystrophin Expression and Function in mdx Mice", MOLECULAR THERAPY, vol. 24, no. 3, 5 January 2016 (2016-01-05), pages 564-569, XP055419710, DOI: 10.1038/mt.2015.192 *
ZALATAN ET AL.: "Engineering Complex Synthetic Transcriptional Programs with CRISPR RNA Scaffolds", CELL, vol. 160, no. 1-2, 15 January 2015 (2015-01-15), pages 339-350, XP055278878, DOI: 10.1016/j.cell.2014.11.052 *

Similar Documents

Publication Publication Date Title
JP7469433B2 (en) Mutant CAS9 Protein
JP7555822B2 (en) Compositions and methods for genome editing
JP7036511B2 (en) RNA-induced transcriptional regulation
EP3359677B1 (en) Compositions and methods for treating fragile x syndrome and related syndromes
RU2713328C2 (en) Hybrid dna/rna polynucleotides crispr and methods of appliance
CA2906553C (en) Rna-guided targeting of genetic and epigenomic regulatory proteins to specific genomic loci
CA3219570A1 (en) Compositions and methods for improved protein translation from recombinant circular rnas
WO2017181107A2 (en) Modified cpf1 mrna, modified guide rna, and uses thereof
WO2016205728A1 (en) Crispr mediated recording of cellular events
BR112020026246A2 (en) compositions, systems and amplification methods based on crispr double nickase
AU2019419691B2 (en) Non-replicative transduction particles with one or more non-native tail fibers and transduction particle-based reporter systems
WO2019089910A1 (en) Highly compact cas9-based transcriptional regulators for in vivo gene regulation
WO2020007325A1 (en) Cas9 variants and application thereof
KR20230021081A (en) Compositions and methods for epigenome editing
AU2022292659A1 (en) Systems, methods, and compositions comprising miniature crispr nucleases for gene editing and programmable gene activation and inhibition
EP3277812A1 (en) Novel expression regulating rna-molecules and uses thereof
US20210180053A1 (en) Synthetic rnas and methods of use
Meng et al. Engineered Cas12j‐8 is a Versatile Platform for Multiplexed Genome Modulation in Mammalian Cells
CN117062912A (en) Fusion proteins for CRISPR-based transcriptional repression
JP2019502394A (en) Mini-III RNase, method for changing specificity of RNA sequence cleavage by Mini-III RNase, and use thereof
WO2023232024A1 (en) System and methods for duplicating target fragments
CN103194450A (en) Method for increasing activity of myogenin (MyoG) gene promoter
RU2791447C1 (en) DNA CUTTER BASED ON THE ScCas12a PROTEIN FROM THE BACTERIUM SEDIMENTISPHAERA CYANOBACTERIORUM
EP4198124A1 (en) Engineered cas9-nucleases and method of use thereof
US20250027134A1 (en) Screening of cas nucleases for altered nuclease activity

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 18874906

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 18874906

Country of ref document: EP

Kind code of ref document: A1