WO2018001858A1 - Tailored assembly of a modular BuD polypeptide - Google Patents
Tailored assembly of a modular BuD polypeptide
- Publication number
- WO2018001858A1 (PCT/EP2017/065392)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- seq
- sequence
- dna
- fragment
- fragments
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/64—General methods for preparing the vector, for introducing it into the cell or for selecting the vector-containing host
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/1034—Isolating an individual clone by screening libraries
- C12N15/1093—General methods of preparing gene libraries, not provided for in other subgroups
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B40/00—Libraries per se, e.g. arrays, mixtures
- C40B40/04—Libraries containing only organic compounds
- C40B40/10—Libraries containing peptides or polypeptides, or derivatives thereof
Definitions
- The present invention relates to methods for assembling modular DNA binding polypeptides. Further disclosed are vectors encoding modular DNA binding polypeptides and their use in gene editing.
- Precise genome editing, meaning the ability to modify specific regions of the genome to provide new functions to a cell or to correct malfunctions, has always been a very appealing topic.
- Scientists are now looking with rising interest into the possibility of generating modified organisms such as bacteria, crops or livestock to face global issues such as coping with the scarcity of natural resources and finding treatments for, e.g., cancer, neurodegenerative and monogenic diseases.
- Precise genome editing can be used, together with the available technology of reprogrammed stem cells, in personalized gene therapy. In an approach called ex vivo gene therapy, adult cells carrying a genetic disorder are first reprogrammed into stem cells, then using precise genome editing, the genetic mutation is corrected, and finally, the now healthy cells are reinserted into the host.
- Precise genome editing can also be used to generate animal models that allow studying metabolic pathways, and even for the production of vaccines for diseases such as HIV.
- Genomes contain from a few up to billions of nucleotide sequences.
- The key to precise genome editing is to identify and mark the one sequence that needs to be modified among the large number of sequences present, and to avoid off-target effects that can be caused by editing sequences at other locations. Once the sequence is marked, it is possible to introduce the desired changes according to the intended application.
- The important player in precise genome editing is thus the molecule that is able to recognize and bind a specific sequence amongst many other unwanted sequences.
- BurrH a modular DNA binding protein called BurrH (Juillerat A., Bertonati C, Dubois G. et al. (2014) BurrH: a new modular DNA binding protein for genome engineering. Sci Rep.
- BurrH proved to be quite specific in the context of large genomes, meaning that it was able to recognize and bind just the target sequence. In addition, BurrH has several advantages over other emerging genome editing tools such as the Cas9/CRISPR and Cpf1 systems: it has higher specificity, it can be more easily fused with other peptides and enzymatic activities, and it is smaller and therefore easier to transfect into a host cell.
- the present disclosure relates to a new, flexible and cost effective method to assemble BurrH-derived polypeptides engineered to recognize specific target sequences.
- the method is based on the use of Ligase Independent Cloning (LIC) and a library of DNA fragments designed to be assembled using LIC.
- LIC Ligase Independent Cloning
- One aspect of the present disclosure relates to a method for assembling a vector encoding a BurrH-derived modular DNA binding polypeptide comprising or consisting of a BuD array, wherein the BuD array is a DNA binding domain comprising or consisting of 18 DNA binding sub-domains called BuD, each sub-domain
- each sub-library consists of 16 different DNA fragments, each fragment consisting of between 150 and 300 nucleotides and encoding two consecutive DNA binding sub-domains of said BuD array;
- each fragment comprises at least three conserved regions, CR, and at least two variable regions, VR, positioned in an alternating manner, as depicted in formula I:
- n is an integer and is equal to at least 1
- each of the at least three conserved regions is identical for all 16 fragments of the same sub-library
- variable regions being independently selected from the group consisting of SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO: 31 , SEQ ID NO:
- the 3' sequence of the DNA fragment selected from the sub-library Gn is operably linked to the 5' sequence of the DNA fragment selected from the sub-library Gn+1; thereby assembling a vector encoding the BurrH-derived modular DNA binding polypeptide that recognizes an 18 nucleotide-long DNA target sequence.
- Another aspect of the present disclosure relates to a method for production of a BurrH-derived modular DNA binding polypeptide that recognizes an 18 nucleotide-long target DNA sequence comprising the steps of:
- a vector encoding the modular DNA binding polypeptide, and ii. expressing said final vector in a eukaryotic host cell, thereby producing said DNA binding polypeptide.
- Another aspect of the present disclosure relates to a method for editing a target nucleic acid sequence using a BuD array, comprising transfecting or transducing a eukaryotic host cell with a final vector assembled according to the methods disclosed herein.
- Another aspect of the present disclosure relates to a library comprising the 144 DNA fragments of SEQ ID NO:1-9, wherein the DNA fragments are suitable for LIC, wherein each of SEQ ID NO:1-9 is a consensus sequence for 16 unique sequences.
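The library size of 144 follows from the design described above: 9 sub-libraries, each holding 16 fragments that encode two consecutive BuDs. The Python sketch below reproduces that count under the assumption that the 16 fragments of a sub-library correspond to the 4 × 4 combinations of two single-nucleotide specificities (A, C, G, T). The fragment labels such as `G1_AC` are hypothetical names used for illustration only and do not appear in the disclosure.

```python
from itertools import product

NUCLEOTIDES = "ACGT"    # assumption: one variable region per recognized base
N_SUB_LIBRARIES = 9     # G1..G9, two BuDs per fragment -> 18 BuDs in total


def build_library():
    """Return {sub-library name: list of hypothetical fragment labels}."""
    library = {}
    for n in range(1, N_SUB_LIBRARIES + 1):
        name = f"G{n}"
        # One fragment per ordered pair of recognized nucleotides.
        library[name] = [f"{name}_{a}{b}" for a, b in product(NUCLEOTIDES, repeat=2)]
    return library


library = build_library()
fragments_per_sub_library = len(library["G1"])
total_fragments = sum(len(frags) for frags in library.values())
```

Under these assumptions, `fragments_per_sub_library` is 16 and `total_fragments` is 144, matching the numbers stated in the text.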
- Another aspect of the present disclosure relates to a fragment comprising at least three conserved regions, CR, and at least two variable regions, VR, positioned in an alternating manner, as depicted in formula I:
- n is an integer and is equal to at least 1, and
- CRa is a nucleotide sequence selected from the group consisting of SEQ ID NO:86, SEQ ID NO:87, SEQ ID NO:88, SEQ ID NO:89, SEQ ID NO:90, SEQ ID NO:91, SEQ ID NO:92, SEQ ID NO:93 and SEQ ID NO:94, being a non-palindromic sequence and encoding part of a DNA binding domain;
- CRb is a nucleotide sequence selected from the group consisting of SEQ ID NO:95, SEQ ID NO:96, SEQ ID NO:97, SEQ ID NO:98, SEQ ID NO:99, SEQ ID NO:100, SEQ ID NO:101, SEQ ID NO:102, and SEQ ID NO:103, encoding part of a DNA binding domain;
- CRc is a nucleotide sequence selected from the group consisting of SEQ ID NO:104, SEQ ID NO:105, SEQ ID NO:106, SEQ ID NO:107, SEQ ID NO:108, SEQ ID NO:109, SEQ ID NO:110, SEQ ID NO:111 and SEQ ID NO:112, being a non-palindromic sequence and encoding part of a DNA binding domain; and
- VRa and VRb are a nucleotide sequence independently selected from the group consisting of SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34, SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:54, SEQ ID NO:55, SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ
- Another aspect of the present disclosure relates to a polynucleotide encoding a BurrH-derived modular DNA binding polypeptide capable of recognizing an 18 nucleotide-long DNA target sequence, the polynucleotide comprising or consisting of SEQ ID NO:206.
- Another aspect of the present disclosure relates to a vector encoding a BurrH-derived modular DNA binding polypeptide comprising or consisting of a BuD array, wherein the BuD array is a DNA binding domain, wherein said BuD array comprises 18 DNA binding sub-domains called BuD, each sub-domain independently recognising one nucleotide, said vector being assembled according to the method comprising the steps of:
- each sub-library consists of 16 DNA fragments, each fragment consisting of between 150 and 300 nucleotides and encoding two consecutive DNA binding sub-domains of the BuD array;
- the 3' sequences of the fragments of one library Gn are compatible with the 5' sequences of the fragments of the library Gn+1,
- each fragment comprises at least three conserved regions and at least two variable regions, as depicted in formula I:
- n is an integer and is equal to at least 1,
- each of the at least three conserved regions is identical for all 16 fragments of the same sub-library
- the at least two variable regions being independently selected from the group consisting of SEQ ID NO:28, SEQ ID NO:41, SEQ ID NO:44 and SEQ ID NO:46; the combination of at least two variable regions of any one fragment being different from the combination of at least two variable regions of any one of the other fragments of the same library,
- BuD array is a DNA binding domain comprising or consisting of 18 DNA binding sub-domains called BuD, each sub-domain independently recognising one nucleotide, the BurrH-derived modular DNA binding polypeptide encoded by a polynucleotide disclosed herein.
- kits of parts comprising:
- iii optionally comprising instructions for assembling a vector encoding a BurrH-derived modular DNA binding polypeptide.
- compositions comprising the fragment as disclosed herein, a polynucleotide as disclosed herein, a vector as disclosed herein or a BurrH-derived modular DNA binding polypeptide as disclosed herein for use as a medicament.
- compositions comprising the fragment as disclosed herein, a polynucleotide as disclosed herein, a vector as disclosed herein or a BurrH-derived modular DNA binding polypeptide as disclosed herein for use in treatment of a disorder in a subject in need thereof.
- compositions comprising the fragment as disclosed herein, a polynucleotide as disclosed herein, a vector as disclosed herein or a BurrH-derived modular DNA binding polypeptide as disclosed herein for use in gene therapy in a subject in need thereof.
- compositions comprising the fragment as disclosed herein, a polynucleotide as disclosed herein, a vector as disclosed herein or a BurrH-derived modular DNA binding polypeptide as disclosed herein for use in treatment of a disorder in a subject in need thereof, said disorder being selected from the group of cancer, neurodegenerative diseases, autoimmune diseases and monogenic diseases.
- Another aspect of the present disclosure relates to the use of a fragment as disclosed herein, a polynucleotide as disclosed herein, a vector as disclosed herein, a BurrH-derived modular DNA binding polypeptide as disclosed herein or a kit of parts as disclosed herein for editing a nucleotide sequence and/or the genome of a host cell.
- Another aspect of the present disclosure relates to the use of a fragment as disclosed herein, a polynucleotide as disclosed herein, a vector as disclosed herein, a BurrH-derived modular DNA binding polypeptide as disclosed herein or a kit of parts as disclosed herein for the production of a genetically modified organism, such as a microorganism, a plant or a mammal.
- Figure 1 Assembly of a vector comprising 9 specific DNA fragments, each encoding 2 consecutive domains of BurrH, that target the DNA sequence 5'-GGATACACCACTACGATG-3'.
- FIG. 1 Details of the ligase independent cloning (LIC) sites used to assemble the intermediate vectors (IV1, IV2 and IV3) and the final vector (FV).
- LIC7, LIC8, LIC9, LIC10: the sites designed for assembling IV3.
- LIC1, LIC4, LIC7, LIC10: the sites designed for assembling the 3 intermediate hexamers from IV1, IV2 and IV3 into FV.
- Figure 3 Details of the restriction enzymes sites and of the LIC sites used for cutting the 3 intermediate hexamers from their respective IV and to assemble them into the FV.
- FIG. 1 Star cassettes in which the FV is incorporated. Each star cassette comprises sites for restriction enzymes, N-terminus and C-terminus domains of the array.
- Figure 7 Scheme of the functional vector comprising a BuD array and the transcription repressor KRAB.
- Figure 8 Scheme of the functional vector comprising a BuD array and the transcription repressor SID.
- Figure 9 Scheme of the functional vector comprising a BuD array and the transcription activation domain VP16.
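The Figure 1 example targets an 18-nucleotide sequence with 9 fragments, each covering 2 consecutive nucleotides. A minimal Python sketch of how such a target could be split into the 9 dinucleotides that drive fragment selection is given here. It assumes that sub-library G1 covers the two nucleotides at the 5' end and G9 the two at the 3' end; the function name `split_target` is hypothetical.

```python
def split_target(target):
    """Split an 18-nt target into 9 dinucleotides keyed by sub-library G1..G9.

    Assumption: G1 reads the first two nucleotides (5' end), G9 the last two.
    """
    target = target.upper()
    if len(target) != 18 or set(target) - set("ACGT"):
        raise ValueError("expected an 18-nucleotide A/C/G/T sequence")
    return {f"G{i // 2 + 1}": target[i:i + 2] for i in range(0, 18, 2)}


# Target sequence taken from the Figure 1 caption.
selection = split_target("GGATACACCACTACGATG")
```

Each value in `selection` names the dinucleotide whose matching fragment would be picked from the corresponding sub-library.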
- BurrH-derived modular DNA binding polypeptide or “modular DNA binding polypeptide” or “DNA binding polypeptide” as used herein refers to a protein comprising a BuD array, as defined below.
- the BurrH-derived modular DNA binding protein is a modified version of the polypeptide BurrHI (uniprot number E5AV36).
- the BurrH-derived modular DNA binding polypeptide may consist of a BuD array.
- the BurrH-derived modular DNA binding polypeptide may also comprise an N-terminal domain and a C-terminal domain as defined below.
- BuD or “sub-domain” as used herein refers to an amino acid or an amino acid sequence that can recognize and bind one nucleotide for example via non-covalent bonds or covalent bonds.
- BuD array or “DNA binding domain” or “binding domain” as used herein refers to at least 2 BuD or sub-domains assembled in sequence. A BuD array is therefore capable of recognizing and binding at least 2 consecutive nucleotides, which may be identical or different. In some embodiments of the present disclosure, a BuD array comprises 18 consecutive BuD and binds an 18-nucleotide-long fragment.
- C-terminal domain or “C-terminal array” as used herein refers to a sequence consisting of between 1 and 170 amino acids comprised in an amino acid chain or polypeptide and ending with a free carboxyl group (-COOH).
- codon refers to a triplet of adjacent nucleotides that codes for a specific amino acid.
- compatible refers to two sequences, i.e. two nucleotide sequences, that are able to interact with each other, for example via non-covalent bonds or covalent bonds, or that can be assembled via LIC.
- single stranded overhang sequences generated via digestion with a 3' - 5' exonuclease activity, such as a T4 DNA polymerase, and being complementary to each other are compatible sequences.
- Compatible sequences can bind to each other without the help of ligase enzymes.
- DNA fragment or “fragment” as used herein refers to a nucleotide sequence. Throughout this application, DNA fragments are designated starting from the 5'-end.
- conserved region or “CR” as used herein refers to a nucleotide sequence part of a longer nucleotide sequence or fragment which is identical for all nucleotide sequences or fragments comprised in the same sub-library. For example, all the DNA fragments of one sub-library are characterized by having identical conserved regions.
- the term “Ligase independent cloning” or “Ligation independent cloning” or “LIC” as used herein refers to a method to assemble nucleic acid fragments without using ligase enzymes.
- the method comprises digesting two compatible double-stranded nucleic acid fragments with a polymerase, such as the T4 DNA polymerase, in the presence of only one nucleotide, so that each fragment is left with a single-stranded overhang sequence at the 5' end.
- the two nucleic acid fragments comprise a terminal non-palindromic sequence of at least 8 nucleotides and are designed in such a way that the blunt or sticky ends thus generated will be compatible and therefore bind to each other, thereby linking the two fragments.
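The digestion step can be illustrated with a small simulation. The sketch below assumes the commonly described LIC behavior in which the 3'-5' exonuclease activity trims each strand from its 3' end until it reaches a base matching the single nucleotide supplied in the reaction; that stopping rule, the example sequence and the function names are illustrative assumptions and are not taken from the disclosure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def chew_back(strand, dntp):
    """Trim a 5'->3' strand from its 3' end until a base equal to `dntp` is reached."""
    end = len(strand)
    while end > 0 and strand[end - 1] != dntp:
        end -= 1
    return strand[:end]


def five_prime_overhang(top, dntp):
    """Single-stranded 5' overhang left at the start of the top strand.

    The 3' end of the bottom strand pairs with the 5' end of the top strand,
    so every base trimmed from the bottom strand exposes one top-strand base.
    """
    bottom = reverse_complement(top)
    kept = chew_back(bottom, dntp)
    return top[:len(bottom) - len(kept)]


# Hypothetical 12-bp duplex end, digested in the presence of dGTP only.
overhang = five_prime_overhang("TTGGAACCAGTC", "G")
```

Two fragments would then be compatible in the sense used above when the overhang of one is the reverse complement of the overhang of the other.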
- N-terminal domain refers to a sequence consisting of between 1 and 180 amino acids comprised in an amino acid chain or polypeptide and ending with a free amine group (-NH2).
- the N-terminal domain may also comprise a Nuclear Localization Sequence (NLS) and Human influenza hemagglutinin (HA) epitope for protein localization.
- HA Human influenza hemagglutinin
- palindromic refers to a double-stranded nucleotide sequence that reads the same from the 5' end to the 3' end of one strand as from the 5' end to the 3' end of the complementary strand.
- non-palindromic refers to a double-stranded nucleotide sequence that reads differently from the 5' end to the 3' end of one strand than from the 5' end to the 3' end of the complementary strand.
- Non-palindromic nucleotides are for example used for assembling nucleic acid fragments through LIC.
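The definition amounts to asking whether a sequence equals its own reverse complement. A minimal Python check is sketched below; the function name `is_palindromic` and the test sequences are illustrative and not part of the disclosure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_palindromic(seq):
    """True if the sequence reads the same 5'->3' on both strands."""
    seq = seq.upper()
    # The complementary strand read 5'->3' is the reverse complement.
    return seq == seq.translate(COMPLEMENT)[::-1]


palindromic_example = is_palindromic("GAATTC")      # a well-known palindromic site
non_palindromic_example = is_palindromic("GGATAC")  # first 6 nt of the Figure 1 target
```

A terminal sequence intended for LIC would be expected to return `False` here, since a palindromic end could anneal to another copy of itself.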
- a nucleic acid sequence encoding one or more polypeptides of interest and one or more transcriptional regulatory sequences are said to be "operably linked" when they are connected in such a way as to permit expression of the nucleic acid sequence when introduced into a cell.
- amino acids are named herein using their 1-letter code according to the recommendations from IUPAC, see for example http://www.chem.qmw.ac.uk/iupac. If nothing else is specified an amino acid may be of D or L-form.
- sequence identity refers to two polynucleotide sequences that are identical (i.e., on a nucleotide-by-nucleotide basis) over the window of comparison.
- percentage of sequence identity is calculated by comparing two optimally aligned sequences over the window of comparison, determining the number of positions at which the identical nucleic acid base (e.g., A, T, C, G, U, or I) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison (i.e., the window size), and multiplying the result by 100 to yield the percentage of sequence identity.
- a degree of identity of amino acid sequences is a function of the number of identical amino acids at positions shared by the amino acid sequences.
- a degree of homology or similarity of amino acid sequences is a function of the number of similar, i.e. structurally related, amino acids at positions shared by the amino acid sequences.
- the global percentage of sequence identity is determined with the algorithm GAP, BESTFIT, or FASTA in the Wisconsin Genetics Software Package Release 7.0, using default gap weights.
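The percentage calculation described above can be written in a few lines. The Python sketch below assumes the two sequences have already been optimally aligned to equal length and that the window of comparison is the full aligned length; it does not perform the alignment itself, and it is not the GAP, BESTFIT or FASTA implementation. The function name `percent_identity` is hypothetical.

```python
def percent_identity(a, b):
    """Percentage of matching positions between two pre-aligned sequences.

    Assumes equal length and that the comparison window is the whole alignment.
    """
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    # Matched positions divided by window size, multiplied by 100.
    return 100.0 * matches / len(a)


identity = percent_identity("GGATACAC", "GGATGCAC")  # 7 of 8 positions match
```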
- variable region refers to a nucleotide sequence part of a longer nucleotide sequence or part of a DNA fragment which may be different for all nucleotide sequences or fragments. For example, all the DNA fragments of one library are characterized by having variable regions that may differ.
- gene therapy refers to the insertion of genes into a subject's cells and tissues to treat a disease.
- sub-vector refers to a genetic construct configured for easy insertion and excision of a nucleic acid sequence, for example via digestion with restriction enzymes and/or LIC.
- Nuclear Localisation Sequence (NLS) refers to an amino acid sequence which 'tags' a protein for import into the cell nucleus by nuclear transport. Typically, this signal consists of one or more short sequences of positively charged lysines or arginines exposed on the protein surface. Different nuclear localized proteins may share the same NLS. An NLS has the opposite function of a nuclear export signal, which targets proteins out of the nucleus.
- the present disclosure relates to a new, flexible and cost effective method to assemble BurrH-derived polypeptides engineered to recognize specific target sequences.
- the method is based on the use of Ligase Independent Cloning (LIC) and a library of DNA fragments designed to be assembled using LIC.
- BurrH-derived polypeptides and BuD array assembled with the methods of the present invention can be fused with an additional peptide having enzymatic activity and used for gene editing.
- One aspect of the present disclosure relates to a method for assembling a vector encoding a BurrH-derived modular DNA binding polypeptide comprising or consisting of a BuD array, wherein the BuD array is a DNA binding domain comprising or consisting of 18 DNA binding sub-domains called BuD, each sub-domain
- each sub-library consists of 16 different DNA fragments, each fragment consisting of between 150 and 300 nucleotides and encoding two consecutive DNA binding sub-domains of said BuD array;
- the 3' sequences of all the fragments of one sub-library are identical, and the 3' sequences of the fragments of one sub-library Gn are compatible with the 5' sequences of the fragments of the sub-library Gn+1 ,
- each fragment comprises at least three conserved regions (CR) and at least two variable regions (VR), as depicted in formula I:
- n is an integer and is equal to at least 1,
- each of the at least three conserved regions is identical for all 16 fragments of the same sub-library
- variable regions being independently selected from the group consisting of SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO: 31 , SEQ ID NO:
- each fragment comprises or consists of the three conserved regions CRa, CRb and CRc and the two variable regions VRa and VRb, and wherein CRa is identical for all 16 fragments of the same sub-library, CRb is identical for all 16 fragments of the same sub-library and CRc is identical for all 16 fragments of the same sub-library.
- the method further comprises verifying the sequence or testing the binding of the encoded polypeptide to the target sequence.
- the sequence of the encoded polypeptide can be verified, for example by sequencing the nucleotide sequence encoding the polypeptide, and the binding of the encoded polypeptide to a target sequence can be tested using techniques that detect binding, such as size exclusion chromatography, surface plasmon resonance (SPR) or isothermal titration calorimetry (ITC).
- the vector further comprises a sequence encoding the N-terminal domain and a sequence encoding the C-terminal domain of the BurrH-derived modular DNA binding polypeptide.
- the sequence encoding the N-terminal domain is positioned upstream of the sequence encoding the BuD array; similarly, the sequence encoding the C-terminal domain is positioned downstream of the sequence encoding the BuD array, so that when the vector is expressed, a functional BurrH-derived modular DNA binding polypeptide is produced.
- each intermediate vector comprises some of the 9 selected fragments.
- the method may involve the assembly of two intermediate vectors, wherein the first intermediate vector comprises 4 of the 9 selected fragments, and the second intermediate vector comprises the other 5 of the 9 selected fragments, or the first intermediate vector comprises 3 of the 9 selected fragments and the second intermediate vector comprises the other 6 of the 9 selected fragments.
- the method further comprises the step of assembling via LIC the 9 selected DNA fragments in 3 sub-vectors, SV1, SV2 and SV3, wherein each intermediate vector comprises 3 fragments, prior to assembling the 9 selected DNA fragments in a vector.
- Each of the 3 sub-vectors comprises 2 non-palindromic sequences and at least two restriction enzyme recognition sites.
- each of the 3 sub-vectors comprises 2 non-palindromic sequences wherein:
- the first non-palindromic sequence of SV1 is compatible with the 5'-CR sequence of the DNA fragments of G1 and the second non-palindromic sequence of SV1 is compatible with the 3'-CR sequence of the DNA fragments of G3;
- the first non-palindromic sequence of SV2 is compatible with the 5'-CR sequence of the DNA fragments of G4 and the second non-palindromic sequence of SV2 is compatible with the 3'-CR sequence of the DNA fragments of G6;
- the first non-palindromic sequence of SV3 is compatible with the 5'-CR sequence of the DNA fragments of G7 and the second non-palindromic sequence of SV3 is compatible with the 3'-CR sequence of the DNA fragments of G9.
- each sub-vector comprises 2 non-palindromic sequences wherein:
- the first non-palindromic sequence of SV1 comprises or consists of SEQ ID NO:
- the first non-palindromic sequence of SV2 comprises or consists of SEQ ID NO:
- the second non-palindromic sequence of SV2 comprises or consists of SEQ ID NO: 186;
- the first non-palindromic sequence of SV3 comprises or consists of SEQ ID NO:
- the second non-palindromic sequence of SV3 comprises or consists of SEQ ID NO: 27.
- the method therefore comprises assembling 3 sub-vectors wherein:
- SV1 comprises 3 DNA fragments independently selected from each of the DNA sub-libraries G1, G2 and G3;
- SV2 comprises 3 DNA fragments independently selected from each of the DNA sub-libraries G4, G5 and G6; and
- SV3 comprises 3 DNA fragments independently selected from each of the DNA sub-libraries G7, G8 and G9.
- a fragment selected from G1 is assembled with a fragment selected from G2 and a fragment selected from G3 in a first intermediate vector;
- a fragment selected from G4 is assembled with a fragment selected from G5 and a fragment selected from G6 in a second intermediate vector;
- a fragment selected from G7 is assembled with a fragment selected from G8 and a fragment selected from G9 in a third intermediate vector.
- the fragments assembled in each of the three intermediate vectors are assembled, for example via LIC and/or digestion with restriction enzymes, in a further vector which will so comprise the 9 fragments, each selected from one of the 9 sub-libraries.
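The two-stage assembly described above (three fragments per intermediate vector, then three intermediate vectors into one further vector) can be summarized as a grouping step. The Python sketch below only models the ordering and grouping of fragments, not the cloning chemistry; the function name `plan_assembly` and the fragment labels are hypothetical.

```python
def plan_assembly(selection):
    """Group 9 selected fragments (keyed G1..G9) into three intermediate vectors.

    Returns (intermediates, final): a list of three 3-fragment groups, and the
    flat 9-fragment order of the further vector that combines them.
    """
    order = [f"G{n}" for n in range(1, 10)]
    missing = [g for g in order if g not in selection]
    if missing:
        raise ValueError(f"no fragment selected for: {', '.join(missing)}")
    ordered = [selection[g] for g in order]
    # G1-G3, G4-G6 and G7-G9 go into the first, second and third intermediate vector.
    intermediates = [ordered[i:i + 3] for i in range(0, 9, 3)]
    final = [fragment for group in intermediates for fragment in group]
    return intermediates, final


# Hypothetical labels: one fragment picked from each sub-library.
picked = {f"G{n}": f"G{n}_frag" for n in range(1, 10)}
intermediates, final = plan_assembly(picked)
```

Keeping the groups in G1 to G9 order reflects the requirement that each fragment is followed by the fragment selected from the next sub-library.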
- a further aspect of the present disclosure relates to a fragment comprising at least three conserved regions, CR, and at least two variable regions, VR, positioned in an alternating manner, as depicted in formula I:
- n is an integer and is equal to at least 1, and
- CRa is a nucleotide sequence selected from the group consisting of SEQ ID NO:86, SEQ ID NO:87, SEQ ID NO:88, SEQ ID NO:89, SEQ ID NO:90, SEQ ID NO:91, SEQ ID NO:92, SEQ ID NO:93, SEQ ID NO:94;
- CRb is a nucleotide sequence selected from the group consisting of SEQ ID NO:95, SEQ ID NO:96, SEQ ID NO:97, SEQ ID NO:98, SEQ ID NO:99, SEQ ID NO:100, SEQ ID NO:101, SEQ ID NO:102, SEQ ID NO:103;
- CRc is a nucleotide sequence selected from the group consisting of SEQ ID NO:104, SEQ ID NO:105, SEQ ID NO:106, SEQ ID NO:107, SEQ ID NO:108, SEQ ID NO:109, SEQ ID NO:110, SEQ ID NO:111, SEQ ID NO:112; and
- VRa and VRb are a nucleotide sequence independently selected from the group consisting of SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34, SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:54, SEQ ID NO:55, SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ
- the DNA fragment comprises the three conserved regions CRa, CRb and CRc and the two variable regions VRa and VRb, wherein CRa is a nucleotide sequence of SEQ ID NO:86, CRb is a nucleotide sequence of SEQ ID NO:95 and CRc is a nucleotide sequence of SEQ ID NO:104, being a DNA fragment of the sub-library G1 and so that the whole DNA fragment encodes 2 consecutive binding domains recognizing and binding 2 consecutive nucleotides on a target nucleotide sequence.
- the DNA fragment comprises the three conserved regions CRa, CRb and CRc and the two variable regions VRa and VRb, wherein CRa is a nucleotide sequence of SEQ ID NO:87, CRb is a nucleotide sequence of SEQ ID NO:96 and CRc is a nucleotide sequence of SEQ ID NO:105, being a DNA fragment of the sub-library G2 and so that the whole DNA fragment encodes 2 consecutive binding domains recognizing and binding 2 consecutive nucleotides on a target nucleotide sequence.
- the DNA fragment comprises the three conserved regions CRa, CRb and CRc and the two variable regions VRa and VRb, wherein CRa is a nucleotide sequence of SEQ ID NO:88, CRb is a nucleotide sequence of SEQ ID NO:97 and CRc is a nucleotide sequence of SEQ ID NO:106, being a DNA fragment of the sub-library G3 and so that the whole DNA fragment encodes 2 consecutive binding domains recognizing and binding 2 consecutive nucleotides on a target nucleotide sequence.
- the DNA fragment comprises the three conserved regions CRa, CRb and CRc and the two variable regions VRa and VRb, wherein CRa is a nucleotide sequence of SEQ ID NO:89, CRb is a nucleotide sequence of SEQ ID NO:98 and CRc is a nucleotide sequence of SEQ ID NO:107, being a DNA fragment of the sub-library G4 and so that the whole DNA fragment encodes 2 consecutive binding domains recognizing and binding 2 consecutive nucleotides on a target nucleotide sequence.
- the DNA fragment comprises the three conserved regions CRa, CRb and CRc and the two variable regions VRa and VRb, wherein CRa is a nucleotide sequence of SEQ ID NO:90, CRb is a nucleotide sequence of SEQ ID NO:99 and CRc is a nucleotide sequence of SEQ ID NO:108, being a DNA fragment of the sub-library G5 and so that the whole DNA fragment encodes 2 consecutive binding domains recognizing and binding 2 consecutive nucleotides on a target nucleotide sequence.
- the DNA fragment comprises the three conserved regions CRa, CRb and CRc and the two variable regions VRa and VRb, wherein CRa is a nucleotide sequence of SEQ ID NO:91, CRb is a nucleotide sequence of SEQ ID NO:100 and CRc is a nucleotide sequence of SEQ ID NO:109, being a DNA fragment of the sub-library G6 and so that the whole DNA fragment encodes 2 consecutive binding domains recognizing and binding 2 consecutive nucleotides on a target nucleotide sequence.
- the DNA fragment comprises the three conserved regions CRa, CRb and CRc and the two variable regions VRa and VRb, wherein CRa is a nucleotide sequence of SEQ ID NO:92, CRb is a nucleotide sequence of SEQ ID NO:101 and CRc is a nucleotide sequence of SEQ ID NO:110, being a DNA fragment of the sub-library G7 and so that the whole DNA fragment encodes 2 consecutive binding domains recognizing and binding 2 consecutive nucleotides on a target nucleotide sequence.
- the DNA fragment comprises the three conserved regions CRa, CRb and CRc and the two variable regions VRa and VRb, wherein CRa is a nucleotide sequence of SEQ ID NO:93, CRb is a nucleotide sequence of SEQ ID NO:102 and CRc is a nucleotide sequence of SEQ ID NO:111, being a DNA fragment of the sub-library G8 and so that the whole DNA fragment encodes 2 consecutive binding domains recognizing and binding 2 consecutive nucleotides on a target nucleotide sequence.
- the DNA fragment comprises the three conserved regions CRa, CRb and CRc and the two variable regions VRa and VRb, wherein CRa is a nucleotide sequence of SEQ ID NO:94, CRb is a nucleotide sequence of SEQ ID NO:103 and CRc is a nucleotide sequence of SEQ ID NO:112, being a DNA fragment of the sub-library G9 and so that the whole DNA fragment encodes 2 consecutive binding domains recognizing and binding 2 consecutive nucleotides on a target nucleotide sequence.
- each variable region is independently selected from the group consisting of SEQ ID NO: 28, SEQ ID NO: 41, SEQ ID NO: 44, SEQ ID NO: 46.
- the DNA fragments of the present disclosure are characterized by having non-palindromic terminal ends, which make it possible to assemble them via LIC.
- the non-palindromic terminal ends of the DNA fragments of a sub-library are identical, whereas the non-palindromic terminal ends of DNA fragments of different sub-libraries are different from each other.
- the different non-palindromic terminal ends are designed in a way that allows assembling of the DNA fragments via LIC in a certain order, so that a DNA fragment selected from G1 is followed by a DNA fragment selected from G2, which is followed by a DNA fragment selected from G3, which is followed by a DNA fragment selected from G4, which is followed by a DNA fragment selected from G5, which is followed by a DNA fragment selected from G6, which is followed by a DNA fragment selected from G7, which is followed by a DNA fragment selected from G8, which is followed by a DNA fragment selected from G9.
- the 9 sub-libraries have the following characteristics
- each DNA fragment of G1 comprises a 5' non-palindromic sequence of SEQ ID NO:10 and a 3' non-palindromic sequence of SEQ ID NO:11;
- each DNA fragment of G2 comprises a 5' non-palindromic sequence of SEQ ID NO:12 and a 3' non-palindromic sequence of SEQ ID NO:13;
- each DNA fragment of G3 comprises a 5' non-palindromic sequence of SEQ ID NO:14 and a 3' non-palindromic sequence of SEQ ID NO:15;
- each DNA fragment of G4 comprises a 5' non-palindromic sequence of SEQ ID NO:16 and a 3' non-palindromic sequence of SEQ ID NO:17;
- each DNA fragment of G5 comprises a 5' non-palindromic sequence of SEQ ID NO:18 and a 3' non-palindromic sequence of SEQ ID NO:19;
- each DNA fragment of G6 comprises a 5' non-palindromic sequence of SEQ ID NO:20 and a 3' non-palindromic sequence of SEQ ID NO:21;
- each DNA fragment of G7 comprises a 5' non-palindromic sequence of SEQ ID NO:22 and a 3' non-palindromic sequence of SEQ ID NO:23;
- each DNA fragment of G8 comprises a 5' non-palindromic sequence of SEQ ID NO:24 and a 3' non-palindromic sequence of SEQ ID NO:25;
- each DNA fragment of G9 comprises a 5' non-palindromic sequence of SEQ ID NO:26 and a 3' non-palindromic sequence of SEQ ID NO:27.
- each DNA fragment further comprises at least two restriction enzyme recognition sites, at least one in the conserved region CRa and at least one in the conserved region CRc, so that the fragments can for example be excised from a vector or genetic construct and inserted in another vector or genetic construct.
- non-palindromic terminal end sequences, in addition to facilitating LIC, may also encode parts of the DNA binding domain of the present disclosure.
- said non-palindromic 5' and 3' sequences are comprised in the conserved regions.
- the 5' non-palindromic sequences are comprised in the 5'-CRa sequences.
- the 3' non-palindromic sequences are comprised in the CRc-3' sequences.
- said non-palindromic 5' and 3' sequences are comprised in the sequence encoding the DNA binding domains.
- said non-palindromic 5' and 3' sequences consist of between 10 and 60 nucleotides, such as between 20 and 60 nucleotides, such as between 30 and 60 nucleotides, such as between 40 and 60 nucleotides.
- non-palindromic 5' and 3' sequences may be digested by an enzyme with exonuclease activity, such as 3'-5' exonuclease activity, for example T4 DNA polymerase to generate DNA fragments with overhangs.
- the overhangs generated by digestion of the non-palindromic 5' and 3' sequences with T4 DNA polymerase comprise at least 6 nucleotides, such as at least 7 nucleotides, such as at least 8 nucleotides, such as at least 9 nucleotides, such as at least 10 nucleotides, such as 11 nucleotides, such as at least 12 nucleotides, such as at least 13 nucleotides, such as at least 14 nucleotides, such as at least 15 nucleotides, such as at least 16 nucleotides, such as at least 17 nucleotides, such as at least 18 nucleotides, such as at least 19 nucleotides, such as at least 20 nucleotides, such as at least 21 nucleotides, such as at least 22 nucleotides.
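As a rough illustration of how exonuclease digestion exposes single-stranded overhangs, the sketch below models a blunt double-stranded fragment and removes a fixed number of nucleotides from the 3' end of each strand. The function names, the fixed chew-back length and the example sequence are assumptions made for illustration only; in a real reaction the extent of digestion depends on the enzyme, the reaction conditions and the terminal sequence.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def chew_back(top: str, n: int) -> dict:
    """Model 3'->5' exonuclease digestion of a blunt duplex (hypothetical helper).

    `top` is the top strand written 5'->3'. Removing `n` nucleotides from the
    3' end of each strand leaves a 5' overhang of length `n` at both ends.
    """
    if n < 0 or 2 * n > len(top):
        raise ValueError("chew-back length must fit within the fragment")
    bottom = reverse_complement(top)          # bottom strand, written 5'->3'
    return {
        "top": top[: len(top) - n],           # top strand after losing its 3' end
        "bottom": bottom[: len(bottom) - n],  # bottom strand after losing its 3' end
        "left_overhang": top[:n],             # single-stranded 5' end of the top strand
        "right_overhang": bottom[:n],         # single-stranded 5' end of the bottom strand
    }


# Made-up 10-nt sequence, far shorter than a real library fragment.
result = chew_back("AAGGCTTCCA", 3)
```

In this toy model the two overhangs differ from each other, which mirrors the requirement that the 5' and 3' terminal sequences of a fragment are distinct and non-palindromic.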
- the DNA binding polypeptide of the present disclosure recognizes and binds nucleic acid via amino acid residues. The DNA binding is therefore due to a peptide-DNA interaction.
- the DNA binding domains of the present disclosure comprise amino acid residues that are directly responsible for recognizing and binding one or more nucleotides as well as amino acid residues that give the polypeptide the appropriate structure for interacting with DNA and DNA/RNA hybrids.
- each variable region encodes a nucleotide-binding amino acid selected from a group consisting of I, D, R, G, S, T, N, C, E, H, K, L, M, P.
- the person skilled in the art knows that different amino acid residues bind each nucleotide with different affinity. The skilled person is able to choose which amino acid residue will be responsible for recognition and binding of a certain nucleotide. Studies related to the binding of various amino acids to a certain nucleotide are available, see for example "Optimized tuning of TALEN specificity using non-conventional RVDs" by Juillerat A., Pessereau C., Dubois G. et al. 2015. Scientific Reports 5, Article number: 8150, doi:10.1038/srep08150.
- each variable region is independently selected from the group consisting of SEQ ID NO: 28, SEQ ID NO: 41, SEQ ID NO: 44, SEQ ID NO: 46.
- each variable region encodes a DNA binding amino acid selected from the group consisting of I, D, G, N, wherein
- N recognizes the nucleic acid G
- G recognizes the nucleic acid T.
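The recognition rule can be pictured as a lookup from each target nucleotide to the amino acid encoded by the corresponding variable region. The sketch below is a minimal illustration, not the disclosed method: only the pairs N for G and G for T are stated in the text, while the pairs I for A and D for C are assumptions added so that the example covers all four nucleotides. The function and table names are hypothetical.

```python
from typing import List

# N->G and G->T are stated in the text. I->A and D->C are ASSUMPTIONS made
# for this example and are not confirmed by the surrounding text.
RESIDUE_FOR_NUCLEOTIDE = {"A": "I", "C": "D", "G": "N", "T": "G"}


def residues_for_target(target: str) -> List[str]:
    """Return one nucleotide-binding residue per position of the target."""
    residues = []
    for position, base in enumerate(target.upper(), start=1):
        if base not in RESIDUE_FOR_NUCLEOTIDE:
            raise ValueError(f"unsupported nucleotide {base!r} at position {position}")
        residues.append(RESIDUE_FOR_NUCLEOTIDE[base])
    return residues


# The two pairs stated in the text: G is read by N, T is read by G.
stated_pairs = residues_for_target("GT")
```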
- DNA fragments described in the above section are grouped into 9 sub-libraries and are all part of a large library, which therefore comprises all the DNA fragments that are needed for assembling of BurrH-derived modular DNA binding polypeptides as disclosed herein.
- the present disclosure further relates to a library of DNA fragments.
- the library consists of 144 DNA fragments, which are divided into 9 sub-libraries so that each sub-library consists of 16 fragments.
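These figures are consistent with each sub-library holding one fragment for every ordered pair of variable regions (VRa, VRb) when four variable regions are available, giving 4 x 4 = 16 fragments per sub-library and 9 x 16 = 144 fragments in total. The short sketch below only checks that arithmetic; the reading that the 16 fragments correspond to the 16 ordered pairs is an assumption, and the variable names are hypothetical.

```python
from itertools import product

# The four variable regions named in the text, used here only as labels.
VARIABLE_REGIONS = ["SEQ ID NO: 28", "SEQ ID NO: 41", "SEQ ID NO: 44", "SEQ ID NO: 46"]
SUB_LIBRARIES = [f"G{n}" for n in range(1, 10)]  # G1 .. G9

# Assumption: one fragment per ordered pair (VRa, VRb) in each sub-library.
pairs_per_sub_library = list(product(VARIABLE_REGIONS, repeat=2))

# The whole library: every sub-library combined with every pair.
library = [(g, vra, vrb) for g in SUB_LIBRARIES for vra, vrb in pairs_per_sub_library]

# Picking one fragment from each of the 9 sub-libraries gives this many arrays.
distinct_arrays = len(pairs_per_sub_library) ** len(SUB_LIBRARIES)

print(len(pairs_per_sub_library), len(library), distinct_arrays)
```

Under this reading, 144 fragments are enough to cover every combination of 18 positions with four choices each, since 16 to the power of 9 equals 4 to the power of 18.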
- the fragments comprised in the library are especially designed so that they can be assembled via Ligase Independent Cloning (LIC).
- a further aspect of the present disclosure relates to a library comprising the 144 DNA fragments of SEQ ID NO:1-9, wherein the DNA fragments are suitable for LIC.
- the 9 sub-libraries (Gn, where n is a number between 1 and 9) are called G1, G2, G3, G4, G5, G6, G7, G8, G9 and are as follows:
- each DNA fragment comprises two non-palindromic sequences consisting of between 10 and 60 nucleotides, one at the 5' and one at the 3' of the fragment, which allow assembling of the fragments via LIC.
- All fragments of a certain sub-library are characterized by having identical 5' non-palindromic sequences and identical 3' non-palindromic sequences. However, the 5' non-palindromic sequence of a fragment differs from the 3' non-palindromic sequence of the same fragment. These non-palindromic sequences allow assembling of the fragments via LIC in a predetermined order.
- the 3' non-palindromic sequence of the fragments of G1 is compatible with the 5' non-palindromic sequence of the fragments of G2 and thereby each fragment of G1 can bind each fragment of G2 via LIC.
- the 3' non-palindromic sequence of the fragments of G2 is compatible with the 5' non-palindromic sequence of the fragments of G3 and thereby each fragment of G2 can bind each fragment of G3 via LIC.
- the 3' non-palindromic sequence of the fragments of G3 is compatible with the 5' non-palindromic sequence of the fragments of G4 and thereby each fragment of G3 can bind each fragment of G4 via LIC.
- the 3' non-palindromic sequence of the fragments of G4 is compatible with the 5' non-palindromic sequence of the fragments of G5 and thereby each fragment of G4 can bind each fragment of G5 via LIC.
- the 3' non-palindromic sequence of the fragments of G5 is compatible with the 5' non-palindromic sequence of the fragments of G6 and thereby each fragment of G5 can bind each fragment of G6 via LIC.
- the 3' non-palindromic sequence of the fragments of G6 is compatible with the 5' non-palindromic sequence of the fragments of G7 and thereby each fragment of G6 can bind each fragment of G7 via LIC.
- the 3' non-palindromic sequence of the fragments of G7 is compatible with the 5' non-palindromic sequence of the fragments of G8 and thereby each fragment of G7 can bind each fragment of G8 via LIC.
- the 3' non-palindromic sequence of the fragments of G8 is compatible with the 5' non-palindromic sequence of the fragments of G9 and thereby each fragment of G8 can bind each fragment of G9 via LIC.
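The pairwise compatibility rules amount to a chain in which the 3' end of each sub-library meets only the 5' end of the next one, so a mixed pool of fragments has a single valid order. The sketch below is one way to express this, identifying each terminal sequence by the SEQ ID NO given for the sub-library termini (SEQ ID NO:10 to SEQ ID NO:27). The data structures and the function name are hypothetical, and actual sequences are not modelled.

```python
from typing import Dict, List, Tuple

# (5' end, 3' end) of each sub-library, labelled by SEQ ID NO: G1 -> (10, 11) ... G9 -> (26, 27).
ENDS: Dict[str, Tuple[int, int]] = {f"G{n}": (8 + 2 * n, 9 + 2 * n) for n in range(1, 10)}

# The 3' end of Gn is compatible with the 5' end of Gn+1 (11 -> 12, 13 -> 14, ...).
COMPATIBLE_5_PRIME: Dict[int, int] = {
    three: three + 1 for _, three in ENDS.values() if three < 27
}


def assembly_order(names: List[str]) -> List[str]:
    """Order sub-library names so that every 3' end meets a compatible 5' end."""
    by_five_prime = {ENDS[name][0]: name for name in names}
    # 5' ends that some fragment in the pool can anneal to with its 3' end.
    reachable = {COMPATIBLE_5_PRIME.get(ENDS[name][1]) for name in names}
    # The first fragment is the only one whose 5' end nothing else anneals to.
    starts = [name for name in names if ENDS[name][0] not in reachable]
    if len(starts) != 1:
        raise ValueError("fragments do not form a single chain")
    order = [starts[0]]
    while len(order) < len(names):
        next_five = COMPATIBLE_5_PRIME.get(ENDS[order[-1]][1])
        if next_five not in by_five_prime:
            raise ValueError(f"no fragment can follow {order[-1]}")
        order.append(by_five_prime[next_five])
    return order


ordered = assembly_order(["G4", "G9", "G1", "G7", "G2", "G6", "G3", "G8", "G5"])
```

Because each terminal sequence has exactly one partner, a pool that skips a sub-library (for example G1 and G3 without G2) cannot form a chain and is rejected.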
- DNA fragments comprised in the library are described in detail in the section below "DNA fragments".
- 9 fragments each selected from one of the 9 sub-libraries are assembled via LIC and the so assembled array, also called a BuD array, is inserted in a vector via LIC.
- the vector has a non-palindromic sequence identical and/or compatible with the 5' non-palindromic sequence of the G1 fragments and a second non-palindromic sequence identical and/or compatible with the 3' non-palindromic sequence of the G9 fragments, so that the 5' and the 3' ends of the BuD array are compatible with the first and the second non-palindromic sequences of the vector and so that the BuD array is inserted in the vector between said first and second non-palindromic sequences.
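Since each fragment encodes two consecutive binding sub-domains and one fragment is taken from each of the nine sub-libraries, choosing fragments for a target can be pictured as cutting the 18-nucleotide target into nine dinucleotides. The sketch below illustrates this under the assumption that the G1 fragment covers target positions 1 and 2, the G2 fragment positions 3 and 4, and so on; the function name and the example target are hypothetical.

```python
from typing import List, Tuple


def fragments_for_target(target: str) -> List[Tuple[str, str]]:
    """Split an 18-nt target into 9 dinucleotides, one per sub-library G1..G9.

    Each returned pair names the sub-library and the two consecutive target
    nucleotides that the fragment chosen from it would have to recognise.
    """
    target = target.upper()
    if len(target) != 18 or set(target) - set("ACGT"):
        raise ValueError("target must be 18 nucleotides of A, C, G or T")
    return [(f"G{i + 1}", target[2 * i: 2 * i + 2]) for i in range(9)]


# Made-up target sequence used only to show the shape of the result.
selection = fragments_for_target("ACGTACGTACGTACGTAC")
```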
- a further aspect of the present disclosure relates to a vector encoding a BurrH-derived modular DNA binding polypeptide comprising or consisting of a BuD array, wherein the BuD array is a DNA binding domain, wherein said BuD array comprises 18 DNA binding sub-domains called BuD, each sub-domain independently recognising one nucleotide, said vector being assembled according to a method comprising the steps of:
- each sub-library consists of 16 DNA fragments, each fragment consisting of between 150 and 300 nucleotides and encoding two consecutive DNA binding sub-domains of said BuD array;
- the 5' and the 3' termini of each fragment are each a non-palindromic sequence consisting of between 10 and 60 nucleotides, wherein
- the 3' sequences of the fragments of one library Gn are compatible with the 5' sequences of the fragments of the library Gn+1,
- each fragment comprises at least three conserved regions and at least two variable regions, as depicted in formula I:
- n is an integer and is equal to at least 1,
- each of the at least three conserved regions is identical for all 16 fragments of the same sub-library
- variable regions being independently selected from the group consisting of SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO:
- each fragment comprises the three conserved regions CRa, CRb and CRc and the two variable regions VRa and VRb, wherein
- CRa is identical for all 16 fragments of the same sub-library
- CRb is identical for all 16 fragments of the same sub-library
- CRc is identical for all 16 fragments of the same sub-library.
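The structural rules for a sub-library (16 fragments, 150 to 300 nucleotides each, and conserved regions shared by all members) lend themselves to a simple consistency check. The sketch below is one possible way to write such a check. The class and function names are hypothetical, the region order CRa, VRa, CRb, VRb, CRc from 5' to 3' is an assumption, and the placeholder sequences in the example are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Fragment:
    """One library fragment, split into its conserved and variable regions."""
    cra: str
    vra: str
    crb: str
    vrb: str
    crc: str

    @property
    def sequence(self) -> str:
        # Assumed order of the regions from 5' to 3'.
        return self.cra + self.vra + self.crb + self.vrb + self.crc


def check_sub_library(fragments: List[Fragment]) -> None:
    """Raise ValueError if the fragments violate the stated sub-library rules."""
    if len(fragments) != 16:
        raise ValueError("a sub-library must contain 16 fragments")
    conserved = {(f.cra, f.crb, f.crc) for f in fragments}
    if len(conserved) != 1:
        raise ValueError("conserved regions must be identical within a sub-library")
    for f in fragments:
        if not 150 <= len(f.sequence) <= 300:
            raise ValueError("each fragment must be 150 to 300 nucleotides long")


# Placeholder sequences: 60-nt conserved regions and 3-nt variable regions.
variable = ["AAA", "CCC", "GGG", "TTT"]
toy_sub_library = [
    Fragment("A" * 60, vra, "C" * 60, vrb, "G" * 60)
    for vra in variable
    for vrb in variable
]
check_sub_library(toy_sub_library)  # passes silently for a consistent sub-library
```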
- DNA fragments are described in detail in the above section "DNA fragments".
- the vector further comprises a sequence encoding the N-terminal domain and a sequence encoding the C-terminal domain of the BurrH-derived modular DNA binding domain, wherein the N-terminal domain is operably linked to the C-terminal end of a BuD array and the C-terminal domain is operably linked to the N-terminal end of a BuD array so that via expression of the vector a functional BurrH-derived polypeptide is obtained.
- the vector as described herein allows assembling a BuD array which is able to recognize and bind an 18-nucleotide-long target double stranded DNA sequence and/or DNA/RNA hybrid sequence.
- the vector comprises a non-palindromic sequence identical and/or compatible to the 5' non-palindromic sequence of the fragments of G1 and a non-palindromic sequence identical and/or compatible to the 3' non-palindromic sequence of the fragments of G9, wherein the two non-palindromic sequences are contiguous and wherein said selected 9 DNA fragments are inserted between said two non-palindromic sequences.
- the vector as disclosed herein may comprise any one of SEQ ID NO: 113; SEQ ID NO: 114; SEQ ID NO: 115; SEQ ID NO: 116; SEQ ID NO: 117; SEQ ID NO: 118; SEQ ID NO: 119; SEQ ID NO: 120; SEQ ID NO: 121; SEQ ID NO: 122; SEQ ID NO: 123; SEQ ID NO: 124; SEQ ID NO: 125; SEQ ID NO: 126; SEQ ID NO: 127; SEQ ID NO: 128; SEQ ID NO: 129; SEQ ID NO: 130; SEQ ID NO: 131; SEQ ID NO: 132; SEQ ID NO: 133 and SEQ ID NO: 134.
- the present disclosure further relates to a method for assembling a vector encoding a BurrH-derived modular DNA binding polypeptide.
- BurrH-derived modular DNA binding polypeptides have a complex structure and comprise a DNA binding domain, also called BuD array.
- the BuD array is a modular domain usually comprising 18 DNA binding sub-domains also called BuDs. Each sub-domain is an amino acid sequence which specifically recognises one nucleotide. The nucleotides recognised by the different sub-domains may be identical or different. Therefore, a BuD array consisting of 18 BuDs recognizes an 18-nucleotide-long sequence.
- BurrH-derived polypeptides as disclosed herein recognize and bind nucleotide sequences with high precision and therefore can be used for several purposes, for example for editing genetic material when used in combination with catalytic domains of other moieties as described in the section "Additional features". More regarding possible applications of BurrH-derived polypeptides is described in the section below "Method for gene therapy".
- the methods of the present disclosure allow assembling several BurrH-derived modular DNA binding polypeptides, one for each one of the possible 18-nucleotide long target sequences.
- the methods of the present disclosure allow the design of BurrH-derived modular polypeptides which are able to recognize a target double stranded DNA and/or
- the methods of the present invention also allow any construct comprising a BuD array, such as BuD array fused with an additional peptide for example a peptide having enzymatic activity or a signal peptide or a fluorescent peptide or moiety, to recognize a target double stranded DNA and/or DNA/RNA hybrids.
- the BurrH-derived modular DNA binding domain comprises, in addition to the BuD array, an N-terminal domain and a C-terminal domain, which are described in detail in the section above "Additional features".
- An aspect of the present disclosure relates to a BurrH-derived modular DNA binding polypeptide comprising or consisting of a BuD array, wherein the BuD array is a DNA binding domain comprising or consisting of 18 DNA binding sub-domains called BuD, each sub-domain independently recognising one nucleotide, the BurrH-derived modular binding polypeptide encoded by the polynucleotides disclosed herein.
- BurrH-derived modular DNA binding polypeptides of the present disclosure comprise or consist of any one of SEQ ID NO: 139; SEQ ID NO: 140; SEQ ID NO: 141; SEQ ID NO: 142; SEQ ID NO: 143; SEQ ID NO: 144; SEQ ID NO: 145; SEQ ID NO: 146; SEQ ID NO: 147; SEQ ID NO: 148; SEQ ID NO: 149; SEQ ID NO: 150; SEQ ID NO: 151; SEQ ID NO: 152; SEQ ID NO: 153; SEQ ID NO: 154; SEQ ID NO: 155; SEQ ID NO: 156; SEQ ID NO: 157; SEQ ID NO: 158; SEQ ID NO: 159 and SEQ ID NO: 160.
Additional features
- the polypeptides disclosed herein may be linked to other peptides or moieties when used in various applications. For example, it may be useful to couple the BurrH-derived polypeptides to a catalytic domain, in particular from enzymes that interact with and may modify DNA and/or RNA. It may also be useful to link the BurrH-derived polypeptides to molecular markers, such as peptides to monitor the position of the BurrH-derived polypeptides, for example inside a host cell.
- the present disclosure relates to a method for assembling modular DNA binding polypeptides.
- These polypeptides comprise at least a DNA binding domain (BuD array) and in some embodiments they also comprise an N-terminal domain and a C-terminal domain, where the N-terminal domain is positioned at the N-terminal end of a BuD array and the C-terminal domain is positioned at the C-terminal end of a BuD array.
- the N-terminal domain, BuD array and C-terminal domain are operably linked so that the assembled polypeptide is a functional BurrH-derived polypeptide.
- the N-terminal domain and the C-terminal domain are important for sorting and half-life of the BurrH-derived polypeptides, as well as for their folding. They are usually not involved in the binding of a target DNA sequence.
- the N-terminal domain may comprise signal peptide sequences, for example it may comprise an NLS or HA, so that the polypeptide reaches a certain destination in a cell and for protein localization.
- Epitopes for antibodies, such as HA or myc tag, may also be comprised in the N-terminal sequence because they can be used for protein localization.
- the N-terminal domain comprises between 1 and 170 amino acid residues, such as between 10 and 170 amino acid residues, such as between 20 and 170 amino acid residues, such as between 30 and 170 amino acid residues, such as between 40 and 170 amino acid residues, such as between 50 and 170 amino acid residues, such as between 60 and 170 amino acid residues, such as between 70 and 170 amino acid residues, such as between 80 and 170 amino acid residues, such as between 90 and 170 amino acid residues, such as between 100 and 170 amino acid residues.
- the N-terminal domain comprises between 70 and 170 amino acid residues, such as between 70 and 160 amino acid residues, such as between 70 and 150 amino acid residues, such as between 70 and 140 amino acid residues, such as between 70 and 130 amino acid residues, such as between 70 and 120 amino acid residues, such as between 70 and 110 amino acid residues, such as between 80 and 110 amino acid residues, such as between 90 and 110 amino acid residues, such as between 100 and 110 amino acid residues.
- the N-terminal domain comprises or consists of 106 amino acid residues.
- the N-terminal domain is an amino acid chain comprising or consisting of SEQ ID NO: 84.
- the N-terminal domain is an amino acid chain comprising or consisting of a Nuclear Localization Sequence (NLS).
- the NLS is an amino acid chain comprising or consisting of SEQ ID NO: 184.
- the N-terminal domain is an amino acid chain comprising or consisting of SEQ ID NO: 84 and comprising an NLS. In some embodiments, the N-terminal domain is an amino acid chain comprising or consisting of SEQ ID NO: 84 and comprising an NLS and an HA epitope.
- the C-terminal domain comprises between 1 and 180 amino acid residues, such as between 10 and 180 amino acid residues, such as between 20 and 180 amino acid residues, such as between 30 and 180 amino acid residues, such as between 40 and 180 amino acid residues, such as between 50 and 180 amino acid residues, such as between 60 and 180 amino acid residues, such as between 70 and 180 amino acid residues, such as between 80 and 180 amino acid residues, such as between 90 and 180 amino acid residues.
- the C-terminal domain comprises between 70 and 180 amino acid residues, such as between 70 and 170 amino acid residues, such as between 70 and 160 amino acid residues, such as between 70 and 150 amino acid residues, such as between 70 and 140 amino acid residues, such as between 70 and 130 amino acid residues, such as between 70 and 120 amino acid residues, such as between 70 and 110 amino acid residues, such as between 80 and 110 amino acid residues, such as between 90 and 110 amino acid residues, such as between 90 and 100 amino acid residues.
- the C-terminal domain comprises or consists of between 96 and 121 amino acid residues.
- the C-terminal domain comprises or consists of 96 amino acid residues.
- the C-terminal domain is an amino acid chain comprising or consisting of SEQ ID NO: 85.
- polypeptides produced with methods of the present disclosure can recognize and bind DNA or DNA/RNA hybrids with exceptionally high specificity. Thanks to this feature, it is of high value to fuse these polypeptides with other moieties that are able to edit nucleotide sequences, so that specific sequences are targeted. In addition, it may also be of interest to fuse the polypeptides disclosed herein with moieties performing additional functions, for example moieties that can facilitate monitoring of genetic manipulation procedures, such as transfection of a host cell, e.g. additional peptides and proteins may be fused to the BurrH-derived polypeptides.
- the vector further comprises a sequence encoding an additional peptide, such as a protein having enzymatic activity, wherein the additional peptide is operably linked to the BurrH-derived modular DNA binding polypeptide.
- said additional peptide induces double strand breaks on the target sequence.
- the additional peptide may be the catalytic domain of the restriction endonuclease FokI, which acts as a dimer; in such a case two BuD arrays, each recognizing one DNA strand, would be used.
- the additional peptide may also be a monomeric nuclease such as the catalytic domain of the endonuclease I-TevI, as well as the catalytic domain of more specific restriction endonucleases such as PvuI.
- said additional peptide induces transcriptional activation.
- the additional peptide may be a transcription activation domain such as VP16 or VP64.
- said additional peptide induces transcriptional repression.
- the additional peptide may be a transcription repressor such as SID (mSin3 interaction domain) or KRAB (Krüppel-associated box).
- said additional peptide induces modifications on a sequence adjacent or near to the target sequence.
- the additional peptide may be the catalytic domain of a polymerase or of a ligase or of a topoisomerase or of a recombinase or of a kinase or of a phosphatase or of a methylase.
- said additional peptide induces modifications on a protein that interacts with a sequence adjacent to or near the target sequence, such as a histone or a transcription factor.
- the additional peptide may be the catalytic domain of a demethylase or of a methylase or of a deacetylase or of an acetylase or of a phosphatase or of a dephosphatase.
- the additional peptide is selected from a group consisting of nucleases, transcriptional activators, transcriptional repressors, polymerases, ligases, topoisomerases, recombinases, kinases, methylases, demethylases, deacetylases, acetylases, phosphatases and diphosphatases.
- a signal peptide, for example a T2A peptide, and/or a fluorescent peptide are fused to the catalytic domain to facilitate monitoring of the transfection procedure.
- Various types of vectors can be used for the purpose of the present disclosure and the construct comprising a BuD array can easily be transferred from one vector to another vector using common molecular biology techniques such as PCR and restriction enzymes.
- a further aspect of the present disclosure relates to a polynucleotide encoding a BurrH-derived modular DNA binding polypeptide capable of recognizing an 18 nucleotide-long DNA target sequence, the polynucleotide sequence consisting of SEQ ID NO:135.
- Said polynucleotide may for example be inserted in a vector as disclosed herein.
- Said polynucleotide may be assembled using the methods disclosed herein, for example via LIC, or it may be chemically synthesised.
- the encoded BuD array is able to recognize an 18-nucleotide-long target double stranded DNA sequence and/or DNA/RNA hybrid sequence.
- the polynucleotide encoding a BurrH-derived modular DNA binding polypeptide is comprised in a vector as described in the above section "Vector".
- the polynucleotide comprises a sequence encoding a BuD array.
- the polynucleotide consists of a sequence encoding a BuD array.
- the polynucleotide comprises a sequence encoding a BuD array, a sequence encoding an N-terminal domain of BurrH or a BurrH-derived DNA-binding polypeptide and a C-terminal domain of BurrH or a BurrH-derived DNA-binding polypeptide, wherein the N-terminal domain is positioned upstream of the BuD array and wherein the C-terminal domain is positioned downstream of the BuD array, so that expression of said polynucleotide would result in production of a functional BurrH-derived DNA binding polypeptide.
- the polynucleotide consists of a sequence encoding a BuD array, a sequence encoding an N-terminal domain of BurrH-derived DNA binding polypeptide and a C-terminal domain of BurrH-derived DNA binding polypeptide, wherein the N-terminal domain is positioned upstream of the BuD array and wherein the C-terminal domain is positioned downstream of the BuD array, so that expression of said polynucleotide would result in production of a functional BurrH-derived DNA binding polypeptide.
- the polynucleotide is fused to an additional peptide, such as a peptide having enzymatic activity and/or a signal peptide, as described in the section above "Vector".
- the polynucleotide further comprises an N-terminal domain and a C-terminal domain, which are described in detail in the section above "N-termini and C-termini".
- the polynucleotide of the present disclosure may comprise or consist of a nucleotide sequence encoding any one of SEQ ID NO: 113; SEQ ID NO: 114; SEQ ID NO: 115; SEQ ID NO: 116; SEQ ID NO: 117; SEQ ID NO: 118; SEQ ID NO: 119; SEQ ID NO: 120; SEQ ID NO: 121; SEQ ID NO: 122; SEQ ID NO: 123; SEQ ID NO: 124; SEQ ID NO: 125; SEQ ID NO: 126; SEQ ID NO: 127; SEQ ID NO: 128; SEQ ID NO: 129; SEQ ID NO: 130; SEQ ID NO: 131; SEQ ID NO: 132; SEQ ID NO: 133 and SEQ ID NO: 134.
- polynucleotides encoding for BurrH-derived DNA binding polypeptides, the polypeptides resulting from expression of said polynucleotides, as well as methods for producing said polynucleotides and said polypeptides.
- a further aspect of the present disclosure relates to a method for production of a BurrH-derived modular DNA binding polypeptide that recognizes an 18 nucleotide-long target DNA sequence comprising the steps of:
- the BurrH-derived modular DNA binding polypeptide is able to recognize an 18-nucleotide-long target double stranded DNA sequence and/or DNA/RNA hybrid sequence via a BuD array.
- said BurrH-derived modular DNA binding polypeptide comprises or consists of an amino acid sequence having 90% sequence identity to SEQ ID NO: 138, such as 91% sequence identity to SEQ ID NO: 138, such as 92% sequence identity to SEQ ID NO: 138, such as 93% sequence identity to SEQ ID NO: 138, such as 94% sequence identity to SEQ ID NO: 138, such as 95% sequence identity to SEQ ID NO: 138, such as 96% sequence identity to SEQ ID NO: 138, such as 97% sequence identity to SEQ ID NO: 138, such as 98% sequence identity to SEQ ID NO: 138, such as 99% sequence identity to SEQ ID NO: 138.
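The text does not state how sequence identity is calculated. As a hedged illustration, the sketch below uses one common convention: the two sequences are already aligned to the same length (with "-" marking gaps), and identity is the share of alignment columns carrying the same residue. The function name and the short example sequences are hypothetical, and other conventions (for instance dividing by the length of the shorter sequence) would give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical residues ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    # A column counts as a match only if both residues are equal and not gaps.
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


# Made-up 10-residue peptides differing at one position.
identity = percent_identity("MKTAYIAKQR", "MKTAYIAKQA")
meets_threshold = identity >= 90.0
```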
- said BurrH-derived modular DNA binding polypeptide comprises or consists of any one of SEQ ID NO: 139; SEQ ID NO: 140; SEQ ID NO: 141; SEQ ID NO: 142; SEQ ID NO: 143; SEQ ID NO: 144; SEQ ID NO: 145; SEQ ID NO: 146; SEQ ID NO: 147; SEQ ID NO: 148; SEQ ID NO: 149; SEQ ID NO: 150; SEQ ID NO: 151; SEQ ID NO: 152; SEQ ID NO: 153; SEQ ID NO: 154; SEQ ID NO: 155; SEQ ID NO: 156; SEQ ID NO: 157; SEQ ID NO: 158; SEQ ID NO: 159 and SEQ ID NO: 160.
- the final vector comprises an additional peptide having enzymatic activity and/or a signal peptide and/or a fluorescent peptide.
- the assembled nucleotide sequence encoding the BuD array may be transferred to another vector.
- the assembled nucleotide sequence encoding the BuD array may be amplified via PCR and transferred to a vector that carries a sequence encoding an additional peptide with enzymatic activity and/or a signal peptide and/or a fluorescent peptide or moiety. More details regarding the additional peptide are found in the section above "Vector".
- the eukaryotic host cell may be an animal cell, a plant cell, a fungal cell or an algal cell.
- the host cells are eukaryotic producer cells from non-mammals, including but not limited to known producer cells such as yeast (Saccharomyces cerevisiae), filamentous fungi such as Aspergillus, and insect cells, such as Sf9.
- the eukaryotic host cell may also be a mammalian cell, an avian cell, a reptilian cell, a fish cell, a protozoan cell or a yeast cell.
- mammalian host cells such as neuronal cells, neuronal precursor cells, neuronal progenitor cells, stem cells, foetal cells, hematopoietic stem cells, muscle stem cells, bone stem cells, cartilage stem cells may be used.
- the host cells may be suitable for biodelivery of the polynucleotide, as well as the polypeptides disclosed herein.
- naked or encapsulated cells that have been transformed or transduced with a vector that carries a sequence encoding a BurrH-derived modular DNA binding polypeptide and which can be transplanted to the patient to deliver said polypeptide locally may be used.
- Such cells may broadly be referred to as therapeutic cells.
- polypeptides of the present disclosure are suitable for administration in the context of gene therapy to a subject in need thereof.
- the polypeptides disclosed herein, as well as the nucleotide sequences coding for them, can be used for editing DNA and DNA/RNA hybrids.
- a composition comprising the polypeptides disclosed herein, as well as the nucleotide sequences coding for them, may be administered to a subject in need thereof.
- a host cell that has been transformed or transduced with a vector that carries a sequence encoding a BurrH-derived modular DNA binding polypeptide may be administered to a subject in need thereof, e.g. for gene therapy.
- gene therapy seeks to transfer new genetic material to the cells of a patient with resulting therapeutic benefit to the patient.
- benefits include treatment or prophylaxis of a broad range of diseases, disorders and other conditions.
- Ex vivo gene therapy approaches involve modification of isolated cells (including but not limited to stem cells, neural and glial precursor cells, and fetal stem cells), which are then infused, grafted or otherwise transplanted into the patient. See, e.g., U.S. Pat. Nos. 4,868,116, 5,399,346 and 5,460,959. In vivo gene therapy seeks to directly target host patient tissue in vivo.
- An aspect of the present disclosure relates to a composition comprising the fragment as disclosed herein, a polynucleotide as disclosed herein, a vector as disclosed herein or a BurrH-derived modular DNA binding polypeptide as disclosed herein for use as a medicament.
- compositions comprising the fragment as disclosed herein, a polynucleotide as disclosed herein, a vector as disclosed herein or a BurrH-derived modular DNA binding polypeptide as disclosed herein for use in treatment of a disorder in a subject in need thereof.
- compositions comprising the fragment as disclosed herein, a polynucleotide as disclosed herein, a vector as disclosed herein or a BurrH-derived modular DNA binding polypeptide as disclosed herein for use in gene therapy in a subject in need thereof.
- the gene therapy may be in vivo or ex vivo or ex vitro.
- compositions comprising the fragment as disclosed herein, a polynucleotide as disclosed herein, a vector as disclosed herein or a BurrH-derived modular DNA binding polypeptide as disclosed herein for use in treatment of a disorder in a subject in need thereof, said disorder being selected from the group of cancer, neurodegenerative diseases, monogenic diseases and autoimmune diseases.
- Another aspect relates to the use of a fragment as disclosed herein, a polynucleotide as disclosed herein, a vector as disclosed herein, a BurrH-derived modular DNA binding polypeptide as disclosed herein or a kit of parts as disclosed herein for editing a nucleotide sequence and/or the genome of a host cell.
- Another aspect relates to the use of a fragment as disclosed herein, a polynucleotide as disclosed herein, a vector as disclosed herein, a BurrH-derived modular DNA binding polypeptide as disclosed herein or a kit of parts as disclosed herein for production of a genetically modified organism, such as a microorganism, a plant or a mammal.
- a further aspect of the present disclosure relates to a method for editing a target nucleic acid sequence using a BuD array, comprising transfecting or transforming a host cell with a vector assembled via LIC and comprising a BuD array assembled via LIC of DNA fragments of Formula I as described in the section above "DNA fragments".
- the transfection or transformation of a host cell is carried out in vitro.
- the transfection or transformation of a host cell is carried out in vivo.
- the method further comprises assembling a vector comprising a BuD array and an additional peptide such as a peptide having enzymatic activity and/or a signal peptide, as described in the section above "Vector".
- the method further comprises transferring the BuD array to another vector.
- the sequence encoding the BuD array may be amplified via PCR and transferred to a vector that carries a sequence encoding an additional peptide with enzymatic activity and/or a signal peptide and/or a fluorescent peptide, as described in the section above "Vector".
- the vector comprises, in addition to the BuD array, an N-terminal domain and a C-terminal domain, which are described in detail in the section above "N-termini and C-termini".
- a further aspect of the present disclosure relates to a kit of parts comprising:
- the kit of parts further comprises a polynucleotide according to the present disclosure comprising the N-terminus and the C-terminus of the BurrH or of a BurrH-derived DNA-binding polypeptide as defined in the section above "N-termini and C-termini".
- the vector comprises 2 non-palindromic sequences and at least two restriction enzyme recognition sites, wherein the first non-palindromic sequence is compatible with the 5'-CR of the DNA fragments of the library G1 and the second non-palindromic sequence is compatible with the 3'-CR of the DNA fragments of the library G9.
- the vector further comprises a sequence encoding the N-terminal domain and a sequence encoding the C-terminal domain of the DNA binding polypeptide.
- the kit of parts further comprises 3 sub-vectors, SV1, SV2 and SV3, wherein each sub-vector comprises 2 non-palindromic sequences and wherein:
- a. the first non-palindromic sequence of SV1 is compatible with the 5'-CR sequence of the DNA fragments of G1 and the second non-palindromic sequence of SV1 is compatible with the 3'-CR sequence of the DNA fragments of G3;
- b. the first non-palindromic sequence of SV2 is compatible with the 5'-CR sequence of the DNA fragments of G4 and the second non-palindromic sequence of SV2 is compatible with the 3'-CR sequence of the DNA fragments of G6;
- c. the first non-palindromic sequence of SV3 is compatible with the 5'-CR sequence of the DNA fragments of G7 and the second non-palindromic sequence of SV3 is compatible with the 3'-CR sequence of the DNA fragments of G9.
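The compatibility rules for the three sub-vectors can be summarised as a small lookup. The Python sketch below is illustrative only: the names `SUB_VECTORS` and `sub_vector_for` are hypothetical, and the only information taken from the text is which sub-libraries border each sub-vector.

```python
# Illustrative sketch, not part of the disclosure.
# SUB_VECTORS and sub_vector_for are hypothetical names; the G1-G3, G4-G6 and
# G7-G9 boundaries come from the compatibility rules listed above.
SUB_VECTORS = {
    "SV1": ("G1", "G3"),  # compatible with the 5'-CR of G1 and the 3'-CR of G3
    "SV2": ("G4", "G6"),  # compatible with the 5'-CR of G4 and the 3'-CR of G6
    "SV3": ("G7", "G9"),  # compatible with the 5'-CR of G7 and the 3'-CR of G9
}


def sub_vector_for(group: str) -> str:
    """Return the sub-vector whose insert spans the given sub-library (e.g. 'G5')."""
    index = int(group[1:])
    for name, (first, last) in SUB_VECTORS.items():
        if int(first[1:]) <= index <= int(last[1:]):
            return name
    raise ValueError(f"unknown sub-library: {group}")
```

Under this reading, each sub-vector receives three consecutive sub-libraries, so `sub_vector_for("G5")` resolves to SV2.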
- each sub-vector comprises 2 non-palindromic sequences wherein:
- the first non-palindromic sequence of SV1 comprises or consists of SEQ ID NO:
- the second non-palindromic sequence of SV1 comprises or consists of SEQ ID NO: 185; e. the first non-palindromic sequence of SV2 comprises or consists of SEQ ID NO: 185 and the second non-palindromic sequence of SV2 comprises or consists of SEQ ID NO: 186;
- the first non-palindromic sequence of SV3 comprises or consists of SEQ ID NO:
- the second non-palindromic sequence of SV3 comprises or consists of
- the kit of parts allows assembling of a BuD array via LIC and its subsequent expression, according to the methods described herein.
- the vector comprises an additional peptide such as a peptide having enzymatic activity and/or a signal peptide, as described in the section above "Vector".
- the assembled nucleotide sequence encoding the BuD array may be transferred to another vector.
- the assembled nucleotide sequence encoding the BuD array may be amplified via PCR and transferred to a vector that carries a sequence encoding an additional peptide with enzymatic activity and/or a signal peptide and/or a fluorescent peptide, as described in the section above "Vector".
- the kit of parts may be used in a method for assembling a vector encoding a BurrH-derived modular DNA binding polypeptide as described herein.
- the kit of parts may be used in a method for production of a BurrH-derived modular DNA binding polypeptide as described herein.
- the kit of parts may be used in a method for editing a target nucleic acid sequence.
- the kit of parts may be used in a method for gene therapy as described herein.
Examples
- a library consisting of 144 DNA fragments was prepared.
- the library was divided into 9 sub-libraries: G1, G2, G3, G4, G5, G6, G7, G8 and G9.
- Each sub-library consisted of 16 DNA fragments. Each of the 16 DNA fragments in a sub-library encoded two BuD domains engineered to recognize one of the 16 possible base pair combinations: AA, AT, AC, AG, TA, TT, TC, TG, CA, CT, CC, CG, GA, GT, GC and GG.
- Each fragment was further modified by the addition of two different ligase independent cloning (LIC) sites, one at the 5' end and one at the 3' end.
- a total of 10 LIC sites were designed, and fragments in the same group were modified with the same two LIC sites, according to the following scheme: LIC1 and LIC2 were added at the 5' and 3', respectively, of the fragments in G1; LIC2 and LIC3 were added at the 5' and 3', respectively, of the fragments in G2; LIC3 and LIC4 were added at the 5' and 3', respectively, of the fragments in G3; LIC4 and LIC5 were added at the 5' and 3', respectively, of the fragments in G4; LIC5 and LIC6 were added at the 5' and 3', respectively, of the fragments in G5; LIC6 and LIC7 were added at the 5' and 3', respectively, of the fragments in G6; LIC7 and LIC8 were added at the 5' and 3', respectively, of the
Results
- A library of 144 DNA fragments was prepared, encoding the 18 BuD domains and potentially able to recognize all possible 18 bp DNA or RNA sequences.
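The library arithmetic (9 sub-libraries of 16 fragments, two bases per fragment, 18 bases per array) can be illustrated with a short Python sketch. It assumes that the n-th dinucleotide of the target, read 5' to 3', is supplied by sub-library Gn, and it names fragments in the `G2-AA` style used in the sequence listing. The function name `fragments_for_target` is hypothetical.

```python
from itertools import product

# Illustrative sketch, not the authors' software.
BASES = "ATCG"
GROUPS = [f"G{i}" for i in range(1, 10)]                         # 9 sub-libraries
DINUCLEOTIDES = ["".join(p) for p in product(BASES, repeat=2)]   # 16 per sub-library
LIBRARY_SIZE = len(GROUPS) * len(DINUCLEOTIDES)                  # 9 * 16 fragments


def fragments_for_target(target: str) -> list[str]:
    """Pick one fragment per sub-library for an 18-nt target.

    Assumption: dinucleotide n of the target (5' to 3') is read by a fragment
    taken from sub-library Gn.
    """
    target = target.upper()
    if len(target) != 18 or set(target) - set(BASES):
        raise ValueError("target must be 18 nucleotides of A, T, C or G")
    pairs = [target[i:i + 2] for i in range(0, 18, 2)]
    return [f"{group}-{pair}" for group, pair in zip(GROUPS, pairs)]
```

Because every sub-library covers all 16 dinucleotides, any of the 16^9 = 4^18 possible 18-nt targets maps to exactly one choice of nine fragments under this assumption.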
- In step 1, the intermediate vectors were linearized by restriction enzymes (IV1 linearized by PstI and NsiI; IV2 linearized by KpnI; IV3 linearized by NsiI, Fig. 3). Then both the fragments and the linearized vectors were digested with T4 DNA polymerase so as to generate DNA fragments with overhangs. A specific nucleotide was incorporated in the reactions, so that T4 DNA polymerase digested the 3' ends of the DNA fragments until the chosen nucleotide.
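The chew-back described in step 1 can be modelled very roughly in Python. This is a simplified sketch, assuming that digestion of a strand stops at the first occurrence, counted from the 3' end, of the base matching the nucleotide supplied in the reaction. The function names and the example sequence are hypothetical and do not correspond to any LIC site of the disclosure.

```python
# Simplified model of 3' end digestion in a LIC reaction (illustrative only).

def chew_back(strand: str, dntp: str) -> str:
    """Trim a strand written 5'->3' from its 3' end until its last base equals `dntp`.

    Returns an empty string if the base never occurs in the strand.
    """
    strand = strand.upper()
    dntp = dntp.upper()
    end = len(strand)
    while end > 0 and strand[end - 1] != dntp:
        end -= 1
    return strand[:end]


def overhang_length(strand: str, dntp: str) -> int:
    """Number of bases removed, i.e. the length left single-stranded on the partner strand."""
    return len(strand) - len(chew_back(strand, dntp))
```

For the toy strand `GACCTTAATT` with C supplied, the model removes the six 3' bases and leaves `GACC`, so the complementary strand would carry a six-base 5' overhang.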
- the DNA sequences at the ends of the fragments, as well as the ends of the linearized intermediate vectors, were designed to generate, by LIC reactions, fragments having protruding ends that are complementary only to the preceding and following fragments in the assembly (Fig. 2), so that the directionality of the assembly is guaranteed.
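Why unique end pairs fix the order can be shown with a minimal Python sketch. It treats each fragment as a pair of LIC site labels (5' site, 3' site) and chains fragments by matching each 3' site to the next 5' site. The function name `assembly_order` is hypothetical, and the labels follow the LIC1/LIC2/LIC3/LIC4 scheme given for G1 to G3.

```python
# Illustrative sketch of directional assembly from unique end pairs.

def assembly_order(fragments: dict[str, tuple[str, str]], start_site: str) -> list[str]:
    """Order fragments by following 3' site -> matching 5' site, starting at `start_site`."""
    by_five_prime = {}
    for name, (five, three) in fragments.items():
        if five in by_five_prime:
            # Two fragments sharing a 5' site would make the order ambiguous.
            raise ValueError(f"ambiguous 5' site: {five}")
        by_five_prime[five] = (name, three)

    order = []
    site = start_site
    while site in by_five_prime:
        name, site = by_five_prime.pop(site)  # pop() also prevents endless loops
        order.append(name)
    return order
```

Given the three fragments in any input order, for example `{"G3": ("LIC3", "LIC4"), "G1": ("LIC1", "LIC2"), "G2": ("LIC2", "LIC3")}` with start site `LIC1`, only the chain G1, G2, G3 is possible.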
- the digested IV1 with G1, G2 and G3; IV2 with G4, G5 and G6; and IV3 with G7, G8 and G9 were then mixed together and transformed into E. coli. A multicloner mixture of plasmids was then isolated and used in the next step.
- In step 2, IH-1, IH-2 and IH-3 were amplified by PCR, using specific primers, from the multicloner mixture obtained in step 1.
- the N-terminus part of the array comprised the following amino acid sequences: from position 4 to 10 an NLS domain, from 22 to 35 an HA (human influenza hemagglutinin) epitope and from 36 to 106 an amino acid chain corresponding to residues 11 to 81 of the N-terminal part of polypeptide BurrHI (UniProt number E5AV36).
- the C-terminus part of the array comprised 96 amino acid residues corresponding to residues 677 to 773 of the polypeptide BurrHI (Uniprot number E5AV36).
- the BuDs-array was transferred to the plasmid p-2FokI using the following combinations of restriction enzymes (Fig. 6 - vector):
- the BuDs-array was transferred to the plasmid p-KRAB-EGPF using the following combination of restriction enzymes (Fig. 7):
- the BuDs-array was transferred to the plasmid p-SID-EGPF using the following combination of restriction enzymes (Fig. 8):
- the BuDs-array was transferred to the plasmid p-VP64-EGPF using the following combination of restriction enzymes (Fig. 9):
- BuDs arrays for the Star sequences can also be transferred to any other vector by PCR using specific primers.
- Table 1. A list of the 22 DNA target sequences (Target DNA) recognized by the assembled BurrH-derived polypeptides, and whether the ORF encoding the polypeptides was sequenced and verified after the assembly procedure.
- SEQ ID NO: 2 G2-AA a ttTagggagAgggggttctcccaggctgatattgtcaagatcgccggaaacNNNgg cggggcccaggccctgtattccgtgctggacgtggaaccaacTctggggaaacggg ggttctctcgggccgacattgtgaagatcgctgggaacNNNgggggcgcccaagcc ctccacact
- SEQ ID NO: 4 G4-AA a TatTGTGaagatAgcaggaaacNNNgggggcgctcaggcactccagacagtg ctggatttggagcctgccctgtgtgagagggggttttcccaggccaccatcgccaagat ggccgggaatNNNggcggcgcccaagcattgcagaccgtgctcgacctggagccc gcctgaggaagagggatTttcgccaggccgatattatcaaAatc
- SEQ ID NO: 8 G8-AA a tttagggagAggggatttagTcaggccgacatcgtcaaaattgccgggaacNNNg ggggcacccaggccctgcacgtcctcgatctggagagGatgctgggcgagaga gggttttcacgggctgacatcgtgaacgtggccgggaacNNNggcggggcccaag cactcaaggccgtgctcgaacacgaagccactctAaac
- SEQ ID NO: 106 CRc-3- ggC ggc gct caa gcc ctc cag atg gtcctcgacctg gga cct gcc ttg
- SEQ ID NO: 113 Buds-1 ATGGGCGATCCCAAGAAGAAAAGGAAAGTGATTGATTACC
- SEQ ID NO: 114 Buds-2 ATGGGCGATCCCAAGAAGAAAAGGAAAGTGATTGATTACC
- SEQ ID NO: 115 Buds-3 ATGGGCGATCCCAAGAAGAAAAGGAAAGTGATTGATTACC
- SEQ ID NO: 116 Buds-4 ATGGGCGATCCCAAGAAGAAAAGGAAAGTGATTGATTACC
- SEQ ID NO: 117 Buds-5 ATGGGCGATCCCAAGAAGAAAAGGAAAGTGATTGATTACC
- SEQ ID NO: 118 Buds-6 ATGGGCGATCCCAAGAAGAAAAGGAAAGTGATTGATTACC
- SEQ ID NO: 119 Buds-7 ATGGGCGATCCCAAGAAGAAAAGGAAAGTGATTGATTACC
- SEQ ID NO: 120 Buds-8 ATGGGCGATCCCAAGAAGAAAAGGAAAGTGATTGATTACC
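As a consistency check on the listing, the shared 5' prefix of the Buds sequences can be translated with the standard genetic code. The Python sketch below is illustrative: `translate` is a hypothetical helper, only the 40 nucleotides shown in the listing are used, and the trailing incomplete codon is ignored.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, ignoring spaces and any incomplete final codon."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Prefix shared by the Buds entries of the listing (spaces removed).
BUDS_PREFIX = "ATGGGCGATCCCAAGAAGAAAAGGAAAGTGATTGATTACC"
protein = translate(BUDS_PREFIX)
nls_region = protein[3:10]  # positions 4 to 10, 1-based
```

The translated prefix reads MGDPKKKRKVIDY, and positions 4 to 10 give PKKKRKV, which matches the SV40-type nuclear localisation signal and agrees with the NLS placement stated for the N-terminal part of the array.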
Abstract
The present invention relates to methods for assembling modular DNA-binding polypeptides. The invention also relates to vectors encoding modular DNA-binding polypeptides, and to their use in gene editing.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| DKPA201670460 | 2016-06-27 | ||
| DKPA201670460 | 2016-06-27 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2018001858A1 (fr) | 2018-01-04 |
Family
ID=59298435
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/EP2017/065392 Ceased WO2018001858A1 (fr) | 2016-06-27 | 2017-06-22 | Tailored assembly of a modular bud polypeptide |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2018001858A1 (fr) |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US4868116A (en) | 1985-07-05 | 1989-09-19 | Whitehead Institute For Biomedical Research | Introduction and expression of foreign genetic material in epithelial cells |
| US5399346A (en) | 1989-06-14 | 1995-03-21 | The United States Of America As Represented By The Department Of Health And Human Services | Gene therapy |
| US5460959A (en) | 1987-09-11 | 1995-10-24 | Whitehead Institute For Biomedical Research | Transduced fibroblasts |
| WO2013074999A1 (fr) * | 2011-11-16 | 2013-05-23 | Sangamo Biosciences, Inc. | Modified DNA-binding proteins and uses thereof |
| WO2014018601A2 (fr) * | 2012-07-24 | 2014-01-30 | Cellectis | New modular base-specific nucleic acid binding domains from Burkholderia rhizoxinica proteins |
| WO2014118719A1 (fr) * | 2013-02-01 | 2014-08-07 | Cellectis | Chimeric TevI endonucleases and their preferential cleavage sites |
Non-Patent Citations (7)
| Title |
|---|
| ALEXANDRE JUILLERAT ET AL: "BurrH: a new modular DNA binding protein for genome engineering", SCIENTIFIC REPORTS, vol. 4, no. 1, 23 January 2014 (2014-01-23), XP055400287, DOI: 10.1038/srep03831 * |
| JUILLERAT A.; BERTONATI C.; DUBOIS G. ET AL.: "BurrH: a new modular DNA binding protein for genome engineering", SCI REP., vol. 4, 2014, pages 3831 |
| JUILLERAT A.; PESSEREAU C.; DUBOIS G. ET AL.: "Optimized tuning of TALEN specificity using non-conventional RVDs", SCIENTIFIC REPORTS, vol. 5, no. 8150, 2015 |
| STEFANO STELLA ET AL: "BuD, a helix-loop-helix DNA-binding domain for genome modification", ACTA CRYSTALLOGRAPHICA / D. SECTION D, BIOLOGICAL CRYSTALLOGRAPHY, vol. 70, no. 7, 1 July 2014 (2014-07-01), Oxford, pages 2042 - 2052, XP055399979, ISSN: 2059-7983, DOI: 10.1107/S1399004714011183 * |
| STEFANO STELLA ET AL: "Expression, purification, crystallization and preliminary X-ray diffraction analysis of the novel modular DNA-binding protein BurrH in its apo form and in complex with its target DNA", ACTA CRYSTALLOGRAPHICA SECTION F STRUCTURAL BIOLOGY COMMUNICATIONS, vol. 70, no. 1, 1 January 2014 (2014-01-01), pages 87 - 91, XP055399983, DOI: 10.1107/S2053230X13033037 * |
| STELLA S.; MOLINA R.; LOPEZ-MENDEZ B. ET AL., BUD, A HELIX-LOOP-HELIX DNA-BINDING DOMAIN FOR GENOME MODIFICATION ACTA D, vol. 70, 2014, pages 2042 |
| THOMAS GAJ ET AL: "ZFN, TALEN, and CRISPR/Cas-based methods for genome engineering", TRENDS IN BIOTECHNOLOGY, 1 May 2013 (2013-05-01), XP055065263, ISSN: 0167-7799, DOI: 10.1016/j.tibtech.2013.04.004 * |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 17737218; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 17737218; Country of ref document: EP; Kind code of ref document: A1 |