HK1142096B

HK1142096B - Optimized non-canonical zinc finger proteins

Info

Publication number: HK1142096B
Application number: HK10108528.2A
Authority: HK
Inventors: Qihua C. Cai; Jeffrey Miller; Fyodor Urnov; Vipula K. Shukla; Joseph F. Petolino; Lisa W. Baker; Robbi J. Garrison; Ryan C. Blue; Jon C. Mitchell; Nicole L. Arnold; Sarah E. Worden
Original assignee: Dow AgroSciences LLC; Sangamo BioSciences, Inc.
Priority date: 2006-12-14
Filing date: 2007-12-13
Publication date: 2018-04-27

Description

Optimized non-canonical zinc finger proteins

Cross reference to related applications

The present application claims the benefit of U.S. Provisional Application No. 60/874,911, filed December 14, 2006, and U.S. Provisional Application No. 60/932,497, filed May 30, 2007, the disclosures of both of which are incorporated herein by reference in their entirety.

Technical Field

The present disclosure is in the fields of genome engineering, gene targeting, targeted chromosomal integration, protein expression and epigenome editing.

Background

Sequence-specific binding of proteins to DNA, RNA, proteins and other molecules is involved in many cellular processes such as, for example, transcription, replication, chromatin structure, recombination, DNA repair, RNA processing and translation. The binding specificity of cellular binding proteins that participate in protein-DNA, protein-RNA and protein-protein interactions contributes to development, differentiation and homeostasis.

Zinc finger proteins (ZFPs) are proteins that bind DNA in a sequence-specific manner. Zinc fingers were first identified in the transcription factor TFIIIA from the oocytes of Xenopus laevis (the African clawed toad). A single zinc finger domain of this class of ZFPs is about 30 amino acids long, and several structural studies have demonstrated that it contains a β-turn (containing two conserved cysteine residues) and an α-helix (containing two conserved histidine residues), which are held in a particular conformation through coordination of a zinc atom by the two cysteines and the two histidines. This class of ZFPs is also known as C2H2 ZFPs. Other classes of ZFPs have also been suggested. See, e.g., Jiang et al. (1996) J. Biol. Chem. 271:10723-10730, discussing Cys-Cys-His-Cys (C3H) ZFPs. To date, over 10,000 zinc finger sequences have been identified in several thousand known or putative transcription factors. Zinc finger domains are involved not only in DNA recognition, but also in RNA binding and in protein-protein binding. Current estimates are that this class of molecules accounts for about 2% of all human genes.

Most zinc finger proteins have conserved cysteine and histidine residues that tetrahedrally coordinate a single zinc atom in each finger domain. In particular, most ZFPs are characterized by finger components of the general sequence -Cys-(X)_2-4-Cys-(X)_12-His-(X)_3-5-His- (SEQ ID NO: 1), in which X represents any amino acid (the C2H2 ZFPs). The zinc-coordinating sequences of this most widely represented class contain two cysteines and two histidines with particular spacings. The folded structure of each finger contains an antiparallel β-turn, a finger tip region and a short amphipathic α-helix. The metal-coordinating ligands bind to the zinc ion and, in the case of zif268-type zinc fingers, the short amphipathic α-helix binds in the major groove of DNA. In addition, the structure of the zinc finger is stabilized by certain conserved hydrophobic amino acid residues (e.g., the residue immediately preceding the first conserved Cys and the residue at position +4 of the helical segment of the finger) and by zinc coordination through the conserved cysteine and histidine residues.
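
By way of illustration only, the general C2H2 spacing given above (two to four residues between the cysteines, twelve residues between the second cysteine and the first histidine, and three to five residues between the histidines) can be expressed as a simple pattern search. The following Python sketch is not part of the disclosed methods; the function name, the pattern name and the two-finger test sequence are hypothetical, and a spacing match says nothing about whether a sequence actually folds or binds zinc.

```python
import re

# General C2H2 spacing from the text: Cys-(X)2-4-Cys-(X)12-His-(X)3-5-His.
# Named groups capture the three spacer regions so their lengths can be reported.
C2H2_PATTERN = re.compile(r"C(?P<a>.{2,4})C(?P<b>.{12})H(?P<c>.{3,5})H")


def find_c2h2_fingers(protein):
    """Return the start index and spacer lengths of each C2H2-like match."""
    return [
        {
            "start": m.start(),
            "spacers": (len(m.group("a")), len(m.group("b")), len(m.group("c"))),
        }
        for m in C2H2_PATTERN.finditer(protein)
    ]


# Hypothetical sequences invented for this example (not taken from the disclosure).
FINGER_1 = "YKCPECGKSFSQKSNLQKHQRTHTG"
FINGER_2 = "YKCPECGKSFSRSDHLTRHQRTHTG"
toy_protein = FINGER_1 + "EKP" + FINGER_2

fingers = find_c2h2_fingers(toy_protein)
for finger in fingers:
    print(finger)
```

Because the search is purely positional, it is best read as a first-pass filter for candidate finger sequences rather than as a classifier.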

Canonical (C2H2) zinc finger proteins having alterations in the positions that make direct base contacts, in the "supporting" residues immediately adjacent to the base-contacting positions, and in positions capable of contacting the phosphate backbone of the DNA have been described. See, e.g., U.S. Patent Nos. 6,007,988; 6,013,453; 6,140,081; 6,866,997; 6,746,838; 6,610,512; 7,101,972; 6,453,242; 6,785,613; 7,013,219; PCT WO 98/53059; Choo et al. (2000) Curr. Opin. Struct. Biol. 10:411-416; Segal et al. (2000) Curr. Opin. Chem. Biol. 4:34-39.

In addition, zinc finger proteins comprising zinc fingers with modified zinc coordinating residues have also been described (see, e.g., U.S. patent application Nos. 20030108880; 20060246567; and 20060246588; the disclosures of which are incorporated by reference). However, while zinc finger proteins comprising these non-canonical zinc fingers retain gene transcription regulatory function, their ability to function as Zinc Finger Nucleases (ZFNs) is in some cases reduced relative to zinc finger proteins consisting solely of canonical C2H2 zinc fingers.

Thus, there remains a need (particularly in the construction of zinc finger nucleases) for additional engineered zinc finger binding proteins comprising zinc fingers with optimized non-canonical zinc coordination regions.

Summary of the Invention

The present disclosure provides zinc finger DNA binding domains with alterations in at least one zinc coordinating residue. Specifically, CCHC zinc fingers are described herein. These CCHC zinc fingers may further comprise additional alterations (substitutions, insertions and/or deletions) near the zinc coordinating residue (e.g., in residues around the most C-terminal (C-most) zinc coordinating residue of the zinc finger). Zinc finger polypeptides and fusion proteins comprising one or more of these CCHC zinc fingers, polynucleotides encoding these zinc fingers and fusion proteins, and methods of using these zinc finger polypeptides and/or fusion proteins are also described.

As such, the present disclosure encompasses, but is not limited to, the following numbered embodiments:

1. A zinc finger protein comprising a non-canonical (non-C2H2) zinc finger, wherein the non-canonical zinc finger has a helical portion involved in DNA binding and wherein the zinc coordination region of the helical portion comprises the amino acid sequence HX₁X₂RCX_L (SEQ ID NO: 2); and wherein the zinc finger protein is engineered to bind to a target sequence.

2. The zinc finger protein of embodiment 1, wherein X₁ is A and X₂ is Q.

3. The zinc finger protein of embodiment 1, wherein X₁ is K and X₂ is E.

4. The zinc finger protein of embodiment 1, wherein X₁ is T and X₂ is R.

5. The zinc finger protein of embodiment 1, wherein X_L is G.

6. A zinc finger protein comprising two or more zinc fingers, wherein at least one zinc finger comprises the sequence Cys-(X^A)_2-4-Cys-(X^B)_12-His-(X^C)_3-5-Cys-(X^D)_1-10 (SEQ ID NO: 3), wherein X^A, X^B, X^C and X^D can be any amino acid.

7. The zinc finger protein of any one of embodiments 1 to 6, comprising any one of the sequences shown in any one of Table 1, Table 2, Table 3 or Table 4.

8. The zinc finger protein of embodiment 6 or 7, wherein X^D comprises the sequence QLV or QKP.

9. The zinc finger protein of embodiment 8, wherein the sequence QLV or QKP constitutes the three C-terminal amino acid residues of the zinc finger.

10. The zinc finger protein of any one of embodiments 6 to 9, wherein X^D comprises 1, 2 or 3 Gly (G) residues.

11. A zinc finger protein comprising a plurality of zinc fingers, wherein at least one zinc finger comprises a CCHC zinc finger according to any one of embodiments 1 to 10.

12. The zinc finger protein of embodiment 11, wherein the zinc finger protein comprises 3, 4, 5 or 6 zinc fingers.

13. The zinc finger protein of embodiment 11 or 12, wherein finger 2 comprises the CCHC zinc finger.

14. The zinc finger protein of any one of embodiments 11 to 13, wherein the C-terminal zinc finger comprises the CCHC finger.

15. The zinc finger protein of any one of embodiments 11 to 14, wherein at least two zinc fingers comprise the CCHC zinc finger.

16. The zinc finger protein of any one of embodiments 11 to 15, wherein the zinc finger protein comprises any one of the sequences shown in Table 8 and is engineered to bind to a target sequence in the IPP2-K gene.

17. A fusion protein comprising a zinc finger protein of any one of embodiments 1 to 16 and one or more functional domains.

18. A fusion protein comprising:

(a) a cleavage half-domain,

(b) the zinc finger protein of any one of embodiments 1 to 16, and

(c) a ZC linker inserted between the cleavage half-domain and the zinc finger protein.

19. The fusion protein of embodiment 18, wherein said ZC linker is 5 amino acids in length.

20. The fusion protein of embodiment 19, wherein the amino acid sequence of said ZC linker is GLRGS (SEQ ID NO: 4).

21. The fusion protein of embodiment 18, wherein said ZC linker is 6 amino acids in length.

22. The fusion protein of embodiment 21, wherein the amino acid sequence of the ZC linker is GGLRGS (SEQ ID NO: 5).

23. A polynucleotide encoding a zinc finger protein according to any one of embodiments 1 to 16 or a fusion protein according to any one of embodiments 17 to 22.

24. A method for targeted cleavage of cellular chromatin in a plant cell, the method comprising expressing in the cell a pair of fusion proteins according to any one of embodiments 18 to 22, wherein:

(a) the target sequences of the fusion proteins are within 10 nucleotides of each other; and

(b) the fusion proteins dimerize and cleave DNA located between the target sequences.

25. A method of targeted genetic recombination in a host plant cell, the method comprising:

(a) expressing in the host cell a pair of fusion proteins according to any one of embodiments 18 to 22, wherein the target sequences of the fusion proteins are present in a selected host target locus; and

(b) identifying a recombinant host cell that exhibits a sequence change in the host target locus.

26. The method of embodiment 24 or 25, wherein said sequence alteration is a mutation selected from the group consisting of: deletion of genetic material, insertion of genetic material, replacement of genetic material, and any combination thereof.

27. The method of any one of embodiments 24 to 26, further comprising introducing an exogenous polynucleotide into said host cell.

28. The method of embodiment 27, wherein said exogenous polynucleotide comprises a sequence homologous to said host target locus.

29. The method of any one of embodiments 24 to 28, wherein said plant is selected from the group consisting of: monocotyledons, dicotyledons, gymnosperms and eukaryotic algae.

30. The method of embodiment 29, wherein said plant is selected from the group consisting of: corn, rice, wheat, potato, soybean, tomato, tobacco, members of the Brassica family, and Arabidopsis.

31. The method of any one of embodiments 24 to 29, wherein said plant is a tree.

32. The method of any one of embodiments 24 to 31, wherein said target sequence is in the IPP2K gene.

33. A method for reducing phytic acid levels in seeds comprising inactivating or altering the IPP2-K gene according to embodiment 32.

34. A method for making phosphorus more metabolically available in a seed, the method comprising inactivating or altering the IPP2-K gene according to embodiment 32.

35. A plant cell comprising a zinc finger protein according to any one of embodiments 1 to 16, a fusion protein according to any one of embodiments 17 to 22, or a polynucleotide according to embodiment 23.

36. The plant cell of embodiment 35, wherein said cell is a seed.

37. The plant cell of embodiment 36, wherein said seed is a maize seed.

38. The plant cell of any one of embodiments 35 to 37, wherein IPP2-K is partially or completely inactivated.

39. The plant cell of embodiment 38, wherein the phytic acid level in said seed is reduced.

40. The plant cell of any one of embodiments 35 to 39, wherein the metabolically utilizable level of phosphorus in said cell is increased.

Brief Description of Drawings

FIG. 1 is a graph depicting gene correction rates, measured as the percentage of cells expressing GFP, in the GFP cell reporter assay system described in U.S. Patent Application Publication No. 2005/0064474 and below. The ZFN variants are designated "X-Y," where "X" refers to a table number and "Y" refers to the number given to the zinc finger in that table. For example, "2-21" refers to a ZFN having a finger that comprises the sequence shown in row 21 of Table 2, i.e., HAQRCGLRGSQLV (SEQ ID NO: 53).
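
As a reading aid only, the "X-Y" designation described above can be decoded mechanically, and the quoted example sequence can be checked against the H-X₁-X₂-R-C core of SEQ ID NO: 2. The Python sketch below is an assumption-laden illustration: the helper names are hypothetical, and the lookup dictionary holds only the single entry quoted in the FIG. 1 description, not the actual contents of Table 2.

```python
import re

# Only the one entry quoted in the FIG. 1 description is included here;
# the full tables are not reproduced in this sketch.
QUOTED_ENTRIES = {(2, 21): "HAQRCGLRGSQLV"}


def parse_variant_label(label):
    """Split an 'X-Y' ZFN variant label into (table number, finger number)."""
    table, finger = label.split("-")
    return int(table), int(finger)


def core_residues(sequence):
    """Return (X1, X2) if the sequence begins with H-X1-X2-R-C, else None."""
    match = re.match(r"H(.)(.)RC", sequence)
    return (match.group(1), match.group(2)) if match else None


key = parse_variant_label("2-21")
sequence = QUOTED_ENTRIES[key]
print(key, sequence, core_residues(sequence))
```

For the quoted sequence, X₁ is A and X₂ is Q, which corresponds to the combination recited in embodiment 2.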

FIG. 2 is a graph depicting the percentage of Cel-1 signal resulting from cleavage by various pairs of ZFN variants. The results of two experiments are shown for each ZFN pair, identified by sample number. The variant pairs used for each sample are shown in the box at the upper right, where "wt 5-8" and "wt 5-9" refer to the canonical ZFN pair disclosed in Example 14 (Table 17) of U.S. Patent Application Publication No. 2005/0064474. In samples 3-12, the C-terminal region of the recognition helix of finger 2 or finger 4 of canonical ZFN 5-8 or 5-9 was replaced with a non-canonical sequence. The partial sequences of the non-canonical ZFN variants, designated 20, 21, 43, 45, 47 and 48, and the finger positions of these variants within the four-finger ZFNs used in samples 3-12 are shown at the upper left above the graph. Asterisks above the bars depicting the experiment 2 results for samples 8 and 9 indicate background in the lane, which results in an underestimation of ZFN efficacy.

FIG. 3 is a graph depicting gene correction rates in the GFP cell reporter assay system described in U.S. Patent Application Publication No. 2005/0064474 and herein. The ZFN pairs tested in each sample are shown below each bar, where zinc finger numbers 20, 21, 43, 45, 47 and 48 are those described in Example 3, and CCHC zinc fingers 1a to 10a comprise the sequences shown in Tables 3 and 4. Zinc fingers 20, 21, 7a, 8a, 9a and 10a were used in finger 4; zinc fingers 43, 45, 47, 48, 1a, 2a, 3a, 4a, 5a and 6a were used in finger 2.

FIG. 4 is a linear representation of plasmid pDAB1585, a target vector for tobacco.

FIG. 5 is a schematic representation of plasmid pDAB1585, a target vector for tobacco.

FIGS. 6A and 6B depict zinc finger nucleases (ZFNs). FIG. 6A is a schematic depicting ZFN binding. FIG. 6B shows the sequence of the target sequence.

FIG. 7 is a schematic representation of plasmid pDAB1400.

FIG. 8 is a schematic representation of plasmid pDAB782.

FIG. 9 is a schematic representation of plasmid pDAB1582.

FIG. 10 is a schematic representation of plasmid pDAB354.

FIG. 11 is a schematic representation of plasmid pDAB1583.

FIG. 12 is a schematic representation of plasmid pDAB2407.

FIG. 13 is a schematic representation of plasmid pDAB1584.

FIG. 14 is a schematic representation of plasmid pDAB2418.

FIG. 15 is a schematic representation of plasmid pDAB4045.

FIG. 16 is a schematic representation of plasmid pDAB1575.

FIG. 17 is a schematic representation of plasmid pDAB1577.

FIG. 18 is a schematic representation of plasmid pDAB1579.

FIG. 19 is a schematic representation of plasmid pDAB1580.

FIG. 20 is a schematic representation of plasmid pDAB3401.

FIG. 21 is a schematic representation of plasmid pDAB1570.

FIG. 22 is a schematic representation of plasmid pDAB1572.

FIG. 23 is a schematic representation of plasmid pDAB4003.

FIG. 24 is a schematic representation of plasmid pDAB1571.

FIG. 25 is a schematic representation of plasmid pDAB7204.

FIG. 26 is a schematic representation of plasmid pDAB1573.

FIG. 27 is a schematic representation of plasmid pDAB1574.

FIG. 28 is a schematic representation of plasmid pDAB1581.

FIG. 29 is a schematic representation of plasmid pDAB1576.

FIG. 30 is a schematic representation of plasmid pDAB1600.

FIG. 31 is a schematic representation of plasmid pDAB3731.

FIG. 32 is a schematic representation of plasmid pDAB4322.

FIG. 33 is a schematic representation of plasmid pDAB4331.

FIG. 34 is a schematic representation of plasmid pDAB4332.

FIG. 35 is a schematic representation of plasmid pDAB4333.

FIG. 36 is a schematic representation of plasmid pDAB4334.

FIG. 37 is a schematic representation of plasmid pDAB4336.

FIG. 38 is a schematic representation of plasmid pDAB4339.

FIG. 39 is a schematic representation of plasmid pDAB4321.

FIG. 40 is a schematic representation of plasmid pDAB4323.

FIG. 41 is a schematic representation of plasmid pDAB4341.

FIG. 42 is a schematic representation of plasmid pDAB4342.

FIG. 43 is a schematic representation of plasmid pDAB4343.

FIG. 44 is a schematic representation of plasmid pDAB4344.

FIG. 45 is a schematic representation of plasmid pDAB4346.

FIG. 46 is a schematic representation of plasmid pDAB4330.

FIG. 47 is a schematic representation of plasmid pDAB4351.

FIG. 48 is a schematic representation of plasmid pDAB4356.

FIG. 49 is a schematic representation of plasmid pDAB4359.

FIG. 50 is a schematic representation of plasmid pDAB7002.

FIG. 51 is a schematic representation of plasmid pDAB7025.

FIG. 52 is a schematic representation of plasmid pDAB1591.

FIG. 53 is a schematic representation of plasmid pcDNA3.1-SCD27a-L0-Fok1 (DNA template for PCR amplification of Scd27 ZFN).

FIG. 54 is a schematic representation of plasmid pDAB1594.

FIG. 55 is a schematic representation of plasmid pDAB1598.

FIG. 56 is a schematic representation of plasmid pDAB1577.

FIG. 57 is a schematic representation of plasmid pDAB1578.

FIG. 58 is a schematic representation of plasmid pDAB1601 (PAT gene control vector).

FIG. 59 is a schematic depicting predicted in vivo homologous recombination stimulated by IL-1-Fok1 fusion protein.

FIG. 60 is a schematic representation of plasmid pDAB1590 (positive GFP expression control).

FIG. 61 is a schematic depicting predicted interchromosomal homologous recombination stimulated by IL-1 zinc finger-Fok1 fusion proteins.

FIG. 62 is a schematic depicting predicted interchromosomal homologous recombination stimulated by Scd27 zinc finger-Fok1 fusion proteins.

FIG. 63 is a gel depicting PCR analysis of recombinants. The first four lanes on the left are labeled above the gel. The lanes labeled 1-5 show HR events from transformation of BY2-380 with the C3H IL-1-Fok1 fusion protein gene, and the lanes labeled 6-7 show HR events from transformation of BY2-380 with the C3H SCD27-FokI fusion protein gene.

FIG. 64 shows the maize IPP2K gene sequence (SEQ ID NO: 6), which was derived from HiII cell culture and which served as the design template for engineering ZFNs targeting maize IPP2K.

FIG. 65 (panels A to E) depicts the ZFN expression vector cloning scheme. A stepwise cloning strategy was used to generate the ZFN expression constructs. Each ZFN-encoding gene was cloned into the vectors pVAX-N2A-NLSop2-EGFP-FokMono (A) and pVAX-C2A-NLSop2-EGFP-FokMono (B) to create a dual-protein cassette (C). This cassette was ligated into pDAB3872 (D) to generate the final plasmid (E) for expression of the ZFN heterodimer.

FIG. 66 depicts ZFN binding in the maize IPP2K gene. Two ZFN proteins are required to carry out double-stranded cleavage of DNA. The sequence (SEQ ID NO: 7) surrounding the cleavage site (indicated by the downward arrow) is shown. One protein (8705) binds to the sequence CTGTGGGGCCAT (top strand) (SEQ ID NO: 8), while the other protein (8684, 8685 or 8686) binds to the downstream sequence CTTGACCAACTCAGCCAG (bottom strand) (SEQ ID NO: 9).
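
Because the two binding sites in FIG. 66 are written on opposite strands, it can help to view both on the top strand. The Python sketch below is offered only as an illustration under stated assumptions: it reverse-complements the bottom-strand site and measures the gap between the two sites on a top-strand string. The six-nucleotide spacer and the flanking bases in the toy locus are invented for this example and are not taken from SEQ ID NO: 7; the function names are likewise hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def site_gap(top_strand, left_site_top, right_site_bottom):
    """Nucleotides between the two sites on the top strand, or None if absent.

    Assumes the top-strand site lies upstream of the bottom-strand site.
    """
    left = top_strand.find(left_site_top)
    right = top_strand.find(reverse_complement(right_site_bottom))
    if left < 0 or right < 0:
        return None
    return right - (left + len(left_site_top))


LEFT_SITE = "CTGTGGGGCCAT"                 # top strand, SEQ ID NO: 8
RIGHT_SITE_BOTTOM = "CTTGACCAACTCAGCCAG"   # bottom strand, SEQ ID NO: 9

# Hypothetical locus: the "ACGTAC" spacer and the flanks are made up.
toy_locus = "AA" + LEFT_SITE + "ACGTAC" + reverse_complement(RIGHT_SITE_BOTTOM) + "TT"

print(reverse_complement(RIGHT_SITE_BOTTOM))
print(site_gap(toy_locus, LEFT_SITE, RIGHT_SITE_BOTTOM))
```

A gap computed this way could be compared with a spacing requirement such as the one recited in embodiment 24, although the real spacing must be read from the actual genomic sequence.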

FIG. 67 depicts the sequences of wild type (top sequence, SEQ ID NO: 10) and ZFN clone 127 (bottom sequence, SEQ ID NO: 11). The cleavage target for this ZFN is highlighted in a grey box.

FIG. 68 shows an alignment of multiple deletions in the maize IPP2K gene, detected by 454 sequencing, that resulted from non-homologous end joining (NHEJ) of ZFN-mediated dsDNA breaks. The cleavage target for this ZFN is highlighted in a grey box.

FIG. 69 is a graph depicting gene correction rates in the GFP cell reporter assay system described in U.S. Patent Application Publication No. 2005/0064474 and herein. The ZFN pairs tested in each sample are shown below each bar.

FIG. 70 depicts plasmid pDAB7471, constructed as described in Example 18B.

FIG. 71 depicts plasmid pDAB7451, constructed as described in Example 18C.

FIG. 72 is a schematic depicting an exemplary autonomous herbicide tolerance gene expression cassette. This construct comprises a complete promoter-transcription unit (PTU) comprising a promoter, an herbicide tolerance gene and a polyadenylation (polyA) termination sequence, as described in Example 18D.

FIG. 73 depicts plasmid pDAB7422, constructed as described in Example 18E. This plasmid contains a complete promoter-transcription unit (PTU), comprising a promoter, an herbicide tolerance gene and a polyadenylation (polyA) termination sequence, inserted into the position-1 plasmid backbone.

FIG. 74 depicts plasmid pDAB7452, constructed as described in Example 18E. This plasmid contains a complete promoter-transcription unit (PTU), comprising a promoter, an herbicide tolerance gene and a polyadenylation (polyA) termination sequence, inserted into the position-2 plasmid backbone.

FIG. 75 is a schematic depicting an exemplary non-autonomous herbicide tolerance gene expression cassette. This construct comprises an incomplete promoter-transcription unit (PTU) comprising an herbicide tolerance gene and a polyadenylation (polyA) termination sequence, as described in Example 18F.

FIG. 76 depicts plasmid pDAB7423, constructed as described in Example 18G. This plasmid contains an incomplete promoter-transcription unit (PTU), comprising an herbicide tolerance gene and a polyadenylation (polyA) termination sequence, inserted into the position-1 plasmid backbone.

FIG. 77 depicts plasmid pDAB7454, constructed as described in Example 18G. This plasmid contains an incomplete promoter-transcription unit (PTU), comprising an herbicide tolerance gene and a polyadenylation (polyA) termination sequence, inserted into the position-2 plasmid backbone.

FIG. 78 depicts plasmid pDAB7424 (an exemplary adapted position-1 autonomous donor), constructed as described in Example 18H.

FIG. 79 depicts plasmid pDAB7425 (an exemplary adapted position-1 autonomous donor), constructed as described in Example 18H.

FIG. 80 depicts plasmid pDAB7426, constructed as described in Example 18H. pDAB7426 is a combination plasmid comprising a position-1 autonomous donor and a ZFN expression cassette.

FIG. 81 depicts plasmid pDAB7427, constructed as described in Example 18H. pDAB7427 is a combination plasmid comprising a position-1 autonomous donor and a ZFN expression cassette.

FIG. 82 depicts amplification of donor-DNA-specific sequences from genomic DNA. The presence of the 317 bp product indicates the presence of donor DNA comprising the PAT gene inserted into the genome of maize callus lines #61-72, as described in Example 20C. HiII indicates the wild-type negative control.

FIG. 83 depicts amplification of the 5' boundary between donor DNA and IPP2K-specific maize genomic sequence. Secondary PCR products derived from targeted integration of the donor into the IPP2K gene were identified by the presence of a 1.65 kbp DNA fragment, as described in Example 21A. HiII indicates the wild-type negative control.

FIG. 84 depicts amplification of the 3' boundary between donor DNA and IPP2K-specific maize genomic sequence. Secondary PCR products derived from targeted integration of the donor into the IPP2K gene were identified by the presence of a 1.99 kbp DNA fragment, as described in Example 21A. HiII indicates the wild-type negative control.

FIG. 85 depicts amplification of the upstream (5') boundary between genome and donor. PCR products derived from targeted integration of the donor into the IPP2K gene (5' boundary) were identified by the presence of a 1.35 kbp DNA fragment, as described in Example 21B. HiII indicates the wild-type negative control.

FIG. 86 depicts amplification of the downstream (3') boundary between donor and genome. PCR products derived from targeted integration of the donor into the IPP2K gene (3' boundary) were identified by the presence of a 1.66 kbp DNA fragment, as described in Example 21B. HiII indicates the wild-type negative control.

FIG. 87 depicts the sequence of the position-1 5' homology flank (SEQ ID NO: 171).

FIG. 88 depicts the sequence of the position-1 3' homology flank (SEQ ID NO: 172).

FIG. 89 depicts the sequence of the position-2 5' homology flank (SEQ ID NO: 139).

FIG. 90 depicts the sequence of the position-2 3' homology flank (SEQ ID NO: 140).

FIG. 91 depicts the upstream (5') IPP2K genomic sequence of the ZFN-targeted region (SEQ ID NO: 141).

FIG. 92 depicts the downstream (3') IPP2K genomic sequence of the ZFN-targeted region (SEQ ID NO: 142).

Detailed Description

Disclosed herein are compositions comprising zinc finger binding polypeptides (ZFPs) containing non-canonical zinc fingers of the form Cys-Cys-His-Cys. Since zinc coordination provides the principal folding energy for zinc fingers, adjustment of the zinc-coordinating residues provides a ready means for modifying finger stability and structure, which in turn affects a variety of important functional features of zinc finger proteins, including, for example, cellular half-life, interactions with other cellular factors, DNA binding specificity and affinity, and relative orientation of the domains.

Zinc finger proteins comprising non-canonical zinc fingers, such as those disclosed in U.S. patent application Nos. 20030108880; 20060246567; and 20060246588, have been shown to bind DNA and alter transcription. However, these previously described non-canonical zinc finger proteins sometimes exhibit sub-optimal activity in cleaving target DNA after being incorporated into a zinc finger nuclease (ZFN, see, e.g., U.S. patent application publication No. 2005/0064474).

Described herein are zinc finger proteins comprising one or more CCHC zinc fingers, in which specific sequences surrounding the C-terminal pair of zinc-coordinating residues have been altered. Also described herein are fusion proteins, e.g., zinc finger nucleases (ZFNs), comprising these optimized non-canonical zinc fingers, wherein the ZFNs cleave target DNA at rates comparable to those achieved using ZFNs comprising canonical (CCHH) zinc fingers.

The fusion polypeptides disclosed herein are capable of enhancing or inhibiting transcription of a gene and/or cleavage of a target sequence. Also provided are polynucleotides encoding optimized non-canonical zinc fingers, and polynucleotides encoding fusion proteins comprising one or more optimized non-canonical zinc fingers. Also provided are pharmaceutical compositions comprising a therapeutically effective amount of any of the zinc finger-nucleotide binding polypeptides or functional fragments thereof described herein or a therapeutically effective amount of a nucleotide sequence encoding any of the improved zinc finger-nucleotide binding polypeptides or functional fragments thereof, in combination with a pharmaceutically acceptable carrier. Also provided are agricultural compositions comprising an agronomically effective amount of any of the zinc finger-nucleotide binding polypeptides or functional fragments thereof described herein or an agronomically effective amount of a nucleotide sequence encoding any of the improved zinc finger-nucleotide binding polypeptides or functional fragments thereof, in combination with an agronomically acceptable carrier. Also provided are screening methods for obtaining improved zinc finger-nucleotide binding polypeptides capable of binding to a genomic sequence.

Genomic sequences include those present in chromosomes, episomes, organellar genomes (e.g., mitochondria, chloroplasts), artificial chromosomes and any other type of nucleic acid present in a cell such as, for example, amplified sequences, double minute chromosomes and the genomes of endogenous or infecting bacteria and viruses. Genomic sequences can be normal (i.e., wild-type) or mutant; mutant sequences can comprise, for example, insertions, deletions, substitutions, translocations, rearrangements and/or point mutations. A genomic sequence can also comprise one of a number of different alleles.

General technique

Unless otherwise indicated, the preparation and use of the compositions and methods disclosed herein can be practiced using conventional techniques in molecular biology, biochemistry, chromatin structure and analysis, computational chemistry, cell culture, recombinant DNA and related fields, which are within the skill of the art. These techniques are fully explained in the literature. See, e.g., Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, Second edition, Cold Spring Harbor Laboratory Press, 1989 and Third edition, 2001; Ausubel et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, New York, 1987 and periodic updates; the series METHODS IN ENZYMOLOGY, Academic Press, San Diego; Wolffe, CHROMATIN STRUCTURE AND FUNCTION, Third edition, Academic Press, San Diego, 1998; METHODS IN ENZYMOLOGY, Vol. 304, "Chromatin" (P. M. Wassarman and A. P. Wolffe, eds.), Academic Press, San Diego, 1999; and METHODS IN MOLECULAR BIOLOGY, Vol. 119, "Chromatin Protocols" (P. B. Becker, ed.), Humana Press, Totowa, 1999.

Definitions

The terms "nucleic acid", "polynucleotide" and "oligonucleotide" are used interchangeably and refer to a deoxyribonucleotide or ribonucleotide polymer, in linear or circular conformation, and in either single- or double-stranded form. For the purposes of the present disclosure, these terms are not to be construed as limiting with respect to the length of a polymer. The terms can encompass known analogs of natural nucleotides, as well as nucleotides that are modified in the base, sugar and/or phosphate moieties (e.g., phosphorothioate backbones). In general, an analog of a particular nucleotide has the same base-pairing specificity; i.e., an analog of A will base-pair with T.

The terms "polypeptide", "peptide" and "protein" are used interchangeably to refer to a polymer of amino acid residues. The term also applies to amino acid polymers in which one or more amino acids are chemical analogs or modified derivatives of the corresponding naturally occurring amino acid.

"Binding" refers to a sequence-specific, non-covalent interaction between macromolecules (e.g., between a protein and a nucleic acid). Not all components of a binding interaction need be sequence-specific (e.g., contacts with phosphate residues in a DNA backbone), as long as the interaction as a whole is sequence-specific. Such interactions are generally characterized by a dissociation constant (K_d) of 10^-6 M^-1 or lower. "Affinity" refers to the strength of binding: increased binding affinity correlates with a lower K_d.
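
The inverse relationship between affinity and K_d noted above can be made concrete with a small calculation. The Python sketch below assumes a simple one-site equilibrium binding model, in which the fraction of sites occupied is [L] / ([L] + K_d), and it treats K_d as a concentration in molar units; both the model and the example concentrations are assumptions of this illustration and are not taken from the disclosure.

```python
def fraction_bound(ligand_conc_molar, kd_molar):
    """Fractional occupancy for a simple one-site equilibrium binding model."""
    return ligand_conc_molar / (ligand_conc_molar + kd_molar)


# Hypothetical free-protein concentration of 1 micromolar.
conc = 1e-6
for kd in (1e-6, 1e-8, 1e-10):
    print(f"K_d = {kd:.0e} M -> fraction bound = {fraction_bound(conc, kd):.4f}")
```

Under this model, half of the sites are occupied when the concentration equals K_d, and occupancy at a fixed concentration rises as K_d falls.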

A "binding protein" is a protein that is able to bind non-covalently to another molecule. A binding protein can bind to, for example, a DNA molecule (a DNA-binding protein), an RNA molecule (an RNA-binding protein) and/or a protein molecule (a protein-binding protein). In the case of a protein-binding protein, it can bind to itself (to form homodimers, homotrimers, etc.) and/or it can bind to one or more molecules of a different protein or proteins. A binding protein can have more than one type of binding activity. For example, zinc finger proteins have DNA-binding, RNA-binding and protein-binding activity.

A "zinc finger DNA binding protein" (or binding domain) refers to a protein or a domain within a larger protein that binds DNA in a sequence-specific manner via one or more zinc fingers, which are regions of amino acid sequence within the binding domain and whose structure is stabilized by coordination of zinc ions. The term zinc finger DNA binding protein is often abbreviated as zinc finger protein or ZFP.

A zinc finger binding domain may be "engineered" to bind to a predetermined nucleotide sequence. Design and selection are non-limiting examples of methods for engineering zinc finger proteins. A designed zinc finger protein is a protein that does not occur in nature and whose design/composition results principally from rational criteria. Rational criteria for design include the application of substitution rules and computerized algorithms for processing information in a database storing information on existing ZFP designs and binding data. See, e.g., U.S. Patents 6,140,081; 6,453,242; 6,534,261; and 6,785,613; see also WO 98/53058; WO 98/53059; WO 98/53060; WO 02/016536 and WO 03/016496; and U.S. Patents 6,746,838; 6,866,997; and 7,030,215.

"selected" zinc finger proteins refer to proteins not found in nature, the production of which results primarily from empirical methods such as phage display, interaction traps, or hybrid selection. See, e.g., US5,789,538; US5,925,523; US6,007,988; US6,013,453; US6,200,759; US6,733,970; US RE39,229; and WO 95/19431; WO 96/06166; WO 98/53057; WO 98/54311; WO 00/27878; WO 01/60970; WO 01/88197 and WO 02/099084.

A "non-canonical" zinc finger protein refers to a protein that comprises a non-canonical (non-C2H 2) zinc finger. As such, the non-canonical zinc finger comprises a substitution, addition, and/or deletion of at least one amino acid as compared to a naturally occurring C2H2 zinc finger protein. Non-limiting examples of non-canonical zinc fingers include those that contain a Cys-Cys-His-Cys (e.g., C3H) zinc coordinating residue (from amino to carboxyl).

"homologous sequence" refers to a first sequence that shares some degree of sequence identity with a second sequence, and whose sequence may be identical to that of the second sequence. "homologous but non-identical sequence" refers to a first sequence that shares some degree of sequence identity with a second sequence, but whose sequence is not identical to that of the second sequence. For example, a polynucleotide comprising the wild-type sequence of a mutant gene is homologous but not identical to the sequence of the mutant gene. In certain embodiments, the degree of homology between the two sequences is sufficient to allow homologous recombination between them using normal cellular mechanisms. Two homologous but non-identical sequences may be of any length, and their degree of non-homology may be as little as a single nucleotide (e.g., for correcting genomic point mutations by targeted homologous recombination) or as much as 10 kilobases or more (e.g., for inserting genes at predetermined sites in the chromosome). Two polynucleotides comprising homologous but non-identical sequences need not be of the same length. For example, exogenous polynucleotides (i.e., donor polynucleotides) of between 20 and 10,000 nucleotides or nucleotide pairs may be used.

Techniques for determining nucleic acid and amino acid sequence identity are known in the art. Typically, such techniques involve determining the nucleotide sequence of the mRNA for a gene and/or determining the amino acid sequence encoded thereby, and comparing these sequences to a second nucleotide or amino acid sequence. Genomic sequences may also be determined and compared in this manner. In general, identity refers to an exact nucleotide-to-nucleotide correspondence between two polynucleotide sequences or an exact amino acid-to-amino acid correspondence between two polypeptide sequences. Two or more sequences (polynucleotide or amino acid) can be compared by determining their percent identity. The percent identity of two sequences (whether nucleic acid or amino acid sequences) is the number of exact matches between the two aligned sequences divided by the length of the shorter sequence and multiplied by 100. An approximate alignment for nucleic acid sequences is provided by the local homology algorithm of Smith and Waterman, Advances in Applied Mathematics 2:482-489 (1981). This algorithm can be applied to amino acid sequences by using the scoring matrix developed by Dayhoff, Atlas of Protein Sequences and Structure, M.O. Dayhoff ed., 5 suppl. 3:353-358, National Biomedical Research Foundation, Washington, D.C., USA, and normalized by Gribskov, Nucl. Acids Res. 14(6):6745-6763 (1986). An exemplary implementation of this algorithm to determine percent sequence identity is provided by the Genetics Computer Group (Madison, WI) in the "BestFit" utility application. Default parameters for this method are described in the Wisconsin Sequence Analysis Package Program Manual, Version 8 (1995) (available from Genetics Computer Group, Madison, WI).
An exemplary method of establishing percent identity in the context of the present disclosure is to use the MPSRCH package of programs copyrighted by the University of Edinburgh, developed by John F. Collins and Shane S. Sturrok, and sold by IntelliGenetics, Inc. From this package, the Smith-Waterman algorithm can be employed, where default parameters are used for the scoring table (e.g., gap open penalty of 12, gap extension penalty of 1, and a gap of 6). From the data generated, the "Match" value reflects sequence identity. Other suitable programs for calculating the percent identity or similarity between sequences are generally known in the art; for example, another alignment program is BLAST, used with default parameters. For example, BLASTN and BLASTP can be used with the following default parameters: genetic code = standard; filter = none; strand = both; cutoff = 60; expect = 10; Matrix = BLOSUM62; Descriptions = 50 sequences; sort by = HIGH SCORE; Databases = non-redundant, GenBank + EMBL + DDBJ + PDB + GenBank CDS translations + Swiss protein + Spupdate + PIR. Details of these programs can be found on the internet. With respect to the sequences described herein, the range of desired degrees of sequence identity is approximately 35% to 100% and any integer value therebetween. Typically, the percent identity between sequences is at least 35%-40%; 40%-45%; 45%-50%; 50%-60%; 60%-70%; 70%-75%, preferably 80%-82%, more preferably 85%-90%, even more preferably 92%, still more preferably 95%, and most preferably 98% sequence identity.
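
The percent identity calculation defined above (exact matches between two aligned sequences, divided by the length of the shorter sequence, multiplied by 100) can be sketched as follows. This is a minimal illustration that assumes the two sequences have already been aligned by some other program and that gaps are written as "-"; the function name and the example sequences are hypothetical, and the sketch does not reproduce the BestFit, MPSRCH or BLAST implementations.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences.

    identity = 100 * (exact matches) / (ungapped length of the shorter sequence)
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    a, b = aligned_a.upper(), aligned_b.upper()
    # Count aligned columns where both residues are identical and not gaps.
    matches = sum(1 for x, y in zip(a, b) if x == y and x != gap)
    # Lengths of the original (ungapped) sequences.
    shorter = min(len(a.replace(gap, "")), len(b.replace(gap, "")))
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * matches / shorter


print(percent_identity("GATTACA", "GAT-ACA"))            # 6 matches / 6 residues -> 100.0
print(round(percent_identity("GATTACA", "GACTACA"), 1))  # 6 matches / 7 residues -> 85.7
```

Because the denominator is the shorter sequence, a short sequence that aligns perfectly within a longer one scores 100%, which is consistent with the definition given above.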

Alternatively, the degree of sequence similarity between polynucleotides can be determined by hybridizing the polynucleotides under conditions that allow formation of stable duplexes between homologous regions, followed by digestion with a single-strand-specific nuclease and size determination of the digested fragments. Two nucleic acid sequences or two polypeptide sequences are substantially homologous to each other when they exhibit at least about 70%-75%, preferably 80%-82%, more preferably 85%-90%, even more preferably 92%, still more preferably 95%, and most preferably 98% sequence identity over a defined length of the molecules, as determined using the methods described above. As used herein, substantially homologous also refers to sequences showing complete identity to a specified DNA or polypeptide sequence. Substantially homologous DNA sequences can be identified in a Southern hybridization experiment under, for example, stringent conditions, as defined for that particular system. Determination of suitable hybridization conditions is within the skill of the art. See, e.g., Sambrook et al., supra; Nucleic Acid Hybridization: A Practical Approach, editors B.D. Hames and S.J. Higgins, (1985) Oxford; Washington, DC; IRL Press.

Selective hybridization of two nucleic acid fragments can be determined as follows. The degree of sequence identity between two nucleic acid molecules affects the efficiency and strength of hybridization events between such molecules. A partially identical nucleic acid sequence will at least partially inhibit the hybridization of a completely identical sequence to a target molecule. Inhibition of hybridization of the completely identical sequence can be assessed using hybridization assays that are well known in the art (e.g., Southern (DNA) blot, Northern (RNA) blot, solution hybridization, or the like; see Sambrook et al., Molecular Cloning: A Laboratory Manual, second edition, (1989) Cold Spring Harbor, N.Y.). Such assays can be conducted using varying degrees of selectivity (e.g., using conditions varying from low to high stringency). If conditions of low stringency are employed, the absence of non-specific binding can be assessed using a second probe that lacks even a partial degree of sequence identity (e.g., a probe having less than about 30% sequence identity with the target molecule), such that, in the absence of non-specific binding events, the second probe will not hybridize to the target.

When utilizing a hybridization-based detection system, a nucleic acid probe is chosen that is complementary to a reference nucleic acid sequence, and then, by selection of appropriate conditions, the probe and the reference sequence selectively hybridize, or bind, to each other to form a duplex molecule. A nucleic acid molecule that is capable of hybridizing selectively to a reference sequence under moderately stringent hybridization conditions typically hybridizes under conditions that allow detection of a target nucleic acid sequence of at least about 10-14 nucleotides in length having at least approximately 70% sequence identity with the sequence of the selected nucleic acid probe. Stringent hybridization conditions typically allow detection of target nucleic acid sequences of at least about 10-14 nucleotides in length having a sequence identity of greater than about 90-95% with the sequence of the selected nucleic acid probe. Hybridization conditions useful for probe/reference sequence hybridization, where the probe and reference sequence have a specific degree of sequence identity, can be determined as is known in the art (see, e.g., Nucleic Acid Hybridization: A Practical Approach, editors B.D. Hames and S.J. Higgins, (1985) Oxford; Washington, DC; IRL Press).

Hybridization conditions are well known to those skilled in the art. Hybridization stringency refers to the degree to which hybridization conditions are unfavorable for the formation of hybrids containing mismatched nucleotides, with higher stringency being associated with lower tolerance to mismatched hybrids. Factors that affect hybridization stringency are well known to those skilled in the art and include, but are not limited to, temperature, pH, ionic strength, and organic solvent (such as, for example, formamide and dimethyl sulfoxide) concentration. As is known to those skilled in the art, hybridization stringency is increased by increasing temperature, decreasing ionic strength, and decreasing solvent concentration.

With respect to stringency conditions for hybridization, it is well known in the art that many equivalent conditions may be employed to establish a particular stringency by varying, for example, the following factors: the length and nature of the sequence, the base composition of the various sequences, the concentration of salts and other hybridization solution components, the presence or absence of blocking agents (e.g., dextran sulfate and polyethylene glycol) in the hybridization solution, the temperature and time parameters of the hybridization reaction; and changing the washing conditions. Selection of a particular set of hybridization conditions is performed following standard methods in the art (see, e.g., Sambrook et al, Molecular Cloning: A Laboratory Manual, second edition, (1989) Cold Spring Harbor, N.Y.).

"recombination" refers to the process of exchanging genetic information between two polynucleotides. For the purposes of this disclosure, "Homologous Recombination (HR)" refers to a specialized form of such exchange that occurs, for example, during repair of a double-strand break in a cell. This process requires nucleotide sequence homology, uses a "donor" molecule for template repair of a "target" molecule (i.e., a molecule that has undergone a double strand break), and is variously referred to as "non-cross gene conversion" or "short-path gene conversion" because it results in the transfer of genetic information from the donor to the target. Without wishing to be bound by any particular theory, such transfer may involve mismatch correction of heteroduplex DNA formed between the fragmented target and donor, and/or "synthesis-dependent strand annealing" (where the donor is used to resynthesize genetic information that becomes part of the target), and/or related processes. Such specialized HR typically results in a change in the sequence of the target molecule, such that part or all of the donor polynucleotide sequence is incorporated into the target polynucleotide.

"cleavage" refers to the breaking of the covalent backbone of a DNA molecule. Cleavage can be initiated by a variety of methods, including but not limited to enzymatic or chemical hydrolysis of phosphodiester bonds. Both single-stranded cleavage and double-stranded cleavage are possible, and double-stranded cleavage can occur as a result of two different single-stranded cleavage events. DNA cleavage can result in the generation of blunt ends or staggered ends. In certain embodiments, the fusion polypeptide is used to target double-stranded DNA cleavage.

A "cleavage domain" comprises one or more polypeptide sequences that possess catalytic activity for DNA cleavage. The cleavage domain may be comprised in a single polypeptide chain, or the cleavage activity may result from the binding of two (or more) polypeptides.

A "cleavage half-domain" refers to a polypeptide sequence that binds to a second polypeptide (either the same or different) to form a complex with cleavage activity, preferably double-stranded cleavage activity.

The terms "cleavage domain" and "cleavage half-domain" include wild-type domains and portions or mutants of cleavage domains or cleavage half-domains that retain the ability to multimerize (e.g., dimerize) to form a functional cleavage domain.

"chromatin" refers to the structure of nucleoproteins comprising the genome of a cell. Cellular chromatin comprises nucleic acids (primarily DNA) and proteins (including histone and non-histone chromosomal proteins). Most eukaryotic chromatin exists in the form of nucleosomes in which the nucleosome core comprises approximately 150 base pairs of DNA bound to an octamer comprising two copies of each of histones H2A, H2B, H3 and H4; and the linker region DNA (of variable length, depending on the organism) extends between the nucleosome cores. Generally, a molecular protein H1 binds to the linker DNA. For the purposes of this disclosure, the term "chromatin" is intended to encompass all types of nuclear proteins, both prokaryotic and eukaryotic. Cellular chromatin includes both chromatin of the chromosomal and episomal types.

"chromosome" refers to a chromatin complex comprising all or part of a cellular genome. Generally, the genome of a cell is characterized by its karyotype, which is the collection of all the chromosomes that make up the genome of the cell (collection). The genome of a cell may comprise one or more chromosomes.

"episome" (episome) refers to a replicative nucleic acid, nucleoprotein complex, or other structure comprising a nucleic acid that is not part of the karyotype of a cell. Examples of episomes include plasmids and certain viral genomes.

An "accessible region" refers to a site in cellular chromatin where a target site present in the nucleic acid can be bound by a foreign molecule that recognizes the target site. Without wishing to be bound by any particular theory, it is believed that the reach region is a region that is not packed into the nucleosome structure. The different accessible region structures can generally be detected by their sensitivity to chemical and enzymatic probes (e.g., nucleases).

"target site" or "target sequence" refers to a nucleic acid sequence that defines the portion of the nucleic acid to which a binding molecule will bind, provided that conditions sufficient for binding to occur are present. For example, the sequence 5 '-GAATTC-3' is the target site for the Eco RI restriction endonuclease.

An "exogenous" molecule refers to a molecule that is not normally present in a cell but can be introduced into a cell by one or more genetic, biochemical, or other means. "normally present in a cell" is determined relative to the particular developmental stage and environmental conditions of the cell. Thus, for example, a molecule that is only present during embryonic development of muscle is an exogenous molecule relative to an adult muscle cell. Similarly, the molecule induced by heat shock is an exogenous molecule relative to a cell that is not heat shocked. The exogenous molecule may comprise, for example, a dysfunctional (functional) version of a dysfunctional (functional) endogenous molecule or a dysfunctional version of a normally functioning endogenous molecule.

An exogenous molecule may be, among other things, a small molecule (such as one generated by a combinatorial chemistry process) or a macromolecule (such as a protein, nucleic acid, carbohydrate, lipid, glycoprotein, lipoprotein, or polysaccharide), any modified derivative of the above molecules, or any complex comprising one or more of the above molecules. Nucleic acids include DNA and RNA, and may be single-stranded or double-stranded; may be linear, branched or circular; and may be of any length. Nucleic acids include those capable of forming duplexes, as well as triplex-forming nucleic acids. See, for example, U.S. Pat. Nos. 5,176,996 and 5,422,251. Proteins include, but are not limited to, DNA binding proteins, transcription factors, chromatin remodeling factors, methylated DNA binding proteins, polymerases, methylases, demethylases, acetylases, deacetylases, kinases, phosphatases, integrases, recombinases, ligases, topoisomerases, gyrases, and helicases.

An exogenous molecule may be the same type of molecule as an endogenous molecule, e.g., an exogenous protein or nucleic acid. For example, an exogenous nucleic acid may comprise an infecting viral genome, an Agrobacterium tumefaciens T-strand, a plasmid or episome introduced into a cell, or a chromosome that is not normally present in the cell. However, an exogenous nucleic acid or polynucleotide may comprise a sequence that is homologous or identical to an endogenous sequence. "exogenous sequence" refers to a nucleotide sequence that is not present at a particular endogenous genomic region. The exogenous sequence may be present at another endogenous chromosomal location or it may not be present in the genome at all. As such, an exogenous polynucleotide may comprise both exogenous and endogenous sequences: for example, a transgene flanked by sequences homologous to a genomic region. Such exogenous nucleic acids are used in the methods for targeted integration and targeted recombination described below. Methods for introducing exogenous molecules into cells are known to those of skill in the art and include, but are not limited to, lipid-mediated transfer (i.e., liposomes, including neutral and cationic lipids), electroporation, direct injection, cell fusion, particle bombardment, calcium phosphate co-precipitation, DEAE-dextran-mediated transfer, and viral vector-mediated transfer.

In contrast, an "endogenous" molecule refers to a molecule that is normally present in a particular cell at a particular developmental stage under particular environmental conditions. For example, an endogenous nucleic acid can comprise a chromosome, the genome of a mitochondrion, chloroplast or other organelle, or a naturally occurring episomal nucleic acid. Other endogenous molecules may include proteins, such as transcription factors and enzymes.

A "fusion" molecule refers to a molecule in which two or more subunit molecules are linked (e.g., covalently) together. The subunit molecules may be molecules of the same chemical type, or may be molecules of different chemical types. Examples of the first type of fusion molecule include, but are not limited to, fusion proteins (e.g., fusions between ZFPDNA binding domains and cleavage domains) and fusion nucleic acids (e.g., nucleic acids encoding the fusion proteins described above). Examples of the second type of fusion molecule include, but are not limited to, fusions between triplex-forming nucleic acids and polypeptides, and fusions between minor groove binders and nucleic acids.

Expression of the fusion protein in the cell can be derived from delivery of the fusion protein to the cell or by delivery of a polynucleotide encoding the fusion protein to the cell, wherein the polynucleotide is transcribed and the transcript is translated to produce the fusion protein. Expression of proteins in cells may also involve trans-splicing, polypeptide cleavage, and polypeptide ligation. Methods for delivering polynucleotides and polypeptides to cells are set forth elsewhere in this disclosure.

"Gene" for the purposes of this disclosure includes the DNA region encoding the gene product (see below), as well as all DNA regions that regulate the production of the gene product, whether or not such regulatory sequences are adjacent to coding sequences and/or transcribed sequences. Thus, genes include, but are not necessarily limited to, promoter sequences, terminators, translation regulatory sequences such as ribosome binding sites and internal ribosome entry sites, enhancers, silencing genes (silencers), insulators (insulators), boundary elements, origins of replication, matrix attachment sites, and locus control regions.

"Gene expression" refers to the conversion of information contained in a gene into a gene product. The gene product can be a direct transcription product of a gene (e.g., mRNA, tRNA, rRNA, antisense RNA, ribozyme, structural RNA, or any other type of RNA) or a protein produced by translation of mRNA. Gene products also include RNA modified by capping, polyadenylation, methylation and editing processes, and proteins modified by, for example, methylation, acetylation, phosphorylation, ubiquitination, ADP-ribosylation, myristoylation, and glycosylation.

"Regulation" of gene expression refers to a change in gene activity. Expression regulation may include, but is not limited to, gene activation and gene repression.

"plant" cells include, but are not limited to, cells of monocot (monocot) or dicot (dicot) plants. Non-limiting examples of monocots include cereals such as maize, rice, barley, oats, wheat, sorghum, rye, sugarcane, pineapple, onion, banana, and coconut. Non-limiting examples of dicotyledonous plants include tobacco, tomato, sunflower, cotton, sugar beet, potato, lettuce, melon (melon), soybean, canola (rape), and alfalfa. The plant cell may be from any part of the plant and/or from any stage of plant development.

"region of interest" refers to any region of cellular chromatin that is desired to bind to a foreign molecule, such as, for example, a gene or a non-coding sequence within or adjacent to a gene. Binding may be for the purpose of targeted DNA cleavage and/or targeted recombination. For example, the region of interest can be present in a chromosome, episome, organelle genome (e.g., mitochondrial, chloroplast), or infectious viral genome. The region of interest may be within a coding region of the gene, within a transcribed non-coding region (such as, for example, a leader sequence, a trailer sequence, or an intron), or within a non-transcribed region (either upstream of the coding region or downstream of the coding region). The length of the region of interest can be as small as a single nucleotide pair or up to 25,000 nucleotide pairs, or any integer value of nucleotide pairs.

The term "operably linked" is used interchangeably with reference to the juxtaposition of two or more components, such as sequential elements, in which the components are arranged so that both components function properly and to allow for the possibility that at least one of the components may mediate the function imparted to at least one of the other components. For example, a transcriptional regulatory sequence (such as a promoter) is operably linked to a coding sequence if the transcriptional regulatory sequence controls the level of transcription of the coding sequence in response to the presence or absence of one or more transcriptional regulatory factors. Transcriptional regulatory sequences are generally operably linked to a coding sequence in cis, but need not be directly adjacent thereto. For example, an enhancer is a transcriptional regulatory sequence operably linked to a coding sequence even if they are not contiguous.

With respect to fusion polypeptides, the term "operably linked" may refer to the fact that each member performs the same function when linked to another member as it would if not so linked. For example, with respect to a fusion polypeptide in which a ZFP DNA-binding domain is fused to a cleavage domain, the ZFP DNA-binding domain and the cleavage domain are operably linked if in the fusion polypeptide the ZFP DNA-binding domain portion is capable of binding to its target site and/or its binding site, and the cleavage domain is capable of cleaving DNA near the target site.

A "functional fragment" of a protein, polypeptide, or nucleic acid refers to a protein, polypeptide, or nucleic acid that differs in sequence from the full-length protein, polypeptide, or nucleic acid, but retains the same function as the full-length protein, polypeptide, or nucleic acid. Functional fragments may possess a greater, lesser, or the same number of residues as the corresponding native molecule, and/or may comprise one or more amino acid or nucleotide substitutions. Methods for determining a function of a nucleic acid (e.g., encoding a function, ability to hybridize to another nucleic acid) are well known in the art. Similarly, methods for determining protein function are well known. For example, the function of a polypeptide to bind to DNA can be determined by, for example, filter-binding, electrophoretic mobility shift, or immunoprecipitation assay. Cleavage of the DNA can be determined by gel electrophoresis. See Ausubel et al, supra. The ability of one protein to interact with another can be determined by, for example, co-immunoprecipitation, two-hybrid assays, or complementation (both genetic and biochemical). See, e.g., Fields et al, (1989) Nature 340: 245-246; U.S. Pat. No.5,585,245 and PCT WO 98/44350.

Zinc finger binding domains

Described herein are non-canonical zinc finger binding domains and polynucleotides encoding these zinc finger binding domains. In certain embodiments, the non-canonical zinc finger binding domain described herein is a C3H zinc finger in which one of the two conserved zinc-coordinating histidine residues is converted to cysteine. In other embodiments, the C-terminal-most histidine residue is converted to a cysteine residue to produce a "CCHC protein".

The zinc finger binding domain may comprise one or more zinc fingers (e.g., 2, 3, 4, 5, 6, 7, 8, 9 or more zinc fingers) and may be engineered to bind any target sequence (e.g., a genomic sequence). The zinc finger binding domain may bind DNA, RNA, and/or protein. Typically, a single zinc finger domain is about 30 amino acids in length. Zinc fingers include canonical C2H2 zinc fingers (i.e., those in which the zinc ion is coordinated by two cysteine and two histidine residues) and non-canonical zinc fingers (including, for example, C3H zinc fingers, i.e., those in which the zinc ion is coordinated by three cysteine residues and one histidine residue). See also U.S. Patent Application Nos. 20030108880; 20060246567; and 20060246588, the disclosures of which are incorporated by reference.

Structural studies have demonstrated that a canonical zinc finger domain (motif) contains two beta sheets (held in a beta turn containing the two invariant cysteine residues) and one alpha helix (containing the two invariant histidine residues), which are held in a specific conformation through coordination of a zinc atom by the two cysteines and the two histidines. The non-canonical zinc fingers disclosed herein retain this β-β-α structure.

The non-canonical zinc fingers described herein can be naturally occurring zinc finger binding domains. More typically, however, the non-canonical zinc fingers described herein include one or more zinc finger members in which at least one zinc coordinating cysteine or histidine residue has been replaced with one or more amino acids. For example, in certain embodiments, the C-terminal His residue of the canonical zinc finger binding module is replaced with a Cys residue.

The CCHC zinc fingers described herein may also comprise one or more alterations in the sequence of amino acid residues other than the zinc coordinating residues (relative to a naturally occurring C2H2 zinc finger sequence). Such alterations may include substitutions, deletions, and/or insertions. Amino acids can be altered anywhere in the zinc finger. Non-limiting examples of alterations include: (1) single residue substitutions around the altered zinc coordinating residue; (2) insertion of additional residues before or after the altered zinc coordinating residue (e.g., where the C-terminal-most His residue is converted to Cys, the addition of extra amino acid residues may facilitate zinc coordination by compensating for the shorter cysteine side chain); and/or (3) substitution of the residues located between the His and Cys residues of a naturally occurring CCHC zinc finger into the corresponding region of a non-canonical CCHC zinc finger.

In certain embodiments, the zinc finger proteins described herein comprise at least one zinc finger, including a non-canonical (non-C2H2) zinc finger, wherein the non-canonical zinc finger has a helical portion involved in DNA binding and wherein the zinc coordination region of the helical portion comprises the amino acid sequence HX₁X₂RCX_L (SEQ ID NO: 2); and wherein the zinc finger protein is engineered to bind to the target sequence. In certain embodiments, X₁ is A or K or T; X₂ is Q or E or R; and X_L is G.
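
The embodiment in which X₁ is A, K or T, X₂ is Q, E or R, and X_L is G can be written as a one-letter-code pattern. The following is a minimal sketch only: it assumes X_L is a single residue, the function name is hypothetical, and the test strings are synthetic illustrations rather than sequences from the tables of this disclosure.

```python
import re

# H - X1 - X2 - R - C - XL, with X1 in {A, K, T}, X2 in {Q, E, R}, XL = G.
# Treating XL as exactly one residue is an assumption of this sketch.
COORDINATION_MOTIF = re.compile(r"H[AKT][QER]RCG")


def has_coordination_motif(sequence: str) -> bool:
    """True if the one-letter amino acid sequence contains the motif."""
    return COORDINATION_MOTIF.search(sequence.upper()) is not None


print(has_coordination_motif("HAQRCG"))      # True  (X1 = A, X2 = Q)
print(has_coordination_motif("LLHTERCGSS"))  # True  (motif embedded in a longer string)
print(has_coordination_motif("HAQRHG"))      # False (second coordinating residue is H, not C)
```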

In other embodiments, the non-canonical zinc fingers described herein have the general structure: Cys-(X^A)_2-4-Cys-(X^B)_12-His-(X^C)_3-5-Cys-(X^D)_1-10 (SEQ ID NO: 3), wherein X^A, X^B, X^C and X^D represent any amino acid. In embodiments in which X^C comprises 3 residues, (i) at least one of these residues is altered as compared to a canonical CCHC zinc finger; and/or (ii) X^D comprises at least one deletion, substitution or insertion as compared to a canonical CCHH zinc finger. In certain embodiments, X^D comprises the sequence QLV or QKP. In other embodiments, X^D comprises one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10) Gly (G) residues.
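
The spacing constraints of the general structure (2-4 residues in X^A, 12 in X^B, 3-5 in X^C, 1-10 in X^D) can be checked with a regular expression. This is a minimal sketch under stated assumptions: the function name is hypothetical, the input is assumed to be a single finger in one-letter code, and the test fingers are synthetic placeholders built from alanine. It checks spacing only and does not evaluate conditions (i) and (ii), which require comparison against a reference finger.

```python
import re
from typing import Optional, Tuple

# Cys-(XA)2-4-Cys-(XB)12-His-(XC)3-5-Cys-(XD)1-10, any residue allowed for X.
FINGER_PATTERN = re.compile(
    r"C([A-Z]{2,4})C([A-Z]{12})H([A-Z]{3,5})C([A-Z]{1,10})"
)


def spacer_lengths(finger: str) -> Optional[Tuple[int, ...]]:
    """Return the lengths of (XA, XB, XC, XD), or None if the spacing does not fit."""
    match = FINGER_PATTERN.fullmatch(finger.upper())
    if match is None:
        return None
    return tuple(len(group) for group in match.groups())


# Synthetic placeholder fingers (alanine filler), not sequences from the disclosure.
cchc_like = "C" + "A" * 2 + "C" + "A" * 12 + "H" + "A" * 3 + "C" + "A" * 3
cchh_like = "C" + "A" * 2 + "C" + "A" * 12 + "H" + "A" * 3 + "H" + "A" * 3

print(spacer_lengths(cchc_like))  # (2, 12, 3, 3)
print(spacer_lengths(cchh_like))  # None: the fourth coordinating residue is His
```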

Partial amino acid sequences of exemplary non-canonical zinc fingers (C-terminal to and including the 3rd zinc coordinating residue) are shown in Table 1, Table 2, Table 3, and Table 4. In all tables, the two C-terminal-most (i.e., 3rd and 4th) zinc coordinating residues (H and C) are underlined. Changes (e.g., substitutions, insertions, deletions) compared to the "wild-type" non-canonical finger sequence (row 2 of Tables 1 and 3) are shown double underlined.

TABLE 1

TABLE 2

TABLE 3

TABLE 4

As described above, a ZFP may include any number of zinc finger binding domains, e.g., at least 3 zinc fingers. Further, one, more than one, or all of the zinc fingers may be non-canonical zinc fingers as described herein.

In certain embodiments, the C-terminal-most finger of a multi-finger zinc finger protein comprises a canonical zinc finger. In other embodiments, the C-terminal-most finger of a multi-finger zinc finger protein comprises a CCHC finger as described herein, e.g., a CCHC finger comprising one or more amino acid insertions C-terminal to the C-terminal-most zinc coordinating Cys residue. See Examples 1-5, which describe 4-finger zinc finger proteins in which finger 2 (F2) and/or finger 4 (F4) are non-canonical zinc fingers as described herein.

The zinc finger binding domain may be engineered to bind to a selected sequence. See, e.g., Beerli et al (2002) Nature Biotechnol. 20: 135-141; Pabo et al (2001) Ann. Rev. Biochem. 70: 313-340; Isalan et al (2001) Nature Biotechnol. 19: 656-; Segal et al (2001) Curr. Opin. Biotechnol. 12: 632-637; Choo et al (2000) Curr. Opin. Struct. Biol. 10: 411-416. Engineered zinc finger binding domains may have novel binding specificities compared to naturally occurring zinc finger proteins. Methods of engineering include, but are not limited to, rational design and various types of selection (e.g., methods in which multiple different zinc finger sequences are screened against a single target nucleotide sequence). Rational design includes, for example, the use of databases comprising triplet (or quadruplet) nucleotide sequences and various zinc finger amino acid sequences, in which each triplet or quadruplet nucleotide sequence is associated with one or more amino acid sequences of zinc fingers that bind that particular triplet or quadruplet sequence. See, for example, commonly owned U.S. Pat. Nos. 6,453,242 and 6,534,261. Other design methods are disclosed in, for example, U.S. Patents 6,746,838; 6,785,613; 6,866,997; and 7,030,215. Enhancement of binding specificity of zinc finger binding domains has been described, for example, in commonly owned U.S. Patent No. 6,794,136.
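The database-driven form of rational design described above amounts to a lookup from each nucleotide subsite to the finger sequences associated with it. The following is a minimal sketch under stated assumptions: the table contents are placeholders rather than real zinc finger sequences, the names `TRIPLET_DB`, `split_into_subsites` and `candidates_per_subsite` are hypothetical, and only contiguous, non-overlapping triplets are handled.

```python
from typing import Dict, List, Tuple

# Hypothetical lookup table: DNA triplet -> finger sequences associated with it.
# The values are placeholders, NOT real zinc finger amino acid sequences.
TRIPLET_DB: Dict[str, List[str]] = {
    "GGA": ["FINGER_PLACEHOLDER_1", "FINGER_PLACEHOLDER_2"],
    "TGC": ["FINGER_PLACEHOLDER_3"],
    "AAT": ["FINGER_PLACEHOLDER_4"],
}


def split_into_subsites(target: str, size: int = 3) -> List[str]:
    """Split a target sequence into contiguous, non-overlapping subsites."""
    if len(target) == 0 or len(target) % size != 0:
        raise ValueError("target length must be a positive multiple of the subsite size")
    return [target[i:i + size] for i in range(0, len(target), size)]


def candidates_per_subsite(
    target: str, db: Dict[str, List[str]] = TRIPLET_DB
) -> List[Tuple[str, List[str]]]:
    """Pair every subsite with the finger sequences the table associates with it.

    A subsite that is absent from the table is paired with an empty list.
    """
    return [(sub, db.get(sub, [])) for sub in split_into_subsites(target.upper())]
```

For example, `candidates_per_subsite("ggatgcaat")` pairs the subsites GGA, TGC and AAT with their placeholder entries. How subsites are assigned to particular fingers of a multi-finger protein is deliberately left out of this sketch.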

Exemplary selection methods (including phage display and two-hybrid systems) are disclosed in U.S. Patents 5,789,538; 5,925,523; 6,007,988; 6,013,453; 6,410,248; 6,140,466; 6,200,759; and 6,242,568; as well as WO 98/37186; WO 98/53057; WO 00/27878; WO 01/88197 and GB 2,338,237.

Since an individual zinc finger binds to a 3-nucleotide (i.e., triplet) sequence (or to a 4-nucleotide sequence that can overlap, by one nucleotide, with the 4-nucleotide binding site of an adjacent zinc finger), the length of the sequence to which a zinc finger binding domain is engineered to bind (e.g., a target sequence) determines the number of zinc fingers in the engineered zinc finger binding domain. For example, for ZFPs in which the zinc finger motifs do not bind to overlapping subsites, a 6-nucleotide target sequence is bound by a 2-finger binding domain; a 9-nucleotide target sequence is bound by a 3-finger binding domain; and so on. The binding sites of the individual zinc fingers (i.e., subsites) in a target site need not be contiguous, but may be separated by one or several nucleotides, depending on the length and nature of the amino acid sequences between the zinc fingers (i.e., the inter-finger linkers) in the multi-finger binding domain. See, e.g., U.S. Patents 6,479,626; 6,903,185 and 7,153,949 and U.S. Patent Application Publication No. 2003/0119023; the disclosures of which are incorporated by reference.

In a multi-finger zinc finger binding domain, adjacent zinc fingers may be separated by amino acid linker sequences of approximately 5 amino acids (so-called "canonical" inter-finger linkers) or, alternatively, by one or more non-canonical linkers. See, for example, commonly owned U.S. Pat. Nos. 6,453,242 and 6,534,261. For engineered zinc finger binding domains comprising more than three fingers, insertion of longer ("non-canonical") inter-finger linkers between certain of the zinc fingers may increase the affinity and/or specificity of binding by the binding domain. See, for example, U.S. Patent No. 6,479,626 and U.S. Patent Application Publication No. 2003/0119023, the disclosures of which are incorporated by reference. Accordingly, multi-finger zinc finger binding domains can also be characterized with respect to the presence and location of non-canonical inter-finger linkers. The use of longer inter-finger linkers can also facilitate binding of a zinc finger protein to target sites comprising non-contiguous nucleotides. As a result, one or more subsites in a target site for a zinc finger binding domain can be separated from each other by 1, 2, 3, 4, 5 or more nucleotides. To provide but one example, a 4-finger binding domain can bind to a 13-nucleotide target site comprising, in sequence, two contiguous 3-nucleotide subsites, an intervening nucleotide, and two contiguous triplet subsites.

A target subsite is a nucleotide sequence (generally 3 or 4 nucleotides) bound by a single zinc finger. However, a target site need not be a multiple of three nucleotides in length. For example, where cross-strand interactions occur (see, e.g., U.S. Pat. Nos. 6,453,242 and 6,794,136), one or more of the individual zinc fingers of a multi-finger binding domain may bind to overlapping quadruplet subsites. See also U.S. Patents 6,746,838 and 6,866,997. To provide but one example, a 3-finger binding domain can bind to a 10-nucleotide target site that includes three overlapping 4-nucleotide subsites.
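The relationship between finger number, subsite length, overlap and intervening nucleotides in the preceding paragraphs is simple arithmetic. A minimal sketch follows; the function and parameter names are hypothetical, and the model assumes that every pair of adjacent subsites shares the same number of overlapping nucleotides.

```python
from typing import Sequence


def fingers_for_contiguous_site(site_length: int) -> int:
    """Fingers needed when each finger binds its own non-overlapping triplet."""
    if site_length <= 0 or site_length % 3 != 0:
        raise ValueError("expected a positive multiple of 3 nucleotides")
    return site_length // 3


def target_site_length(
    n_fingers: int,
    subsite_len: int = 3,
    overlap: int = 0,
    gaps: Sequence[int] = (),
) -> int:
    """Length of a target site bound by n_fingers fingers.

    overlap: nucleotides shared by each pair of adjacent subsites.
    gaps:    intervening nucleotides between subsites, one entry per gap.
    """
    if n_fingers < 1:
        raise ValueError("at least one finger is required")
    shared = overlap * (n_fingers - 1)
    return n_fingers * subsite_len - shared + sum(gaps)


# The worked examples given in the text.
two_finger = fingers_for_contiguous_site(6)                          # 2 fingers
three_finger = fingers_for_contiguous_site(9)                        # 3 fingers
gapped_site = target_site_length(4, gaps=[1])                        # 13 nucleotides
overlapping_site = target_site_length(3, subsite_len=4, overlap=1)   # 10 nucleotides
```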

Selection of a sequence in cellular chromatin for binding by a zinc finger domain (e.g., a target site) can be accomplished, for example, according to the methods disclosed in commonly owned U.S. Pat. No. 6,453,242 (Sept. 17, 2002), which also discloses methods for designing ZFPs that bind a selected sequence. It will be clear to the skilled person that a target site may also be selected by simple visual inspection of a nucleotide sequence. Accordingly, any means for target site selection may be used in the methods described herein.

Multi-finger zinc finger proteins can be constructed by joining individual zinc fingers obtained, for example, by design or selection. Alternatively, binding modules consisting of two zinc fingers can be joined to one another, using either canonical or longer, non-canonical inter-finger linkers (see above), to generate 4- and 6-finger proteins. Such two-finger modules can be obtained, for example, by selecting two adjacent fingers that bind a particular 6-nucleotide target sequence in the context of a multi-finger protein (generally 3 fingers). See, for example, WO 98/53057 and U.S. Patent Application Publication No. 2003/0119023, the disclosures of which are incorporated by reference. Alternatively, two-finger modules can be constructed by assembly of individual zinc fingers.

As such, the zinc finger domains described herein can be used alone or in various combinations to construct a multi-finger zinc finger protein that can bind to any target site.

The distance between sequences (e.g., target sites) refers to the number of nucleotides or nucleotide pairs intervening between the two sequences, as measured from the edges of the sequences nearest each other.
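This definition can be stated as a short computation. The sketch below assumes that each site is given as hypothetical 0-based coordinates `(start, end)` on the same sequence, with `end` exclusive; that coordinate convention is an assumption made for illustration and does not appear in the disclosure.

```python
from typing import Tuple

Site = Tuple[int, int]  # (start, end): 0-based, end exclusive (assumed convention)


def distance_between(site_a: Site, site_b: Site) -> int:
    """Count the nucleotides lying between the nearest edges of two sites."""
    first, second = sorted([site_a, site_b])  # order the sites by start position
    gap = second[0] - first[1]
    if gap < 0:
        raise ValueError("sites overlap, so no edge-to-edge distance is defined")
    return gap
```

For instance, sites at `(0, 12)` and `(18, 30)` are separated by 6 nucleotides regardless of the order in which they are supplied, and two abutting sites are separated by 0.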

In embodiments using ZFNs, for example, in which cleavage depends on the binding of two zinc finger domain/cleavage half-domain fusion molecules to separate target sites, the two target sites may be on opposite DNA strands. In other embodiments, both target sites are on the same DNA strand. See, for example, WO2005/084190, the disclosure of which is incorporated by reference.

Polynucleotides encoding zinc fingers or zinc finger proteins are also within the scope of the disclosure. These polynucleotides may be constructed using standard techniques and may be inserted into a vector, and the vector may be introduced into a cell (see further disclosure below regarding vectors and methods for introducing polynucleotides into cells), such that the encoded protein is expressed in the cell.

Fusion proteins

Also provided are fusion proteins comprising one or more of the non-canonical zinc finger members described herein.

Fusion molecules are constructed by methods of cloning and biochemical conjugation that are well known to those skilled in the art. Fusion molecules comprise a CCHC-containing ZFP and, for example, a cleavage domain, a cleavage half-domain, a transcriptional activation domain, a transcriptional repression domain, a component of a chromatin remodeling complex, an insulator domain, a functional fragment of any of these domains; and/or any combination of two or more functional domains or fragments thereof.

In certain embodiments, the fusion molecule comprises a modified plant zinc finger protein and at least two functional domains (e.g., an insulator domain or a methyl binding protein domain, and additionally, a transcriptional activation or repression domain).

Optionally, the fusion molecule further comprises a nuclear localization signal (such as, for example, that from the SV40 T-antigen or the maize Opaque-2 NLS) and an epitope tag (such as, for example, FLAG or hemagglutinin). Fusion proteins (and nucleic acids encoding them) are designed such that the translational reading frame is preserved among the components of the fusion.
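The reading-frame requirement can be checked mechanically on the coding sequences to be joined. The following is a minimal sketch under a deliberately strict assumption, namely that every component is itself a whole number of codons; junctions that restore the frame across two partial codons are not modeled. The function names and the short DNA strings are arbitrary placeholders and are not sequences of any actual domain.

```python
from typing import List

STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(dna: str) -> List[str]:
    """Split a coding sequence into complete codons, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def frame_is_maintained(components: List[str]) -> bool:
    """Return True if the joined components stay in one reading frame.

    Requires each component to be a non-empty whole number of codons and
    allows a stop codon only as the final codon of the assembled sequence.
    """
    if any(len(c) == 0 or len(c) % 3 != 0 for c in components):
        return False
    assembled = codons("".join(components))
    return not any(codon in STOP_CODONS for codon in assembled[:-1])


# Arbitrary placeholder fragments (assumptions for illustration only).
IN_FRAME = ["ATGCCAAAG", "TGCCCAGAG", "GGATCCTAA"]
FRAME_SHIFTED = ["ATGCCAAAG", "TGCCCAGA", "GGATCCTAA"]   # middle part is 8 nt
PREMATURE_STOP = ["ATGTAA", "GGATCC"]                    # stop codon before the end
```

Only the first of the three placeholder assemblies passes the check in this sketch.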

Methods for designing and constructing fusion proteins (and polynucleotides encoding them) are known to those skilled in the art. For example, methods for designing and constructing fusion proteins comprising zinc finger proteins (and polynucleotides encoding them) are described in commonly owned U.S. Pat. Nos. 6,453,242 and 6,534,261.

Polynucleotides encoding such fusion proteins are also within the scope of the present disclosure. These polynucleotides can be constructed using standard techniques and can be inserted into a vector, and the vector can be introduced into a cell (see further disclosure below regarding vectors and methods for introducing polynucleotides into cells).

An exemplary functional domain for fusion with a ZFP DNA-binding domain, to be used for repressing gene expression, is the KRAB repression domain from the human KOX-1 protein (see, e.g., Thiesen et al, New Biologist 2, 363-). KOX domains are also suitable for use as repressor domains. Another suitable repressor domain is methyl binding domain protein 2B (MBD-2B) (see also Hendrich et al (1999) Mamm. Genome 10: 906-912 for a description of MBD proteins). Another useful repressor domain is that associated with the v-ErbA protein. See, e.g., Damm et al (1989) Nature 339: 593-; Evans (1989) Int. J. Cancer Suppl. 4: 26-28; Pain et al (1990) New Biol. 2: 284-294; Sap et al (1989) Nature 340: 242-244; Zenke et al (1988) Cell 52: 107-119; and Zenke et al (1990) Cell 61: 1035-1049. Additional exemplary repressor domains include, but are not limited to, thyroid hormone receptor (TR), SID, MBD1, MBD2, MBD3, MBD4, MBD-like proteins, DNMT family members (e.g., DNMT1, DNMT3A, DNMT3B), Rb, MeCP1, and MeCP2. See, e.g., Zhang et al (2000) Ann. Rev. Physiol. 62: 439-466; Bird et al (1999) Cell 99: 451-454; Tyler et al (1999) Cell 99: 443-; Knoepfler et al (1999) Cell 99: 447-450; and Robertson et al (2000) Nature Genet. 25: 338-342. Additional exemplary repression domains include, but are not limited to, ROM2 and AtHD2A. See, e.g., Chern et al (1996) Plant Cell 8: 305-321; and Wu et al (2000) Plant J. 22: 19-27.

Suitable domains for achieving activation include the HSV VP16 activation domain (see, e.g., Hagmann et al, J. Virol. 71, 5952-5962 (1997)); nuclear hormone receptors (see, e.g., Torchia et al, Curr. Opin. Cell. Biol. 10: 373-383 (1998)); the p65 subunit of nuclear factor κB (Bitko and Barik, J. Virol. 72: 5610-); or artificial chimeric domains such as VP64 (Seifpal et al, EMBO J. 11, 4961-).

Additional exemplary activation domains include, but are not limited to, p300, CBP, PCAF, SRC1, PvALF, and ERF-2. See, e.g., Robyr et al (2000) Mol. Endocrinol. 14: 329-347; Collingwood et al (1999) J. Mol. Endocrinol. 23: 255-275; Leo et al (2000) Gene 245: 1-11; Manteuffel-Cymborowska (1999) Acta Biochim. Pol. 46: 77-89; McKenna et al (1999) J. Steroid Biochem. Mol. Biol. 69: 3-12; Malik et al (2000) Trends Biochem. Sci. 25: 277-283; and Lemon et al (1999) Curr. Opin. Genet. Dev. 9: 499-504. Additional exemplary activation domains include, but are not limited to, OsGAI, HALF-1, C1, AP1, ARF-5, -6, -7, and -8, CPRF1, CPRF4, MYC-RP/GP, and TRAB1. See, e.g., Ogawa et al (2000) Gene 245: 21-29; Okanami et al (1996) Genes Cells 1: 87-99; Goff et al (1991) Genes Dev. 5: 298-309; Cho et al (1999) Plant Mol. Biol. 40: 419-429; Ulmasov et al (1999) Proc. Natl. Acad. Sci. USA 96: 5844-5849; Sprenger-Haussels et al (2000) Plant J. 22: 1-8; Gong et al (1999) Plant Mol. Biol. 41: 33-44; and Hobo et al (1999) Proc. Natl. Acad. Sci. USA 96: 15,348-15,353.

Other functional domains are disclosed, for example, in commonly owned U.S. patent No.6,933,113. In addition, insulator domains, chromatin remodeling proteins (such as ISWI-containing domains), and methyl binding domain proteins suitable for use in fusion molecules are described in, for example, commonly owned International publications WO 01/83793 and WO 02/26960.

In other embodiments, the fusion protein is a zinc finger nuclease (ZFN) comprising one or more CCHC zinc fingers as described herein and a cleavage domain (or cleavage half-domain). The zinc fingers may be engineered to recognize a target sequence in any selected genomic region and, when introduced into a cell, will result in binding of the fusion protein to its binding site and cleavage within or near said genomic region. Such cleavage can result in alteration of the nucleotide sequence of the genomic region (e.g., mutation) following non-homologous end joining. Alternatively, if an exogenous polynucleotide comprising a sequence homologous to the genomic region is also present in the cell, homologous recombination occurs at a high rate between the genomic region and the exogenous polynucleotide following targeted cleavage by the ZFN. Homologous recombination can result in targeted sequence replacement or targeted integration of the exogenous sequence, depending on the nucleotide sequence of the exogenous polynucleotide.

The non-canonical zinc fingers described herein provide improved cleavage function when incorporated into ZFNs. As described in the Examples, 4-finger ZFNs containing at least one CCHC finger as described herein cleave at least as well as nucleases containing only CCHH fingers. In certain embodiments, where the C-terminal finger comprises a non-canonical CCHC zinc finger, the residues between the 3rd and 4th zinc coordinating residues (i.e., between the C-terminal His and Cys residues) are different from those present in a canonical CCHH zinc finger, and one or more glycine residues (e.g., 1, 2, 3, 4, 5 or more) are inserted after the C-terminal Cys residue.

The cleavage domain portion of the ZFNs disclosed herein can be obtained from any endonuclease or exonuclease. Exemplary endonucleases from which a cleavage domain can be derived include, but are not limited to, restriction endonucleases and homing endonucleases. See, e.g., 2002-; and Belfort et al (1997) Nucleic Acids Res. 25: 3379-3388. Additional enzymes that cleave DNA are known (e.g., S1 nuclease; mung bean nuclease; pancreatic DNase I; micrococcal nuclease; yeast HO endonuclease; see also Linn et al (eds.) Nucleases, Cold Spring Harbor Laboratory Press, 1993). One or more of these enzymes (or functional fragments thereof) may be used as a source of cleavage domains and cleavage half-domains.

Similarly, a cleavage half-domain may be derived from any nuclease or portion thereof as set forth above, provided that the cleavage half-domain requires dimerization for cleavage activity. In general, if the fusion proteins comprise cleavage half-domains, targeted cleavage of genomic DNA requires two fusion proteins. Alternatively, a single protein comprising two cleavage half-domains may be used. The two cleavage half-domains may be derived from the same endonuclease, or each cleavage half-domain may be derived from a different endonuclease. In addition, the target sites of the two fusion proteins are arranged relative to each other such that binding of the two fusion proteins to their respective target sites places the cleavage half-domains in a spatial orientation to each other that allows them to form a functional cleavage domain (e.g., by dimerization). Thus, in certain embodiments, the near edges of the target sites are separated by 5-8 nucleotide pairs or by 15-18 nucleotide pairs. In other embodiments, the target sites are within 10 nucleotide pairs of each other. However, any integer number of nucleotides or nucleotide pairs may intervene between the two target sites (e.g., from 2 to 50 nucleotides or more). In general, the point of cleavage lies between the target sites.
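The spacing ranges recited in this paragraph can be tested against a measured gap between the near edges of two target sites. The sketch below is illustrative only: the function name and the text labels it returns are hypothetical, and "within 10 nucleotide pairs" is read here as a gap of 10 or fewer, which is an interpretive assumption.

```python
from typing import List


def matching_spacing_embodiments(gap_bp: int) -> List[str]:
    """List the recited spacing ranges that a near-edge gap (in bp) falls into."""
    if gap_bp < 0:
        raise ValueError("gap must be non-negative")
    matches: List[str] = []
    if 5 <= gap_bp <= 8:
        matches.append("5-8 bp")
    if 15 <= gap_bp <= 18:
        matches.append("15-18 bp")
    if gap_bp <= 10:  # assumed reading of "within 10 nucleotide pairs"
        matches.append("within 10 bp")
    return matches
```

A gap of 6 nucleotide pairs, for example, falls within both the 5-8 range and the 10-pair limit, whereas a gap of 30 falls within none of the recited ranges, although the text notes that any integer spacing may be used.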

Restriction endonucleases (restriction enzymes) are present in many species and are capable of sequence-specific binding to DNA (at a recognition site) and cleaving DNA at or near the site of binding. Certain restriction enzymes (e.g., Type IIS) cleave DNA at sites removed from the recognition site and have separable binding and cleavage domains. For example, the Type IIS enzyme Fok I catalyzes double-stranded cleavage of DNA, at 9 nucleotides from its recognition site on one strand and 13 nucleotides from its recognition site on the other. See, e.g., U.S. Pat. Nos. 5,356,802; 5,436,150 and 5,487,994; as well as Li et al (1992) Proc. Natl. Acad. Sci. USA 89: 4275-4279; Li et al (1993) Proc. Natl. Acad. Sci. USA 90: 2764-2768; Kim et al (1994a) Proc. Natl. Acad. Sci. USA 91: 883-887; Kim et al (1994b) J. Biol. Chem. 269: 31,978-31,982. Thus, in one embodiment, the fusion protein comprises a cleavage domain (or cleavage half-domain) from at least one Type IIS restriction enzyme and one or more zinc finger binding domains (which may or may not be engineered).
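The offset cleavage of the native Fok I enzyme can be illustrated by locating cut positions relative to a recognition site. The sketch below relies on an assumption that is not stated in this disclosure, namely that the recognition site is 5'-GGATG-3'; it considers only sites read in the forward orientation on the supplied strand, and the function name is hypothetical. The offsets of 9 and 13 nucleotides are those given in the text.

```python
from typing import List, Tuple

FOKI_SITE = "GGATG"  # assumption from general knowledge, not stated in the text
TOP_OFFSET = 9       # nucleotides past the site on one strand (from the text)
BOTTOM_OFFSET = 13   # nucleotides past the site on the other strand (from the text)


def foki_cut_positions(top_strand: str) -> List[Tuple[int, int]]:
    """Return (top_cut, bottom_cut) for each forward-orientation recognition site.

    Both values are 0-based indices into top_strand; each cut falls
    immediately before the nucleotide at that index. Sites too close to the
    end of the sequence for both cuts to fit are skipped.
    """
    seq = top_strand.upper()
    cuts: List[Tuple[int, int]] = []
    start = seq.find(FOKI_SITE)
    while start != -1:
        site_end = start + len(FOKI_SITE)
        top_cut = site_end + TOP_OFFSET
        bottom_cut = site_end + BOTTOM_OFFSET
        if bottom_cut <= len(seq):
            cuts.append((top_cut, bottom_cut))
        start = seq.find(FOKI_SITE, start + 1)
    return cuts


# Arbitrary example sequence (an assumption for illustration only).
EXAMPLE = "AA" + "GGATG" + "ACGT" * 5
```

Because the two offsets differ by four nucleotides, the two strands are cut at staggered positions rather than directly opposite each other.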

An exemplary Type IIS restriction enzyme, whose cleavage domain is separable from the binding domain, is Fok I. This particular enzyme is active as a dimer. Bitinaite et al (1998) Proc. Natl. Acad. Sci. USA 95: 10,570-10,575. Accordingly, for the purposes of the present disclosure, the portion of the Fok I enzyme used in the disclosed fusion proteins is considered a cleavage half-domain. Thus, for targeted double-stranded cleavage and/or targeted replacement of cellular sequences using zinc finger-Fok I fusions, two fusion proteins, each comprising a Fok I cleavage half-domain, can be used to reconstitute a catalytically active cleavage domain. Alternatively, a single polypeptide molecule containing a zinc finger binding domain and two Fok I cleavage half-domains can also be used. Parameters for targeted cleavage and targeted sequence alteration using zinc finger-Fok I fusions are provided elsewhere in this disclosure and, for example, in U.S. Patent Application Publication No. 2005/0064474, the disclosure of which is incorporated by reference.

In other embodiments, the Fok I cleavage half-domain may include one or more mutations at any amino acid residue that affects dimerization. Such mutations can be used to prevent one of a pair of ZFP/Fok I fusions from undergoing homodimerization, which can lead to cleavage at unwanted sequences. For example, amino acid residues 446, 447, 479, 483, 484, 486, 487, 490, 491, 496, 498, 499, 500, 531, 534, 537, and 538 of Fok I are all sufficiently close to the dimerization interface to affect dimerization. Thus, changes in the amino acid sequence at one or more of the above positions can be used to alter the dimerization properties of the cleavage half-domain. Such variations can be introduced, for example, by constructing libraries containing (or encoding) different amino acid residues at these positions and selecting variants with the desired properties, or by rationally designing individual mutants. In addition to preventing homodimerization, it is also possible that some of these mutations may increase cleavage efficiency compared to that obtained using two wild-type cleavage half-domains.

Thus, for targeted cleavage using a pair of ZFP/Fok I fusions, one or both of the fusion proteins can comprise one or more amino acid alterations that inhibit self-dimerization but allow heterodimerization of the two fusion proteins to occur, such that cleavage occurs at the desired target site. In certain embodiments, alterations are present in both fusion proteins, and the alterations have additive effects; that is, homodimerization of either fusion, which leads to aberrant cleavage, is minimized or abolished, while heterodimerization of the two fusion proteins is facilitated compared to that obtained with wild-type cleavage half-domains.

In certain embodiments, the cleavage domain comprises two cleavage half-domains (both being part of a single polypeptide comprising the binding domain), i.e., a first cleavage half-domain and a second cleavage half-domain. The cleavage half-domains may have the same amino acid sequence or different amino acid sequences as long as they function to cleave DNA.

Cleavage half-domains may also be provided in separate molecules. For example, two fusion polypeptides can be expressed in a cell, wherein each polypeptide comprises a binding domain and a cleavage half-domain. The cleavage half-domains may have the same amino acid sequence or different amino acid sequences, so long as they function to cleave the DNA. Further, the binding domains bind to target sequences that are typically disposed in such a way that, upon binding of the fusion polypeptides, the two cleavage half-domains are presented in a spatial orientation to each other that allows reconstitution of a functional cleavage domain (e.g., by dimerization of the half-domains), resulting in cleavage of cellular chromatin in a region of interest. In general, cleavage by the reconstituted cleavage domain occurs at a site located between the two target sequences. One or both of the proteins may be engineered to bind to its target site.

Expression of two fusion proteins in a cell can result from delivery of the two proteins to the cell; delivery of one protein and one nucleic acid encoding one of the proteins to the cell; delivery of two nucleic acids, each encoding one of the proteins, to the cell; or delivery of a single nucleic acid, encoding both proteins, to the cell. In other embodiments, the fusion protein comprises a single polypeptide chain comprising two cleavage half-domains and a zinc finger binding domain. In this case, a single fusion protein is expressed in the cell and, without wishing to be bound by theory, is believed to cleave DNA as a result of formation of an intramolecular dimer of the cleavage half-domains.

In certain embodiments, the components of the fusion protein are arranged such that the zinc finger domain is nearest the amino terminus of the fusion protein and the cleavage half-domain is nearest the carboxy terminus. This mirrors the relative orientation of the cleavage domain in naturally occurring dimerizing cleavage domains, such as those derived from the Fok I enzyme, in which the DNA-binding domain is nearest the amino terminus and the cleavage half-domain is nearest the carboxy terminus. In these embodiments, dimerization of the cleavage half-domains to form a functional nuclease is brought about by binding of the fusion proteins to sites on opposite DNA strands, with the 5' ends of the binding sites being proximal to each other.

In this orientation, the C-terminal-most zinc finger is proximal to the Fok I cleavage half-domain. It has previously been determined that non-canonical zinc finger proteins bind their DNA targets most efficiently when a CCHC-type zinc finger is present as the C-terminal-most finger. It is therefore possible that the presence of previously described CCHC-type zinc fingers in proximity to the Fok I cleavage half-domain inhibited its function. If so, the optimized CCHC zinc fingers of the present disclosure apparently do not exhibit this postulated inhibitory activity.

In other embodiments, the components of the fusion proteins (e.g., ZFP-Fok I fusions) are arranged such that the cleavage half-domain is nearest the amino terminus of the fusion protein and the zinc finger domain is nearest the carboxy terminus. In these embodiments, dimerization of the cleavage half-domains to form a functional nuclease is brought about by binding of the fusion proteins to sites on opposite DNA strands, with the 3' ends of the binding sites being proximal to each other.

In other embodiments, the first fusion protein comprises a cleavage half-domain nearest the amino-terminus and a zinc finger domain nearest the carboxy-terminus of the fusion protein, and the second fusion protein is arranged such that the zinc finger domain is nearest the amino-terminus and the cleavage half-domain is nearest the carboxy-terminus of the fusion protein. In these embodiments, both fusion proteins bind the same DNA strand, with the binding site of the first fusion protein comprising the zinc finger domain closest to the carboxy terminus being 5' to the binding site of the second fusion protein comprising the zinc finger domain closest to the amino terminus. See also WO2005/084190, the disclosure of which is incorporated by reference.

The amino acid sequence between the zinc finger domain and the cleavage domain (or cleavage half-domain) is referred to as the "ZC linker". The ZC linker is to be distinguished from the inter-finger linkers discussed above. For details on obtaining ZC linkers that optimize cleavage, see, for example, U.S. Patent Publications 20050064474A1 and 20030232410, and International Patent Publication WO2005/084190, the disclosures of which are incorporated by reference.

Expression vector

Nucleic acids encoding one or more ZFPs or ZFP fusion proteins (e.g., ZFNs) can be cloned into vectors for transformation into prokaryotic or eukaryotic cells for replication and/or expression. The vector may be a prokaryotic or eukaryotic vector, including but not limited to a plasmid, a shuttle vector, an insect vector, a binary vector (see, e.g., U.S. Pat. No. 4,940,838; Horsch et al (1984) Science 233: 496-498, and Fraley et al (1983) Proc. Nat'l. Acad. Sci. USA 80: 4803), and the like. The nucleic acid encoding the ZFP may also be cloned into an expression vector for administration to a plant cell.

To express the fusion protein, the sequence encoding the ZFP or ZFP fusion is typically subcloned into an expression vector containing a promoter that directs transcription. Suitable bacterial and eukaryotic promoters are well known in the art and are described, for example, in Sambrook et al, Molecular Cloning, A Laboratory Manual (2nd edition, 1989; 3rd edition, 2001); Kriegler, Gene Transfer and Expression: A Laboratory Manual (1990); and Current Protocols in Molecular Biology (Ausubel et al, supra). Bacterial expression systems for expressing ZFPs are available in, for example, E. coli, Bacillus sp., and Salmonella (Palva et al, Gene 22: 229-235 (1983)). Kits for such expression systems are commercially available. Eukaryotic expression systems for mammalian cells, yeast and insect cells are well known to those skilled in the art and are also commercially available.

The promoter used to direct expression of the ZFP-encoding nucleic acid depends on the particular application. For example, a strong constitutive promoter suited to the host cell is typically used for expression and purification of ZFPs.

In contrast, when a ZFP is administered in vivo to regulate a plant gene (see "Delivering nucleic acids to plant cells" below), either a constitutive or an inducible promoter is used, depending on the particular use of the ZFP. Non-limiting examples of plant promoters include promoter sequences derived from A. thaliana ubiquitin-3 (ubi-3) (Callis et al, 1990, J. Biol. Chem. 265: 12486-12493); A. tumefaciens mannopine synthase (Δmas) (Petolino et al, U.S. Pat. No. 6,730,824); and/or cassava vein mosaic virus (CsVMV) (Verdaguer et al, 1996, Plant Molecular Biology 31: 1129-1139). See also the Examples.

In addition to a promoter, an expression vector typically comprises a transcription unit or expression cassette that contains all the other elements required for expression of a nucleic acid in a host cell (either prokaryotic or eukaryotic). Thus, a typical expression cassette comprises a promoter operably linked to, for example, a nucleic acid sequence encoding a ZFP, and the required signals, for example, for efficient polyadenylation of the transcript, transcription termination, ribosome binding site, or translation termination. Additional elements of the cassette may include, for example, enhancers, and heterologous splicing signals.

The particular expression vector used to transport the genetic information into the cell is selected according to the intended use of the ZFP, e.g., expression in plants, animals, bacteria, fungi, protozoa, etc. (see expression vectors described below). Standard bacterial and animal expression vectors are known in the art and are described in detail in, for example, U.S. patent publication No.20050064474 a1 and international patent publication No. WO 05/084190; WO 05/014791; and WO03/080809, the disclosures of which are incorporated by reference.

Standard transfection methods can be used to produce bacterial, mammalian, yeast or insect cell lines that express large quantities of protein, which can then be purified using standard techniques (see, e.g., Colley et al, J. Biol. Chem. 264: 17619-17622 (1989); Guide to Protein Purification, in Methods in Enzymology, Vol. 182 (Deutscher, ed., 1990)). Transformation of eukaryotic and prokaryotic cells is performed according to standard techniques (see, e.g., Morrison, J. Bact. 132: 349-351 (1977); Clark-Curtiss and Curtiss, Methods in Enzymology 101: 347-362 (Wu et al, eds., 1983)).

Any of the well-known procedures for introducing foreign nucleotide sequences into such host cells may be used. These include the use of calcium phosphate transfection, Polybrene, protoplast fusion, electroporation, ultrasonic methods (e.g., sonoporation), liposomes, microinjection, naked DNA, plasmid vectors, viral vectors (both episomal and integrative), and any other well-known method for introducing cloned genomic DNA, cDNA, synthetic DNA, or other foreign genetic material into a host cell (see, e.g., Sambrook et al, supra). It is only necessary that the specific genetic engineering protocol used be able to successfully introduce at least one gene into a host cell capable of expressing a selected protein.

Delivering nucleic acids to plant cells

As described above, DNA constructs may be introduced into (e.g., into the genome of) a desired plant host by a variety of conventional techniques. For reviews of such techniques see, for example, Weissbach and Weissbach, Methods for Plant Molecular Biology (1988, Academic Press, N.Y.), Section VIII, pp. 421-; and Grierson and Corey, Plant Molecular Biology (1988, 2nd Ed.), Blackie, London, Ch. 7-9.

For example, the DNA construct may be introduced directly into the genomic DNA of a plant cell using techniques such as electroporation and microinjection of plant cell protoplasts, or the DNA construct can be introduced directly into plant tissue using biolistic methods such as DNA particle bombardment (see, e.g., Klein et al (1987) Nature 327: 70-73). Alternatively, the DNA construct may be combined with suitable T-DNA flanking regions and introduced into a conventional Agrobacterium tumefaciens host vector. Agrobacterium tumefaciens-mediated transformation techniques, including disarming and the use of binary vectors, are well described in the scientific literature. See, e.g., Horsch et al (1984) Science 233: 496-498, and Fraley et al (1983) Proc. Nat'l. Acad. Sci. USA 80: 4803.

Furthermore, gene transfer can be achieved using bacteria other than Agrobacterium, or viruses, such as Rhizobium sp. NGR234, Sinorhizobium meliloti, Mesorhizobium loti, potexvirus, cauliflower mosaic virus, cassava vein mosaic virus and/or tobacco mosaic virus. See, e.g., Chung et al (2006) Trends Plant Sci. 11(1): 1-4.

The virulence functions of the A. tumefaciens host will direct the insertion of the construct and adjacent marker into the plant cell DNA when the cell is infected by the bacteria using a binary T-DNA vector (Bevan (1984) Nuc. Acid Res. 12: 8711-8721) or the co-cultivation procedure (Horsch et al (1985) Science 227: 1229-1231). Generally, the Agrobacterium transformation system is used to engineer dicotyledonous plants (Bevan et al (1982) Ann. Rev. Genet. 16: 357-384; Rogers et al (1986) Methods Enzymol. 118: 627-641). The Agrobacterium transformation system may also be used to transform, and to transfer DNA to, monocotyledonous plants and plant cells. See U.S. Pat. No. 5,591,616; Hernalsteen et al (1984) EMBO J. 3: 3039-3041; Hooykaas-Van Slogteren et al (1984) Nature 311: 763-764; Grimsley et al (1987) Nature 325: 1677-; Boulton et al (1989) Plant Mol. Biol. 12: 31-40; and Gould et al (1991) Plant Physiol. 95: 426-434.

Alternative methods of gene transfer and transformation include, but are not limited to, protoplast transformation through calcium-, polyethylene glycol (PEG)- or electroporation-mediated uptake of naked DNA (see Paszkowski et al (1984) EMBO J 3: 2717-2722; Potrykus et al (1985) Molec. Gen. Genet. 199: 169-177; Fromm et al (1985) Proc. Nat. Acad. Sci. USA 82: 5824-5828; and Shimamoto (1989) Nature 338: 274-276) and electroporation of plant tissues (D'Halluin et al (1992) Plant Cell 4: 1495-1505). Additional methods for plant cell transformation include microinjection, silicon carbide-mediated DNA uptake (Kaeppler et al (1990) Plant Cell Reporter 9: 415-.

The disclosed methods and compositions can be used to insert an exogenous sequence into a predetermined location in the genome of a plant cell. This is useful because the expression of a transgene introduced into the plant genome is critically dependent on its site of integration. Thus, genes encoding, for example, nutrients, antibiotics or therapeutic molecules can be inserted by targeted recombination into regions of the plant genome that facilitate their expression.

Transformed plant cells produced by any of the above transformation techniques can be cultured to regenerate a whole plant that possesses the transformed genotype and thus the desired phenotype. Such regeneration techniques rely on the manipulation of certain phytohormones in the tissue culture growth medium, and typically on a biocide and/or herbicide marker that has been introduced together with the desired nucleotide sequence. Regeneration of plants from cultured protoplasts is described in Evans et al, "Protoplasts Isolation and Culture" in Handbook of Plant Cell Culture, pp. 124-176, Macmillian Publishing Company, New York, 1983; and Binding, Regeneration of Plants, Plant Protoplasts, pp. 21-73, CRC Press, Boca Raton, 1985. Regeneration can also be obtained from plant callus, explants, organs, pollen, embryos or parts thereof. Such regeneration techniques are described generally in Klee et al (1987) Ann. Rev. of Plant Phys. 38: 467-486.

The nucleic acid introduced into the plant cell can be used to confer a desired trait on essentially any plant. The nucleic acid constructs of the present disclosure and the various transformation methods described above can be used to engineer a wide variety of plants and plant cell systems for the desirable physiological and agronomic characteristics described herein. In certain embodiments, the plants and plant cells of interest for modification include, but are not limited to, monocotyledonous and dicotyledonous plants, such as crop plants including cereal crops (e.g., wheat, corn, rice, millet, barley), fruit crops (e.g., tomato, apple, pear, strawberry, orange), forage crops (e.g., alfalfa), root vegetable crops (e.g., carrot, potato, beet, yam), leafy vegetable crops (e.g., lettuce, spinach); flowering plants (e.g., petunia, rose, chrysanthemum), conifers and pine trees (e.g., fir, spruce); plants used in phytoremediation (e.g., heavy metal accumulating plants); oil crops (e.g., sunflower, rapeseed) and plants used for experimental purposes (e.g., Arabidopsis). As such, the disclosed methods and compositions find use in a wide range of plants, including but not limited to species from the genera Asparagus, Avena, Brassica, Citrus, Citrullus, Capsicum, Cucurbita, Daucus, Glycine, Gossypium, Hordeum, Lactuca, Lycopersicon, Malus, Manihot, Nicotiana, Oryza, Persea, Pisum, Pyrus, Prunus, Raphanus, Secale, Solanum, Sorghum, Triticum, Vitis, Vigna, and Zea.

One skilled in the art will recognize that after the expression cassette is stably incorporated into a transgenic plant and is proven to be operable, it can be introduced into other plants by sexual crossing. Any of a number of standard breeding techniques may be used, depending on the species to be crossed.

Transformed plant cells, callus, tissues or plants may be identified and isolated by selecting or screening the transformed plant material for a trait encoded by a marker gene present on the transforming DNA. For example, selection can be performed by growing the engineered plant material on a medium containing an inhibitory amount of the antibiotic or herbicide to which the transforming gene construct confers resistance. In addition, transformed plants and plant cells can also be identified by screening for the activity of any visible marker gene (e.g., β-glucuronidase, luciferase, B or C1 genes) that may be present on the recombinant nucleic acid construct. Such selection and screening methods are well known to those skilled in the art.

Physical and biochemical methods can also be used to identify plants or plant cell transformants containing the inserted gene construct. These methods include, but are not limited to: 1) Southern analysis or PCR amplification for detecting and determining the structure of the recombinant DNA insert; 2) Northern blotting, S1 RNase protection, primer extension or reverse transcriptase-PCR amplification for detecting and examining RNA transcripts of the gene construct; 3) enzymatic assays for detecting enzyme or ribozyme activity, where such gene products are encoded by the gene construct; 4) protein gel electrophoresis, Western blotting techniques, immunoprecipitation, or enzyme-linked immunoassays, where the gene construct product is a protein. Other techniques, such as in situ hybridization, enzymatic staining, and immunostaining, can also be used to detect the presence or expression of the recombinant construct in specific plant organs and tissues. Methods for performing all of these assays are well known to those skilled in the art.

The effect of genetic manipulation using the methods disclosed herein can be observed, for example, by Northern blotting of RNA (e.g., mRNA) isolated from a tissue of interest. Typically, if the amount of mRNA is increased, it can be assumed that the corresponding endogenous gene is being expressed at a faster rate than before. Other methods of measuring gene and/or CYP74B activity may be used. Different types of enzyme assays may be used, depending on the substrate used and the method of detecting an increase or decrease in reaction products or byproducts. In addition, the level of expressed protein and/or CYP74B protein can be measured by immunochemical methods, i.e., ELISA, RIA, EIA and other antibody-based assays known to those skilled in the art, such as by electrophoretic detection assays (either using staining or using Western blotting). The transgene may be selectively expressed in some tissues of the plant or at some developmental stage, or the transgene may be expressed in substantially all plant tissues, substantially throughout its life cycle. However, any combination of expression patterns is also applicable.

The present disclosure also encompasses seeds of the transgenic plants described above, wherein the seeds have a transgene or genetic construct. The present disclosure also encompasses progeny, clones, cell lines or cells of the above transgenic plants, wherein said progeny, clones, cell lines or cells have said transgene or genetic construct.

The ZFPs and expression vectors encoding the ZFPs can be administered directly to plants for targeted cleavage and/or recombination.

An effective amount is administered by any of the routes normally used for introducing a ZFP into ultimate contact with the plant cell to be treated. The ZFPs are administered in any suitable manner. Suitable methods of administering such compositions are available and well known to those skilled in the art and, although more than one route can be used to administer a particular composition, a particular route can often provide a more immediate and more effective response than another route.

Carriers may also be used and are determined in part by the particular composition being administered, as well as by the particular method used to administer the composition. Accordingly, a wide variety of suitable formulations of pharmaceutical compositions is available (see, e.g., Remington's Pharmaceutical Sciences, 17th edition, 1985).

Applications

Zinc finger proteins comprising one or more non-canonical zinc fingers as described herein are useful for all genome regulation and editing applications currently performed using canonical C2H2 ZFPs, including but not limited to: gene activation; gene repression; genome editing (cleavage, targeted insertion, replacement or deletion); and epigenome editing (via targeted covalent modification of histones or DNA).

ZFNs comprising the non-canonical zinc fingers disclosed herein can be used to cleave DNA at a region of interest in cellular chromatin (e.g., at a desired or predetermined site in the genome, for example, in a mutant or wild-type gene). For such targeted DNA cleavage, a zinc finger binding domain is engineered to bind a target site at or near the predetermined cleavage site, and a fusion protein comprising the engineered zinc finger binding domain and a cleavage domain is expressed in a cell. After the zinc finger portion of the fusion protein binds to the target site, the DNA is cleaved near the target site by the cleavage domain. The exact site of cleavage may depend on the length of the ZC linker.

Alternatively, two ZFNs, each comprising a zinc finger binding domain and a cleavage half-domain, are expressed in the cell and bind to the target site, juxtaposed in a manner such that a functional cleavage domain is reconstituted and DNA is cleaved near the target site. In one embodiment, cleavage occurs between the target sites of the two zinc finger binding domains. One or both of the zinc finger binding domains may be engineered.

For targeted cleavage using a zinc finger binding domain-cleavage domain fusion polypeptide, the binding site may encompass the cleavage site, or the proximal edge of the binding site may be 1, 2, 3, 4, 5, 6, 10, 25, 50 or more nucleotides (or any integer value between 1 and 50 nucleotides) from the cleavage site. The exact location of the binding site relative to the cleavage site will depend on the particular cleavage domain and on the length of the ZC linker. For approaches in which two fusion proteins are used, each comprising a zinc finger binding domain and a cleavage half-domain, the binding sites typically straddle the cleavage site. Thus, the proximal edge of the first binding site can be 1, 2, 3, 4, 5, 6, 10, 25, or more nucleotides (or any integer value between 1 and 50 nucleotides) to one side of the cleavage site, while the proximal edge of the second binding site can be 1, 2, 3, 4, 5, 6, 10, 25, or more nucleotides (or any integer value between 1 and 50 nucleotides) to the other side of the cleavage site. Methods for mapping the cleavage site in vitro and in vivo are known to those skilled in the art.

Once introduced into, or expressed in, the target cell, the fusion protein binds to the target sequence and cleaves at or near the target sequence. The exact site of cleavage depends on the nature of the cleavage domain and/or the presence and/or nature of linker sequences between the binding and cleavage domains. In cases where two ZFNs, each comprising a cleavage half-domain, are used, the distance between the proximal edges of the binding sites may be 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 25 or more nucleotides (or any integer value between 1 and 50 nucleotides). Optimal levels of cleavage may also depend on both the distance between the binding sites of the two ZFNs (see, e.g., Smith et al (2000) Nucleic Acids Res. 28: 3361-. See also U.S. Patent Publication 20050064474A1 and International Patent Publications WO 05/084190, WO 05/014791 and WO 03/080809, the disclosures of which are incorporated by reference.

Two ZFNs, each comprising a cleavage half-domain, can bind with the same or opposite polarity in the region of interest, and their binding sites (i.e., target sites) can be separated by any number of nucleotides (e.g., from 0 to 50 nucleotide pairs or any integer value therebetween). In certain embodiments, the binding sites of two fusion proteins each comprising a zinc finger binding domain and a cleavage half-domain may be separated by between 5 and 18 nucleotide pairs, e.g., by 5-8 nucleotide pairs, or by 15-18 nucleotide pairs, or by 6 nucleotide pairs, or by 16 nucleotide pairs, or within 10 nucleotide pairs from each other, as measured by the closest edge of each binding site to the other binding site, and cleavage occurs between the binding sites.
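The spacing described above can be illustrated with a minimal Python sketch. It computes the number of nucleotide pairs separating two binding sites, measured between their nearest edges, and checks the result against the range of 5 to 18 nucleotide pairs mentioned for certain embodiments. The coordinates, the half-open interval convention, and the function names are assumptions made purely for illustration and are not part of the disclosure.

```python
def gap_between_sites(site_a, site_b):
    """Nucleotide pairs between the nearest edges of two binding sites.

    Each site is a half-open (start, end) interval on one coordinate axis
    (an illustrative convention, not part of the disclosure).
    """
    (start_1, end_1), (start_2, end_2) = sorted([site_a, site_b])
    if start_2 < end_1:
        raise ValueError("binding sites overlap")
    return start_2 - end_1


def within_spacing(gap, low=5, high=18):
    """True if the gap lies in the inclusive range [low, high]."""
    return low <= gap <= high


# Hypothetical 12-bp sites separated by 6 nucleotide pairs.
left_site = (100, 112)
right_site = (118, 130)
gap = gap_between_sites(left_site, right_site)
print(gap, within_spacing(gap))
```

The default bounds can be narrowed (for example to 5-8 or 15-18) to test the other spacings recited above.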

The site at which the DNA is cleaved generally lies between the binding sites of the two fusion proteins. A double-strand break in DNA often results from two single-strand breaks, or "nicks," offset by 1, 2, 3, 4, 5, 6, or more nucleotides (e.g., cleavage of double-stranded DNA by native Fok I results from single-strand breaks offset by 4 nucleotides). Thus, cleavage does not necessarily occur at exactly opposite sites on each DNA strand. In addition, the structure of the fusion proteins and the distance between the target sites can influence whether cleavage occurs adjacent to a single nucleotide pair, or whether cleavage occurs at several sites. However, for many applications (including targeted recombination and targeted mutagenesis), cleavage within a range of nucleotides is generally sufficient, and cleavage between particular base pairs is not required.
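To make the idea of offset nicks concrete, the following Python sketch returns the stretch of top-strand sequence that lies between a top-strand nick and a bottom-strand nick offset from it by a given number of nucleotides (4 by default, matching the Fok I figure given above). The example sequence, the nick position, and the coordinate convention are hypothetical and chosen only for illustration.

```python
def overhang_between_nicks(top_strand, top_nick, offset=4):
    """Return the top-strand bases lying between two offset nicks.

    top_nick is the index of the first base after the top-strand nick;
    the bottom-strand nick is assumed to lie `offset` bases further along
    (both positions expressed in top-strand coordinates).
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    if not 0 <= top_nick <= len(top_strand) - offset:
        raise ValueError("nicks fall outside the sequence")
    return top_strand[top_nick:top_nick + offset]


# Hypothetical 16-bp sequence; nick position chosen for illustration.
sequence = "AAAACCCCGGGGTTTT"
print(overhang_between_nicks(sequence, top_nick=6))            # 4-nt stagger
print(overhang_between_nicks(sequence, top_nick=6, offset=0))  # blunt cut
```

An offset of zero corresponds to nicks at exactly opposite positions, in which case no single-stranded stretch is left between them.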

As described above, one or more fusion proteins can be expressed in a cell after introduction of the polypeptides and/or polynucleotides into the cell. For example, two polynucleotides, each comprising a sequence encoding one of the above polypeptides, can be introduced into a cell; when the polypeptides are expressed and each binds to its target sequence, cleavage occurs at or near the target sequence. Alternatively, a single polynucleotide comprising sequences encoding both fusion polypeptides is introduced into the cell. The polynucleotide may be DNA, RNA or any modified form or analog of DNA and/or RNA.

In certain embodiments, targeted cleavage by ZFNs in a genomic region results in a change in the nucleotide sequence of the region following repair of the cleavage event by non-homologous end joining (NHEJ).

In other embodiments, targeted cleavage by ZFNs in a genomic region may also be part of a procedure in which a genomic sequence (e.g., a region of interest in cellular chromatin) is replaced with a homologous but non-identical sequence via a homology-dependent mechanism (i.e., by targeted recombination), for example by insertion of a donor sequence comprising an exogenous sequence together with one or more sequences that are either identical to, or homologous but not identical to, a predetermined genomic sequence (i.e., a target site). Because double-strand breaks in cellular DNA stimulate cellular repair mechanisms several thousand-fold in the vicinity of the cleavage site, targeted cleavage with the ZFNs described herein allows sequence alteration or replacement (via homology-directed repair) at virtually any site in the genome.

In addition to the ZFNs described herein, targeted replacement of selected genomic sequences also requires the introduction of exogenous (donor) polynucleotides. The donor polynucleotide can be introduced into the cell prior to, concurrently with, or after expression of the ZFN. The donor polynucleotide contains sufficient homology to the genomic sequence to support homologous recombination (or homology-directed repair) between it and the genomic sequence to which it has homology. Sequence homology of about 25, 50, 100, 200, 500, 750, 1,000, 1,500, 2,000 nucleotides or more (or any integer value between 10 and 2,000 nucleotides, or more) will support homologous recombination. The length of the donor polynucleotide can range from 10 to 5,000 nucleotides (or any integer value of nucleotides therebetween) or longer.

It will be apparent that typically the nucleotide sequence of the donor polynucleotide is not identical to the genomic sequence it replaces. For example, the sequence of the donor polynucleotide may contain one or more substitutions, insertions, deletions, inversions or rearrangements relative to the genomic sequence, so long as there is sufficient homology to the chromosomal sequence. Such sequence variations may be of any size, and may be as small as a single nucleotide pair. Alternatively, the donor polynucleotide may comprise a non-homologous sequence (i.e., an exogenous sequence, which is distinguished from an exogenous polynucleotide) flanked by two regions of homology. Alternatively, the donor polynucleotide may constitute a vector molecule which contains a sequence which is not homologous to a region of interest in cellular chromatin. Generally, the homologous regions of the donor polynucleotide will have at least 50% sequence identity with the genomic sequence for which recombination is desired. In certain embodiments, there is 60%, 70%, 80%, 90%, 95%, 98%, 99%, or 99.9% sequence identity. Any number between 1% and 100% sequence identity may be present, depending on the length of the donor polynucleotide.
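As an illustration of the sequence-identity figures above, the following Python sketch compares a donor homology region with the corresponding genomic sequence position by position and reports percent identity. It assumes two ungapped sequences of equal length, which is a simplification; a real comparison would normally rely on a proper alignment tool. The sequences and the function name are hypothetical.

```python
def percent_identity(donor_region, genomic_region):
    """Ungapped, position-by-position percent identity of two sequences."""
    if len(donor_region) != len(genomic_region):
        raise ValueError("sequences must have equal length (no gaps handled)")
    if not donor_region:
        raise ValueError("sequences must not be empty")
    matches = sum(
        a == b for a, b in zip(donor_region.upper(), genomic_region.upper())
    )
    return 100.0 * matches / len(donor_region)


# Hypothetical 10-nt regions differing at a single position.
genomic = "ACGTACGTAC"
donor = "ACGTACGTAA"
print(percent_identity(donor, genomic))
```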

The donor molecule may comprise several discrete regions of homology to cellular chromatin. For example, for targeted insertion of a sequence that is not normally present in the region of interest, the sequence may be present in the donor nucleic acid molecule and flanked by regions of homology to the sequence in the region of interest.

To simplify the assays (e.g., hybridization, PCR, restriction enzyme digestion) used to determine successful insertion of a sequence from a donor polynucleotide, certain sequence differences may be present in the donor sequence compared to the genomic sequence. Preferably, such nucleotide sequence differences, if located in the coding region, do not alter the amino acid sequence or produce silent amino acid changes (i.e., changes that do not affect protein structure or function). The donor polynucleotide can optionally comprise a change in the sequence in the region of interest corresponding to the zinc finger domain binding site to prevent cleavage of a donor sequence that has been introduced into cellular chromatin by homologous recombination.
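The preference for differences that leave the amino acid sequence unchanged can be checked with a short Python sketch that translates the genomic and donor coding sequences using the standard genetic code and compares the results. The two 9-nucleotide sequences are hypothetical; the donor in this example happens to contain CTCGAG, the recognition sequence of the restriction enzyme XhoI, as one illustration of a difference that could simplify a restriction digestion assay. Function and variable names are assumptions for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence):
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    seq = coding_sequence.upper()
    if len(seq) % 3:
        raise ValueError("length must be a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def changes_are_silent(genomic, donor):
    """True if the sequences differ but encode the same amino acids."""
    return genomic.upper() != donor.upper() and translate(genomic) == translate(donor)


# Hypothetical in-frame fragments of a coding region.
genomic = "CTGGAAGGT"
donor = "CTCGAGGGT"
print(translate(genomic), translate(donor), changes_are_silent(genomic, donor))
```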

The polynucleotide may be introduced into the cell as part of a vector molecule having additional sequences, such as, for example, an origin of replication, a promoter, and a gene encoding antibiotic resistance. In addition, the donor polynucleotide may be introduced as naked nucleic acid, as nucleic acid complexed with an agent such as a liposome or a poloxamer, or may be delivered by bacteria or viruses, such as Agrobacterium, Rhizobium sp. NGR234, Sinorhizobium meliloti, Mesorhizobium loti, tobacco mosaic virus, potexvirus, cauliflower mosaic virus, and cassava vein mosaic virus. See, e.g., Chung et al (2006) Trends Plant Sci. 11(1): 1-4.

For chromosomal sequence changes, it is not necessary to copy the entire sequence of the donor into the chromosome, as long as the donor sequence is copied sufficiently to effect the desired sequence change.

The efficiency of insertion of a donor sequence by homologous recombination is inversely proportional to the distance between the double-strand break in the cellular DNA and the site at which recombination is desired. In other words, higher efficiency of homologous recombination is observed when the double-strand break is closer to the site where recombination is desired. In cases where the exact recombination site is not predetermined (e.g., the desired recombination event can occur within a stretch of genomic sequence), the length and sequence of the donor nucleic acid, and the cleavage site, are selected to obtain the desired recombination event. Where the desired event is designed to alter the sequence of a single nucleotide pair in the genomic sequence, cellular chromatin is cleaved within 10,000 nucleotides on either side of that nucleotide pair. In certain embodiments, cleavage occurs within 1,000, 500, 200, 100, 90, 80, 70, 60, 50, 40, 30, 20, 10, 5, or 2 nucleotides, or any integer value between 2 and 1,000 nucleotides, on either side of the nucleotide pair whose sequence is to be changed.
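Because efficiency falls as the break moves away from the desired recombination site, one simple way to compare candidate cleavage sites is by their distance from the nucleotide pair to be changed. The Python sketch below picks the nearest candidate and rejects it if it lies beyond a chosen limit (1,000 nucleotides here, one of the values recited above). The positions and the function name are hypothetical, and distance is used only as a proxy; the sketch does not model actual recombination efficiency.

```python
def closest_cut(candidate_cuts, target_position, max_distance=1000):
    """Return (cut, distance) for the candidate nearest the target.

    Returns None if there are no candidates or if the nearest one lies
    more than max_distance nucleotides from the target position.
    """
    if not candidate_cuts:
        return None
    best = min(candidate_cuts, key=lambda cut: abs(cut - target_position))
    distance = abs(best - target_position)
    return (best, distance) if distance <= max_distance else None


# Hypothetical cleavage positions and target nucleotide pair.
cuts = [1200, 5400, 9800]
print(closest_cut(cuts, target_position=5000))
print(closest_cut([1200], target_position=5000))
```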

Targeted insertion of exogenous sequences into genomic regions is achieved by targeted cleavage in the genomic region using ZFNs, while providing an exogenous (donor) polynucleotide comprising the exogenous sequence. Typically, the donor polynucleotide further comprises sequences flanking the exogenous sequence that contain sufficient homology to the genomic region to support homology-directed repair of a double-stranded break in the genomic sequence, thereby inserting the exogenous sequence into the genomic region. Thus, the donor nucleic acid can be any size sufficient to support integration of the exogenous sequence by homology-dependent repair mechanisms (e.g., homologous recombination). Without wishing to be bound by any particular theory, it is believed that the regions of homology flanking the exogenous sequence provide a template for the chromosomal end of the break to resynthesize the genetic information at the site of the double-strand break.

Targeted integration of the exogenous sequences described above can be used to insert a marker gene at a selected chromosomal location. Marker genes include, but are not limited to, sequences encoding proteins that mediate antibiotic resistance (e.g., ampicillin resistance, neomycin resistance, G418 resistance, puromycin resistance), sequences encoding colored or fluorescent or luminescent proteins (e.g., green fluorescent protein, enhanced green fluorescent protein, red fluorescent protein, luciferase), and proteins that mediate enhanced cell growth and/or gene amplification (e.g., dihydrofolate reductase). Thus, exemplary marker genes include, but are not limited to, β-glucuronidase (GUS), phosphinothricin N-acetyltransferase (PAT, BAR), neomycin phosphotransferase, β-lactamase, catechol dioxygenase, α-amylase, tyrosinase, β-galactosidase, luciferase, aequorin, EPSP synthase, nitrilase, acetolactate synthase (ALS), dihydrofolate reductase (DHFR), dalapon dehalogenase, and anthranilate synthase. In certain embodiments, targeted integration may be used to insert RNA expression constructs, such as sequences responsible for regulated expression of microRNAs or siRNAs. Promoters, enhancers or other transcriptional regulatory sequences may also be incorporated into the RNA expression constructs.

Further increases in the efficiency of targeted recombination, in cells comprising a zinc finger/nuclease fusion molecule and a donor DNA molecule, are achieved by blocking the cells in the G₂ phase of the cell cycle, when homology-driven repair processes are maximally active. Such arrest can be achieved in a number of ways. For example, cells can be treated with drugs, compounds and/or small molecules that influence cell cycle progression so as to arrest the cells in G₂ phase. Exemplary molecules of this type include, but are not limited to, compounds that affect microtubule polymerization (e.g., vinblastine, nocodazole, Taxol), compounds that interact with DNA (e.g., cis-platinum(II) diammine dichloride, cisplatin, doxorubicin), and/or compounds that affect DNA synthesis (e.g., thymidine, hydroxyurea, L-mimosine, etoposide, 5-fluorouracil). Additional increases in recombination efficiency are achieved by the use of histone deacetylase (HDAC) inhibitors (e.g., sodium butyrate, trichostatin A), which alter chromatin structure to make genomic DNA more accessible to the cellular recombination machinery.

Other methods for cell cycle arrest include over-expressing a protein that inhibits CDK cell cycle kinase activity, for example by introducing into the cell a cDNA encoding the protein or by introducing into the cell an engineered ZFP that activates expression of a gene encoding the protein. Cell cycle arrest can also be achieved by inhibiting the activity of cyclins and CDKs, for example, using RNAi methods (e.g., U.S. patent No.6,506,559), or by introducing into a cell an engineered ZFP that suppresses the expression of one or more genes involved in the progression of the cell cycle, such as, for example, cyclin and/or CDK genes. See, e.g., commonly owned U.S. Pat. No.6,534,261 for methods of synthesizing engineered zinc finger proteins for use in modulating gene expression.

As described above, the disclosed methods and compositions for targeted cleavage can be used to induce mutations in genomic sequences. Targeted cleavage can also be used to generate gene knockouts, e.g., for functional genomics or target validation, and to facilitate targeted insertion of sequences into the genome (i.e., knock-in). Insertion can be achieved by replacing chromosomal sequences through homologous recombination or by targeted integration, in which a new sequence (i.e., a sequence not present in the region of interest), flanked by sequences homologous to the region of interest in the chromosome, is inserted at a predetermined target site. The same methods can also be used to replace a wild-type sequence with a mutant sequence, or to convert one allele to a different allele.

Targeted cleavage of an infected or integrated plant pathogen can be used to treat pathogen infection in a plant host, for example by cleaving the genome of the pathogen such that its pathogenicity is reduced or eliminated. In addition, targeted cleavage of genes encoding plant viral receptors can be used to block expression of such receptors, thereby preventing viral infection and/or viral transmission in plants.

Exemplary plant pathogens include, but are not limited to, plant viruses such as Alfamovirus, Alphacryptovirus, Badnavirus, Betacryptovirus, Bigeminivirus, Bromovirus, Bymovirus, Capillovirus, Carlavirus, Carmovirus, Caulimovirus, Closterovirus, Comovirus, Cucumovirus, Cytorhabdovirus, Dianthovirus, Enamovirus, Fijivirus, Ipomovirus, Luteovirus, Machlomovirus, Macluravirus, Marafivirus, Monogeminivirus, Nanavirus, Necrovirus, Nepovirus, Nucleorhabdovirus, Oryzavirus, Ourmiavirus, Phytoreovirus, Potexvirus, Potyvirus, Rymovirus, satellite RNAs, Satellivirus, Sequivirus, Tobamovirus, Tobravirus, Tymovirus, Umbravirus, Varicosavirus, and Waikavirus; fungal pathogens such as smuts (e.g., Ustilaginales), rusts (Uredinales), ergot (Claviceps purpurea) and mildew; molds (Oomycetes) such as Phytophthora infestans (potato late blight); bacterial pathogens such as Erwinia (e.g., E. herbicola), Pseudomonas (e.g., P. aeruginosa, P. syringae, P. fluorescens and P. putida), Ralstonia (e.g., R. solanacearum), Agrobacterium and Xanthomonas; nematodes (Nematoda); and Phytomyxea (Polymyxa and Plasmodiophora).

The disclosed methods for targeted recombination can be used to replace any genomic sequence with a homologous but non-identical sequence. For example, a mutant genomic sequence can be replaced with its wild-type counterpart, thereby providing methods for treating plant diseases, providing resistance to plant pathogens, increasing crop yields, and the like. Similarly, one allele of a gene may be replaced with a different allele using the targeted recombination methods disclosed herein.

In many of these cases, the region of interest comprises a mutation, while the donor polynucleotide comprises the corresponding wild-type sequence. Similarly, the wild-type genomic sequence may be replaced with a mutant sequence if so desired. Indeed, any pathology that depends in any way on a particular genomic sequence can be corrected or alleviated using the methods and compositions disclosed herein.

Targeted cleavage and targeted recombination can also be used to alter non-coding sequences (e.g., regulatory sequences such as promoters, enhancers, initiators, terminators, splice sites) to alter the expression level of a gene product. Such methods can be used, for example, for therapeutic purposes, alterations in cellular physiology and biochemistry, functional genomics, and/or target validation studies.

The methods and compositions described herein can also be used to activate and repress gene expression using fusions between non-canonical zinc finger binding domains and functional domains. Such methods are disclosed, for example, in commonly owned U.S. Pat. Nos. 6,534,261; 6,824,978; and 6,933,113, the disclosures of which are incorporated by reference. Other repression methods include the use of antisense oligonucleotides and/or small interfering RNAs (siRNAs or RNAi) targeting the gene sequence to be repressed.

In other embodiments, one or more fusions between a zinc finger binding domain and a recombinase (or functional fragment thereof) in addition to or in place of the zinc finger-cleavage domain fusions disclosed herein may be used to facilitate targeted recombination. See, e.g., commonly owned U.S. Pat. No.6,534,261 and Akopian et al (2003) Proc.Natl.Acad.Sci.USA 100: 8688-8691.

In other embodiments, the disclosed methods and compositions can be used to provide fusions of ZFP binding domains with transcriptional activation or repression domains whose activity requires dimerization (either homo-or heterodimerization). In these cases, the fusion polypeptide comprises a zinc finger binding domain and a functional domain monomer (e.g., a monomer from a dimeric transcription activation or repression domain). Binding of two such fusion polypeptides to appropriately located target sites allows dimerization, such that functional transcriptional activation or repression domains are reconstituted.

Examples

The invention is further explained in the following examples, in which all parts and percentages are by weight and temperatures are on the celsius scale, unless otherwise indicated. It should be understood that these examples, while indicating certain embodiments of the invention, are given by way of illustration only. From the above discussion and these examples, one skilled in the art can ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various usages and conditions.

Example 1: ZFN expression vector

Expression vectors comprising sequences encoding the 4-finger ZFNs (referred to as "5-8" and "5-9") described in Examples 2 and 14 of U.S. Patent Publication 2005/0064474 (the disclosure of which is incorporated by reference) (see Example 2 of this application) were modified as follows. Briefly, 5-8 and 5-9 ZFNs (which comprise 4 zinc finger domains fused via a 4 amino acid ZC linker to the nuclease domain of the type IIS restriction enzyme Fok I (Wah et al (1998) amino acid 384-. Additional modifications (substitutions and insertions) were also made to residues between the C-terminal His and Cys zinc-coordinating residues and/or to residues C-terminal to the C-terminal Cys of finger 2 and/or finger 4.

Example 2: gene correction of eGFP in reporter cell lines

The ability of ZFNs comprising CCHC zinc fingers as described herein to promote homologous recombination was tested in the GFP system described in Urnov (2005) Nature 435(7042): 646-51 and U.S. Patent Publication No. 20050064474 (e.g., Examples 6-11). Briefly, 50 ng of each ZFN and 500 ng of the promoter-less GFP donor (Urnov (2005) Nature) were transfected into 500,000 reporter cells per sample, using 2 μL of Lipofectamine 2000 per sample according to the Invitrogen Lipofectamine 2000 protocol.

Vinblastine was added at a final concentration of 0.2 μM 24 hours post-transfection and was removed 72 hours post-transfection.

The cells were assayed for GFP expression 5 days post-transfection by measuring 40,000 cells per transfection on a Guava benchtop FACS analyzer.

As shown in fig. 1, most ZFNs comprising the altered CCHC zinc fingers shown in tables 1 and 2 above promote homologous recombination at the reporter (GFP) locus, resulting in GFP expression at levels higher than unmodified CCHC zinc fingers, and several ZFNs perform comparable to ZFNs comprising CCHH zinc fingers. The best performing variant when placed in finger 4(F4) comprises the following sequence (C-terminal to and including the His zinc coordinating residue): HAQRCGLRGSQLV (SEQ ID NO: 53) (zinc finger designated #21 in Table 2 and shown as "2-21" in FIG. 1). The best performing variant when placed in finger 2(F2) comprises the following sequence (C-terminal to and including the His zinc coordinating residue): HIRTCTGSQKP (SEQ ID NO: 75) (the zinc finger designated #43 in Table 2 and shown as "2-43" in FIG. 1).

Example 3: editing chromosomal IL2R gamma gene by targeted recombination

The ZFNs described herein were also tested in the endogenous IL2Rγ assay described in Urnov (2005) Nature 435(7042): 646-51 and Example 2 of U.S. Patent Publication No. 20050064474. Briefly, 2.5 μg of each ZFN expression construct was transfected into 500,000 K562 cells using a Nucleofector (Amaxa). Genomic DNA was harvested and gene disruption was assayed at the endogenous IL2Rγ locus using the Surveyor endonuclease kit.

The ZFNs are shown in the upper left of FIG. 2. Specifically, altered zinc finger 20 refers to a CCHC zinc finger comprising the sequence HTRRCGLRGSQLV; zinc finger 21 comprises the sequence HAQRCGLRGSQLV (SEQ ID NO: 53); zinc finger 43 comprises the sequence HIRTCTGSQKP (SEQ ID NO: 75); zinc finger 45 comprises the sequence HIRTGCTGSQKP; zinc finger 47 comprises the sequence HIRRCTGSQKP; and zinc finger 48 comprises the sequence HIRRGCTGSQKP. Zinc fingers 20 and 21 were used in finger 4 of the 4-finger ZFNs, and zinc fingers 43, 45, 47, and 48 were used in finger 2 of the 4-finger ZFNs.
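The variants listed above differ in the number of residues between the zinc-coordinating His and Cys. The following is a minimal Python sketch that tabulates this spacing from the listed sequences only. The helper name `his_cys_spacing` is hypothetical, and the sketch assumes that each sequence starts at the coordinating His and that the first Cys after it is the coordinating Cys.

```python
# Sequences C-terminal to and including the His zinc-coordinating residue,
# as listed for zinc fingers 20, 21, 43, 45, 47 and 48 in Example 3.
VARIANTS = {
    20: "HTRRCGLRGSQLV",
    21: "HAQRCGLRGSQLV",
    43: "HIRTCTGSQKP",
    45: "HIRTGCTGSQKP",
    47: "HIRRCTGSQKP",
    48: "HIRRGCTGSQKP",
}


def his_cys_spacing(seq):
    """Return (residues between His and first Cys, residues after that Cys)."""
    if not seq.startswith("H"):
        raise ValueError("sequence must start at the coordinating His")
    cys = seq.index("C")  # assumption: the first Cys is the coordinating Cys
    return cys - 1, len(seq) - cys - 1


for number, seq in VARIANTS.items():
    between, tail = his_cys_spacing(seq)
    print(f"zinc finger {number}: H-X{between}-C, {tail} residues after Cys")
```

Under these assumptions, zinc fingers 45 and 48 carry four residues between His and Cys, and the other listed variants carry three.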

The ZFN pairs tested are shown in FIG. 2, above and to the right of the graph, and in Table 5:

TABLE 5

To determine whether mutations had been induced at the cleavage site, the amplification products were analyzed using the Cel-1 assay, in which the amplification products are denatured and renatured, followed by treatment with the mismatch-specific Cel-1 nuclease. See, e.g., Oleykowski et al (1998) Nucleic Acids Res. 26: 4597-4602; Qiu et al (2004) BioTechniques 36: 702-707; Yeung et al (2005) BioTechniques 38: 749-758.
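The text does not state how Cel-1 band intensities were converted into a modification frequency. As an illustration only, the sketch below uses a commonly applied approximation that assumes wild-type and mutant strands reanneal at random, so that the cleaved fraction reflects heteroduplex formation. The function name and the band-intensity inputs are hypothetical.

```python
import math


def percent_modified(parent, cleaved_a, cleaved_b):
    """Estimate % modified alleles from gel band intensities (arbitrary units).

    Assumption (not from the source): random reannealing, so that
    % modified = 100 * (1 - sqrt(1 - fraction_cleaved)).
    """
    total = parent + cleaved_a + cleaved_b
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    fraction_cleaved = (cleaved_a + cleaved_b) / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


# Hypothetical band intensities for one lane: parent band and two cleavage products.
print(round(percent_modified(75.0, 15.0, 10.0), 1))
```

Background noise in a trace, as noted for some samples, would distort the band intensities and therefore any estimate computed this way.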

The results of two experiments are shown in figure 2 for each sample. Experiment #2 of samples 8 and 9 had significant background noise in the traces that reduced the apparent efficacy of these ZFNs.

As shown in FIG. 2, certain CCHC variants performed essentially equivalently to the wild-type C2H2 ZFNs. Zinc finger 21 at finger 4 (samples 5 and 9) gave better results than zinc finger 20 at finger 4 (samples 4 and 8). At finger 2, zinc finger 43 produced the best results.

Example 4: gene correction of eGFP in reporter cell lines

Based on the results shown in fig. 1 and 2, CCHC zinc fingers (referred to as 1a to 10a) shown in tables 3 and 4 above were generated. These zinc fingers were incorporated into 5-8 and 5-9 ZFNs and tested in the GFP gene correction assay described in example 2 above. The ZFN pairs tested in each sample are shown below each bar, with the zinc finger numbers 20, 21, 43, 45, 47 and 48 being those described in example 3, and the CCHC zinc fingers 1a to 10a comprising the sequences shown in tables 3 and 4 above. Zinc fingers 20, 21, 7a, 8a, 9a and 10a are used in finger 4; zinc fingers 43, 45, 47, 48, 1a, 2a, 3a, 4a, 5a, and 6a are used in finger 2.

The results are shown in FIG. 3. The top row below each bar indicates the zinc finger incorporated into ZFN 5-8, and the bottom row below each bar indicates the zinc finger incorporated into ZFN 5-9. For example, the second bar from the left in FIG. 3 refers to a sample transfected with 5-8 and 5-9 ZFNs in which F4 of both ZFNs comprises the sequence of zinc finger 20. As shown, many ZFNs comprising CCHC zinc fingers performed comparably to wild-type (CCHH) ZFNs.

Example 5: design and Generation of targeting vectors

A. General Structure of target sequence

The target construct for tobacco (a dicot) comprises the following 7 components, as shown in FIGS. 4 and 5: i) a hygromycin phosphotransferase (HPT) expression cassette comprising the Arabidopsis thaliana (A. thaliana) ubiquitin-3 (ubi-3) promoter (Callis et al, 1990, J. Biol. Chem. 265: 12486-12493) driving the Escherichia coli HPT gene (Waldron et al, 1985, Plant Mol. Biol. 18: 189-200) terminated by the Agrobacterium tumefaciens (A. tumefaciens) open reading frame-24 (orf-24) 3' untranslated region (UTR) (Gelvin et al, 1987, EP222493); ii) homologous sequence-1, comprising the tobacco (N. tabacum) RB7 matrix attachment region (MAR) (Thompson et al, 1997, WO9727207); iii) a 5' green fluorescent protein (GFP) gene fragment (Evagen Joint Stock Company, Moscow, Russia) driven by a modified Agrobacterium tumefaciens mannopine synthase (Δmas) promoter (Petolino et al, U.S. Pat. No. 6,730,824); iv) a β-glucuronidase (GUS) expression cassette comprising the cassava vein mosaic virus (CsVMV) promoter (Verdaguer et al, 1996, Plant Molecular Biology 31: 1129-1139) driving the GUS gene (Jefferson, 1987, Plant Mol. Biol. Rep. 5: 387-405) terminated by the Agrobacterium tumefaciens nopaline synthase (nos) 3' UTR (DePicker et al, 1982, J. Mol. Appl. Genet. 1: 561-573); v) a 3' GFP gene fragment (Evagen Joint Stock Company, Moscow, Russia) terminated by the Agrobacterium tumefaciens orf-1 3' UTR (Huang et al, J. Bacteriol. 172: 1814-1822); vi) homologous sequence-2, comprising the Arabidopsis thaliana 4-coumaroyl-CoA synthase (4-CoAS) intron-1 (locus At3g21320, GenBank NC 003074); and vii) a Streptomyces viridochromogenes (S. viridochromogenes) phosphinothricin phosphotransferase (PAT) (Wohlleben et al, 1988, Gene 70: 25-37) 3' gene fragment terminated by the Agrobacterium tumefaciens ORF-25/26 3' UTR (Gelvin et al, 1987, EP222493).

A zinc finger-Fok1 fusion protein binding site (IL-1-L0-Fok1) (Urnov et al, 2005, US2005/0064474) was inserted downstream of the CsVMV promoter (Verdaguer et al, 1996, Plant Molecular Biology 31: 1129-1139) and fused to the GUS coding sequence (Jefferson, 1987, Plant Mol. Biol. Rep. 5: 387-405) at its N-terminus. The 5' and 3' GFP gene fragments (Evagen Joint Stock Company, Moscow, Russia) were flanked by two copies of a second zinc finger-Fok1 fusion protein binding site (Scd27-L0-Fok1) (Urnov et al, 2005, US 2005/0064474). Each binding site contained four tandem repeats of the recognition sequence of the particular zinc finger-Fok1 fusion protein, so that each binding site was approximately 200 bp in size (FIG. 6A). This design was intended to ensure that the recognition sequences would be accessible to the zinc finger-Fok1 fusion protein in a complex chromatin environment. Each recognition sequence included an inverted repeat to which a single zinc finger-Fok1 fusion protein bound as a homodimer and cleaved the double-stranded DNA (FIG. 6B). The 5' and 3' GFP gene fragments overlapped by 540 bp, providing homology within the target sequence, and a stop codon was inserted at the 3' end of the 5' GFP fragment to ensure that no functional GFP was translated from the target sequence.
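The layout of a binding site built from tandem inverted repeats can be sketched as follows. This is an illustration under stated assumptions, not the actual design: the half-site, spacer and linker sequences are hypothetical placeholders (the real recognition sequences are those of FIG. 6), and the function names are invented for this sketch.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def recognition_sequence(half_site, spacer):
    """Inverted repeat: half-site, spacer, reverse complement of the half-site."""
    return half_site + spacer + reverse_complement(half_site)


def binding_site(half_site, spacer, linker, repeats=4):
    """Tandem repeats of one recognition sequence, joined by a linker."""
    unit = recognition_sequence(half_site, spacer)
    return linker.join([unit] * repeats)


# Hypothetical placeholder sequences, NOT the IL-1 or Scd27 sites.
HALF_SITE = "GCTGGAGGATGC"  # placeholder 12 bp half-site
SPACER = "GAATTC"           # placeholder palindromic spacer
LINKER = "ATATATATAT"       # placeholder sequence between repeats

unit = recognition_sequence(HALF_SITE, SPACER)
site = binding_site(HALF_SITE, SPACER, LINKER)
print(unit == reverse_complement(unit), len(site))
```

Because the placeholder spacer is itself palindromic, the recognition sequence reads the same on both strands, which is the property that lets one fusion protein species occupy both half-sites as a homodimer.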

Transformation vectors containing the target sequence were generated by a multi-step cloning method as described below.

B. Construction of the HPT binary vector (pDAB1584)

Vector pDAB1400, which contained a GUS expression cassette comprising the Arabidopsis ubi-3 promoter (Callis et al, 1990, J. Biol. Chem. 265: 12486-12493) driving the GUS gene (Jefferson, 1987) terminated by the Agrobacterium tumefaciens orf-1 UTR (Huang et al, J. Bacteriol. 172: 1814-1822), was used as the starting base construct (FIG. 7).

To avoid any unnecessary repeated regulatory elements in the target construct, the Agrobacterium tumefaciens orf-1 UTR (Huang et al, J. Bacteriol. 172: 1814-1822) in pDAB1400 was replaced with the Agrobacterium tumefaciens orf-24 UTR (Gelvin et al, 1987, EP222493), which was excised from pDAB782 (FIG. 8) as a SacI/XbaI fragment and cloned into the same sites in pDAB1400. The resulting construct contained the Arabidopsis thaliana ubi-3 promoter (Callis et al, 1990, J. Biol. Chem. 265: 12486-12493) driving the GUS gene (Jefferson, 1987, Plant Mol. Biol. Rep. 5: 387-405) terminated by the Agrobacterium tumefaciens orf-24 UTR (Gelvin et al, 1987, EP222493) and was designated pDAB1582 (FIG. 9).

The HPT coding sequence (Waldron et al, 1985, Plant Mol. Biol. 18: 189-200) was PCR amplified from the pDAB354 plasmid (FIG. 10) using primers P1 and P2. A BbsI site was added at the 5' end of primer P1, and a SacI site was retained at the 3' end of primer P2. The HPTII PCR fragment was digested with BbsI/SacI and cloned into pDAB1582 digested with NcoI/SacI, replacing the GUS gene with the HPT gene from the PCR fragment. The resulting plasmid was designated pDAB1583 (FIG. 11).

The Arabidopsis thaliana ubi-3/HPT/Agrobacterium tumefaciens orf-24 fragment was then excised from pDAB1583 by NotI digestion and treated with T4 DNA polymerase to generate blunt ends. The blunt-ended HPT expression cassette was cloned into pDAB2407 (FIG. 12), a binary base vector, at the PmeI site, resulting in plasmid pDAB1584 (FIG. 13).

C. Construction of vector (pDAB1580) comprising homologous sequence and binding site for Scd27 Zinc finger-Fok 1 fusion protein

The Agrobacterium tumefaciens orf-1 UTR (Huang et al, J. Bacteriol. 172: 1814-1822) in pDAB2418 (FIG. 14) was replaced with the Agrobacterium tumefaciens orf25/26 UTR (Gelvin et al, 1987, EP222493) to eliminate repeated regulatory sequences in the targeting vector. For the UTR exchange, the Agrobacterium tumefaciens orf25/26 UTR (Gelvin et al, 1987, EP222493) was PCR amplified from the pDAB4045 plasmid (FIG. 15) using primers P3 and P4. SmaI and AgeI sites were added at the 3' end of the PCR fragment, and a SacI site was retained at the 5' end. The pDAB2418 plasmid DNA, which contained a PAT gene expression cassette comprising the Arabidopsis ubiquitin-10 (ubi-10) promoter (Callis et al, 1990, J. Biol. Chem. 265: 12486-12493) driving the PAT gene (Wohlleben et al, 1988, Gene 70: 25-37) terminated by the Agrobacterium tumefaciens orf-1 UTR (Huang et al, J. Bacteriol. 172: 1814-1822), together with the tobacco RB7 MAR sequence (Thompson et al, 1997, WO9727207), was digested with SacI and AgeI, and the two largest fragments were recovered. These fragments were ligated to the SacI/AgeI-digested PCR product of the Agrobacterium tumefaciens orf25/26 UTR (Gelvin et al, 1987, EP222493). The resulting plasmid was designated pDAB1575 (FIG. 16). The tobacco RB7 MAR (Thompson et al, 1997, WO9727207) serves as homologous sequence-1 in the targeting vector.

Intron-1 of arabidopsis 4-CoAS (locus At3g21320, GenBank NC003074) was selected to serve as the homologous sequence-2 in the targeting vector. The coding sequence of the PAT Gene (Wohlleben et al, 1988, Gene 70: 25-37) was analyzed and a site for insertion of an intron was identified 299/300bp downstream of the start codon so that appropriate 5 'and 3' splice sites would be formed. The full-length intron was then fused to the 253bp 3' part of the PAT coding sequence by DNA synthesis (Picoscript Ltd., LLP, Houston, Texas). NotI and SacI sites were added to the 5 'and 3' ends of the DNA fragment, respectively. The synthesized DNA fragment was then digested with NotI/SacI and inserted into pDAB1575 at the same site to replace the full-length PAT coding sequence. The resulting construct was named pDAB1577 (fig. 17).
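The coordinates of the intron insertion can be checked with a short sketch. It uses only the lengths given in the text (a 299 bp 5' part, a 253 bp 3' part, and the 1743 bp intron length stated in Example 10) and assumes the two parts together make up the whole PAT coding sequence. The sequences are random placeholders, the GT and AG intron ends are merely the canonical splice-site dinucleotides, and the function names are hypothetical.

```python
import random

PAT_5PRIME_BP = 299  # 5' part of the PAT coding sequence (from the text)
PAT_3PRIME_BP = 253  # 3' part fused to the intron (from the text)
INTRON_BP = 1743     # intron-1 insertion length stated in Example 10


def split_at_intron_site(cds, position=PAT_5PRIME_BP):
    """Split a coding sequence into the parts 5' and 3' of the intron site."""
    return cds[:position], cds[position:]


def with_intron(cds, intron, position=PAT_5PRIME_BP):
    """Insert an intron between positions `position` and `position + 1`."""
    exon1, exon2 = split_at_intron_site(cds, position)
    return exon1 + intron + exon2


rng = random.Random(0)
# Placeholder sequences only; these are NOT the PAT gene or the 4-CoAS intron.
cds = "".join(rng.choice("ACGT") for _ in range(PAT_5PRIME_BP + PAT_3PRIME_BP))
intron = "GT" + "".join(rng.choice("ACGT") for _ in range(INTRON_BP - 4)) + "AG"

gene = with_intron(cds, intron)
exon1, exon2 = split_at_intron_site(cds)
print(len(exon1), len(exon2), len(gene))
```

Removing the intron from the placeholder gene restores the original coding sequence, which mirrors the requirement that the chosen site yields functional 5' and 3' splice sites.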

A 241 bp DNA fragment containing 4 tandem repeats of the Scd27-L0-Fok1 recognition site (FIG. 6) was synthesized (Picoscript Ltd., LLP, Houston, Texas) with SmaI sites added to both the 5' and 3' ends of the fragment. The synthesized fragment containing the zinc finger-Fok1 binding site was then digested with SmaI and inserted into pDAB1577 at the MscI site. The resulting vector was named pDAB1579 (FIG. 18). A second SmaI-digested fragment containing the zinc finger-Fok1 binding site was then inserted into pDAB1579 at the SwaI site. The resulting construct was designated pDAB1580 (FIG. 19). This vector contained homologous sequences 1 and 2 (tobacco RB7 MAR and Arabidopsis thaliana 4-CoAS intron-1, respectively) and two synthetic Scd27 zinc finger-Fok1 binding sites, each containing 4 tandem repeats of the Scd27-L0-Fok1 recognition site.

D. Construction of a vector comprising two partially repeated non-functional GFP fragments (pDAB1572)

The GFP gene CopGFP was purchased from Evagen Joint Stock Company (Moscow, Russia), and the full-length coding sequence was PCR amplified using primers P5 and P6. BbsI and SacI sites were added to the 5' and 3' ends of the PCR product, respectively. The CopGFP PCR product was then digested with BbsI/SacI and cloned into pDAB3401 (FIG. 20), which contained a modified A. tumefaciens Δmas promoter (Petolino et al, US6730824) driving the GUS gene (Jefferson, 1987, Plant Mol. Biol. Rep. 5: 387-405) terminated by the A. tumefaciens orf-1 3' UTR (Huang et al, J. Bacteriol. 172: 1814-1822), at the NcoI/SacI sites to replace the GUS gene. The resulting vector was named pDAB1570 (FIG. 21).

To prepare the two partially duplicated non-functional GFP fragments, primers P9 and P10 were used to PCR amplify a DNA fragment containing most of the CopGFP coding sequence but with a 47 bp deletion at the 5' end. An ApaI site was added to both the 5' and 3' ends, and an additional StuI site was added to the 5' end downstream of the ApaI site. The PCR product was then digested with ApaI and inserted into pDAB1570 at the ApaI site, thereby generating two non-functional GFP fragments with a 540 bp duplicated sequence in the same vector. The resulting construct was named pDAB1572 (FIG. 22).

E. Construction of a vector comprising the IL-1 zinc finger-Fok1 fusion protein binding site/GUS gene fusion (pDAB1573)

A 233 bp DNA fragment containing 4 tandem repeats of the IL-1_L0-Fok1 recognition site (FIG. 6) was synthesized by Picoscript Ltd., LLP (Houston, Texas), with NcoI and AflIII sites added to the 5' and 3' ends, respectively. The synthesized fragment was then digested with NcoI/AflIII and inserted at the NcoI site into pDAB4003 (FIG. 23), which contained the GUS gene (Jefferson, 1987, Plant Mol. Biol. Rep. 5: 387-405) driven by the CsVMV promoter (Verdaguer et al, 1996, Plant Molecular Biology 31: 1129-1139) and terminated by the Agrobacterium tumefaciens orf-1 3' UTR (Huang et al, J. Bacteriol. 172: 1814-1822). This generated an N-terminal fusion between the IL-1_L0-Fok1 binding site and the GUS coding sequence. The resulting vector was named pDAB1571 (FIG. 24).

To eliminate the repetitive 3 'UTR element in the target vector, Agrobacterium tumefaciens nos 3' UTR (DePicker et al, 1982, J.mol.Appl.Genet.1: 561-. The resulting plasmid was designated pDAB1573 (FIG. 26).

F. Construction of the final target vector (pDAB1585)

To construct the final target vector, the GUS expression cassette with the IL-1-Fok1 fusion protein target site insertion was excised from pDAB1573 by NotI digestion, blunt-end filled, and inserted into pDAB1572 at the StuI site. The resulting intermediate vector was named pDAB1574 (FIG. 27). The entire cassette comprising the modified Δmas promoter (Petolino et al, US6730824), the 5' partially duplicated GFP sequence (Evagen Joint Stock Company, Moscow, Russia), the CsVMV promoter (Verdaguer et al, 1996, Plant Molecular Biology 31: 1129-1139), the IL-1-Fok1 fusion protein target sequence, the GUS gene (Jefferson, 1987, Plant Mol. Biol. Rep. 5: 387-405) coding region, the Agrobacterium tumefaciens nos 3' UTR (DePicker et al, 1982, J. Mol. Appl. Genet. 1: 561-573), the 3' partially duplicated GFP sequence (Evagen Joint Stock Company, Moscow, Russia) and the Agrobacterium tumefaciens orf-1 3' UTR (Huang et al, J. Bacteriol. 172: 1814-1822) was excised from pDAB1574 and inserted into pDAB1580 at the NotI site. The resulting plasmid was designated pDAB1581 (FIG. 28). The AgeI fragment of pDAB1581 was then inserted into pDAB1584 at the AgeI site, thereby generating the final target construct pDAB1585 (FIGS. 4 and 5).

Example 6: generation of transgenic cell lines with integrated target sequences

The target sequence of Example 5 was stably integrated into tobacco cells by Agrobacterium transformation, using a tobacco cell suspension culture referred to as BY2. The base cell line BY2 was obtained from Jun Ueki (Japan Tobacco, Iwata, Shizuoka, Japan). The culture proliferates as cells 5-10 μ in diameter in clusters of 100-150 cells, with a doubling time of approximately 18 hours. BY2 cell suspension cultures were maintained in medium containing LS basal salts (PhytoTechnology Labs L689), 170 mg/L KH₂PO₄, 30 g/L sucrose, 0.2 mg/L 2,4-D and 0.6 mg/L thiamine-HCl at pH 6.0. BY2 cells were subcultured every 7 days by adding 50 mL of LS-based medium to 0.25 mL PCV. The BY2 cell suspension cultures were maintained in 250 mL flasks on a rotary shaker at 25 °C and 125 RPM.

To generate transgenic BY2 cell cultures with the integrated target sequence, a flask of tobacco suspension 4 days post-subculture was divided into 10-12 aliquots of 4 mL, which were co-cultivated in 100x25 mm Petri dishes with 100 μL of Agrobacterium strain LBA4404 harboring pDAB1585 that had been grown overnight to an OD₆₀₀ of about 1.5. The dishes were wrapped with parafilm and incubated at 25 °C without shaking for 3 days, after which excess liquid was removed and replaced with 11 mL of LS-based basal medium containing 500 mg/L carbenicillin.

Following resuspension of the tobacco cells, 1 mL of suspension was dispensed onto 100x25 mm plates of the appropriate basal medium solidified with 8 g/L TC agar and containing 500 mg/L carbenicillin and 200 mg/L hygromycin, and incubated unwrapped at 28 °C in the dark. This resulted in 120-144 selection plates for a single treatment. Individual hygromycin-resistant isolates appeared 10-14 days after plating and were transferred to individual 60x20 mm plates (one isolate per plate), where they were maintained as callus on a 14-day subculture schedule until needed for analysis and subsequent retransformation experiments.
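The stated range of 120-144 selection plates is consistent with each of the 10-12 co-cultivation dishes yielding about 12 plates at 1 mL per plate. The sketch below checks that arithmetic. The resuspended volume of roughly 12 mL per dish (11 mL of fresh medium plus the residual cells) is an assumption chosen because it reproduces the stated range; it is not given in the text, and the function name is hypothetical.

```python
def selection_plates(aliquots, resuspended_ml=12.0, ml_per_plate=1.0):
    """Number of selection plates from one treatment.

    resuspended_ml is an ASSUMED volume per dish after adding 11 mL of medium;
    ml_per_plate is the 1 mL dispensed per plate stated in the text.
    """
    plates_per_dish = int(resuspended_ml // ml_per_plate)
    return aliquots * plates_per_dish


for aliquots in (10, 12):
    print(aliquots, "aliquots ->", selection_plates(aliquots), "plates")
```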

Example 7: screening and characterization of target transgenic events

Hygromycin-resistant transgenic events (as described in example 6) generated from transformation of the target vector into BY2 tobacco cell cultures were analyzed as follows.

Initial analyses to screen for these transgenic events included GUS expression analysis to demonstrate accessibility of the target sequence, PCR analysis of partial and full-length target sequences to confirm the presence and integrity of the target vector, and Southern blot analysis to determine the copy number of the integrated target sequence. A subset of transgenic events showing GUS expression contained a single copy of the full-length target sequence; they were selected for reconstitution in suspension culture to generate target lines for subsequent retransformation. These reconstituted target lines also undergo further characterization, which includes more thorough Southern blot analysis, sequencing confirmation of the entire target insert, and flanking genomic sequence analysis.

Transgenic tobacco callus or suspension cultures initiated from selected events were analyzed for GUS activity by incubating 50 mg samples in 150 μL of assay buffer at 37 °C for 24-48 hours. The assay buffer contained 0.2 M sodium phosphate pH 8.0, 0.1 mM each of potassium ferricyanide and potassium ferrocyanide, 1.0 mM sodium EDTA, 0.5 mg/mL of 5-bromo-4-chloro-3-indolyl-β-glucuronide, and 0.6% (v/v) Triton X-100 (Jefferson, 1987, Plant Mol. Biol. Rep. 5: 387-. The appearance of blue colored regions was used as an indicator of GUS gene expression, indicating that the target sequence insertion was transcriptionally active and therefore accessible in the local genomic environment.

Transgenic events expressing GUS were assayed by PCR with primer pair P15/P16, which amplifies a 10 kb DNA fragment extending from the 3' UTR of the HPT expression cassette at the 5' end of the target sequence to the 3' UTR of the partial PAT gene cassette at the 3' end of the target sequence. Since all events were obtained under hygromycin selection, the HPT expression cassette was assumed to be intact in all target events. Thus, only the 3' UTR of the HPT expression cassette was covered in the full-length PCR analysis. A subset of events was also assayed by PCR using primer pairs P15/P17 and P18/P19 to determine the integrity of the 5' and 3' ends of the target sequence, respectively. All target events confirmed by PCR analysis were further assayed by Southern blot analysis to determine the copy number of the integrated target sequence.

Southern blot analysis was performed on all target events that passed the GUS expression and full-length PCR screens. μg of genomic DNA was digested with NsiI, a unique cutter within the target sequence. The digested genomic DNA was separated on a 0.8% agarose gel and transferred to a nylon membrane. After crosslinking, the transferred DNA on the membrane was hybridized with an HPT gene probe to determine the copy number of the 5' end of the target sequence. The same blot was then stripped and rehybridized with a PAT gene probe to determine the copy number of the 3' end of the target sequence.

Multiple events showing GUS expression and containing a single copy of the full-length target sequence were selected for further characterization, including more thorough Southern blot analysis, sequence confirmation of the entire target sequence, and flanking genomic sequence analysis. One event, referred to as BY2-380, was selected based on the molecular characterization. Suspension cultures were re-established from this event for subsequent retransformation with vectors containing donor DNA and non-C2H2 zinc finger-Fok1 fusion protein genes.

To ensure that the suspension culture established from target event BY2-380 contained the entire target sequence as expected, the major part of the target sequence (from the 3' UTR of the HPT expression cassette at the 5' end of the target sequence to the 3' UTR of the partial PAT gene cassette at the 3' end of the target sequence) was PCR amplified using primer pair P15/P16 and cloned into the pCR2.1 TOPO vector (Invitrogen, Carlsbad, California). The PCR products inserted in the TOPO vector were sequenced by Lark Technology, Inc. (Houston, Texas). The sequence results indicated that BY2-380 had the complete target sequence as expected.

The BY2-380 cell line was further analyzed using the Universal GenomeWalker Kit (Clontech, Mountain View, California) to obtain the flanking genomic sequences. Briefly, 2.5 μg of genomic DNA was digested in separate reactions with three blunt-end restriction enzymes, EcoRV, DraI and StuI. The digested DNA was purified by phenol/chloroform extraction and ligated to BD GenomeWalker adaptors. Nested PCR amplification was performed with the ligation as template, using primers P20 (walking upstream of the 5' end of the target sequence insertion) and P21 (walking downstream of the 3' end of the target sequence insertion) for the primary PCR reaction, and primers P22 (walking upstream of the 5' end of the target sequence insertion) and P23 (walking downstream of the 3' end of the target sequence insertion) for the secondary nested PCR reaction. The amplified fragments from the secondary PCR reactions were cloned into pCR2.1 TOPO or pCR Blunt II TOPO vectors (Invitrogen, Carlsbad, California) and sequenced using a Dye Terminator Cycle Sequencing Kit (Beckman Coulter, Fullerton, CA). The flanking genomic sequences of the BY2-380 target line were obtained by this method. Primers were then designed based on the flanking genomic sequences and used to amplify the entire target sequence.

The amplified fragments obtained from the target line have the expected size. Both ends of the amplified fragment were confirmed by sequencing.

Example 8: design and Generation of Donor DNA vectors

The donor DNA constructs included the homologous sequence-1 (tobacco RB7MAR) (Thompson et al, 1997, WO9727207), the full-length Arabidopsis ubi10 promoter (Callis et al, 1990, J.biol. chem.265: 12486-12493), the 299bp 5' part of the PAT Gene coding sequence (Wohlleben et al, 1988, Gene 70: 25-37) and the homologous sequence-2 (Arabidopsis 4-CoAS intron-1) (locus At3g21320, GenBank NC 003074). Both homologous sequence-1 and sequence-2 in the donor vector are identical to the corresponding homologous sequence-1 and sequence-2 in the target vector (pDAB 1585).

To construct the donor vector, the 299 bp 5' partial PAT coding sequence was fused to the full-length Arabidopsis thaliana 4-CoAS intron-1 (locus At3g21320, GenBank NC003074) by DNA synthesis (Picoscript Ltd.). NcoI and XhoI sites were added to the 5' and 3' ends of the fragment, respectively. The synthesized DNA fragment was then digested with NcoI/XhoI and inserted into pDAB1575 at the same sites to replace the full-length PAT gene coding sequence and its 3' UTR. The resulting construct was named pDAB1576 (FIG. 29).

pDAB1576 was then digested with AgeI and the entire fragment containing the 5' portion of the PAT expression cassette flanked by homology-1 and homology-2 was inserted into pDAB2407 (a binary base vector) at the same site. The resulting construct was named pDAB1600 (fig. 30) and was a binary version of the donor vector for plant cell retransformation.

Example 9: design and Generation of Zinc finger nuclease expression vectors

The zinc finger-Fok 1 fusion protein gene was driven by the CsVMV promoter and 5' UTR (Verdaguer et al, 1996, Plant Molecular Biology 31: 1129-1139). Also included in this cassette is the 3' untranslated region (UTR) of the Agrobacterium tumefaciens open reading frame-24 (orf-24) (Gelvin et al, 1987, EP 222493).

To construct these vectors, the C2H2 controls and their C3H variants of the IL-1-Fok1 and Scd27-Fok1 coding sequences described in Examples 1 to 4 above were PCR amplified from their original designs, with BbsI or NcoI and SacI sites added to the 5' and 3' ends of the PCR fragments, respectively, and cloned into pDAB3731 (FIG. 31) digested with NcoI/SacI. The resulting plasmids were designated pDAB4322 (FIG. 32), pDAB4331 (FIG. 33), pDAB4332 (FIG. 34), pDAB4333 (FIG. 35), pDAB4334 (FIG. 36), pDAB4336 (FIG. 37), and pDAB4339 (FIG. 38). All of these vectors contained attL1 and attL2 sites flanking the ZFN expression cassette and were compatible with the Gateway™ cloning system (Invitrogen, Carlsbad, California).

Two sets of binary version vectors were constructed for the IL-1-FokI fusion protein. One set contained the PAT selectable marker gene and the other did not. For the Scd27-FokI fusion protein, only the binary version of the vector without the PAT selectable marker gene was constructed. To construct the binary vectors with the PAT selectable marker gene, the IL-1-FokI fusion protein expression cassettes in pDAB4322, pDAB4331, pDAB4332, pDAB4333, pDAB4334, and pDAB4336 were cloned into pDAB4321 (FIG. 39) through LR recombination reactions using the LR Clonase™ enzyme mix (Invitrogen, Carlsbad, California). The resulting plasmids were designated pDAB4323 (FIG. 40), pDAB4341 (FIG. 41), pDAB4342 (FIG. 42), pDAB4343 (FIG. 43), pDAB4344 (FIG. 44), and pDAB4346 (FIG. 45). To construct the binary vectors without the PAT selectable marker gene, the C2H2 IL-1-FokI, C3H IL-1-FokI and Scd27-FokI expression cassettes in pDAB4331, pDAB4336 and pDAB4339, respectively, were cloned into pDAB4330 (FIG. 46) through LR recombination reactions using the LR Clonase™ enzyme mix (Invitrogen, Carlsbad, California). The resulting plasmids were designated pDAB4351 (FIG. 47), pDAB4356 (FIG. 48) and pDAB4359 (FIG. 49), respectively.
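The correspondence between entry plasmids and the binary vectors derived from them can be kept as a small lookup table. The sketch below records only the plasmid names given in the text. For the PAT-containing set, the pairing assumes the two lists are in matching order, which the text implies but does not state with "respectively". The dictionary and function names are hypothetical.

```python
# Destination vectors named in the text.
DESTINATION = {"with_pat": "pDAB4321", "without_pat": "pDAB4330"}

# Entry plasmid -> binary vector containing the PAT marker
# (pairing ASSUMED from the order of the two lists in the text).
WITH_PAT = {
    "pDAB4322": "pDAB4323",
    "pDAB4331": "pDAB4341",
    "pDAB4332": "pDAB4342",
    "pDAB4333": "pDAB4343",
    "pDAB4334": "pDAB4344",
    "pDAB4336": "pDAB4346",
}

# Entry plasmid -> binary vector without the PAT marker ("respectively" in the text).
WITHOUT_PAT = {
    "pDAB4331": "pDAB4351",
    "pDAB4336": "pDAB4356",
    "pDAB4339": "pDAB4359",
}


def binary_vectors(entry_plasmid):
    """Return the binary vectors derived from one entry plasmid."""
    result = {}
    if entry_plasmid in WITH_PAT:
        result["with_pat"] = WITH_PAT[entry_plasmid]
    if entry_plasmid in WITHOUT_PAT:
        result["without_pat"] = WITHOUT_PAT[entry_plasmid]
    return result


print(binary_vectors("pDAB4331"))
print(binary_vectors("pDAB4339"))
```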

To construct the C2H2 control of Scd27-FokI, the HindIII/SacI fragment comprising the CsVMV promoter and 5' UTR driving PAT in pDAB7002 (FIG. 50) was replaced with a fragment comprising the CsVMV promoter and 5' UTR and tobacco 5' UTR driving GUS, which was excised from pDAB7025 (FIG. 51) with HindIII/SacI. The resulting plasmid was designated pDAB1591 (FIG. 52). The Scd27-L0-Fok1 coding sequence was PCR amplified from its original vector pCDNA3.1-SCD27a-L0-FokI (FIG. 53) using primer pair P13/P14. BbsI and SacI sites were added to the 5' and 3' ends of the PCR fragment, respectively. The PAT gene in pDAB1591 was replaced with the zinc finger fusion protein gene PCR fragment via SacI/NcoI cloning. The resulting plasmid was designated pDAB1594 (FIG. 54). The binary version of this vector was constructed by excising the zinc finger fusion protein gene expression cassette from pDAB1594 as a PmeI/XhoI fragment, filling in the ends, and cloning it into pDAB2407 at the PmeI site. The resulting plasmid was designated pDAB1598 (FIG. 55). Details of all binary vectors used for plant transformation are summarized in Table 6.

Table 6: zinc finger nuclease expression vector

The FokI domain is plant codon-biased.

Example 10: design and Generation of Positive control vectors

To estimate the illegitimate recombination frequency and to serve as a positive control, a vector containing the PAT gene expression cassette was used. To be comparable with the final recombinants, the Arabidopsis thaliana 4-CoAS intron-1 (locus At3g21320, GenBank NC003074) was inserted at position 299/300 bp of the PAT coding sequence (Wohlleben et al, 1988, Gene 70: 25-37). To make this construct, the 2559 bp SwaI/ClaI fragment from pDAB1576 was ligated to the backbone fragment of pDAB1577 (FIG. 56) digested with the same restriction enzymes. The resulting vector contained the PAT gene expression cassette with the 1743 bp Arabidopsis thaliana 4-CoAS intron-1 (locus At3g21320, GenBank NC003074) inserted in the middle of the PAT coding sequence (Wohlleben et al, 1988, Gene 70: 25-37). This vector was named pDAB1578 (FIG. 57).

To make the binary version of pDAB1578, the PAT gene expression cassette with the Arabidopsis intron-1 (locus At3g21320, GenBank NC003074) was excised from pDAB1578 with PmeI/XhoI. After the 3' end of the fragment was blunt-end filled, it was inserted into the binary base vector pDAB2407 at the PmeI site. The resulting vector was designated pDAB1601 (FIG. 58) and contained the PAT gene (Wohlleben et al, 1988, Gene 70: 25-37) comprising the Arabidopsis 4-CoAS intron-1 (locus At3g21320, GenBank NC003074), driven by the Arabidopsis ubi10 promoter (Callis et al, 1990, J. Biol. Chem. 265: 12486-12493) and terminated by the Agrobacterium tumefaciens orf25/26 3' UTR (Gelvin et al, 1987, EP222493).

Example 11: Demonstration of intrachromosomal homologous recombination by retransformation of target cell cultures with C3H zinc finger nuclease gene

To confirm the functionality of the C3H zinc finger nuclease in stimulating intrachromosomal homologous recombination, two non-functional GFP fragments with a 540 bp overlap were included in the target vector as shown in FIG. 59. Between these two fragments is the GUS gene expression cassette. The IL-1-Fok1 fusion protein binding sequence was fused to the GUS coding sequence at its N-terminus. Without being limited to one theory, it is hypothesized that in the presence of the IL-1-Fok1 fusion protein, the IL-1 ZFN binding sequence will be recognized and a double-stranded DNA break will be induced, which will stimulate the endogenous DNA repair process. In the absence of donor DNA, the two partially homologous GFP fragments undergo intrachromosomal homologous recombination and a functional GFP gene is reconstituted.

The BY2-380 transgenic cell line (containing a single copy of the full-length integrated target sequence) was used to initiate suspension culture again, i.e., approximately 250-500mg of callus tissue was placed in 40-50mL LS-based basal media containing 100mg/L hygromycin and subcultured every 7 days as described above. Suspension cultures were transferred to minimal medium without hygromycin for at least two generations prior to retransformation.

Agrobacterium-mediated transformation of target cell cultures was performed as described above. For each experiment, 8 co-culture plates were generated as follows: 1 plate contained cells co-cultured with 300 μL of Agrobacterium tumefaciens base strain LBA4404; 1 plate contained cells co-cultured with 300 μL of an Agrobacterium strain containing pDAB1590 (functional GFP construct); and each of the remaining 6 plates contained cells co-cultured with 300 μL of an Agrobacterium strain containing pDAB4323, pDAB4341, pDAB4342, pDAB4343, pDAB4344, or pDAB4346, respectively. After co-culturing using the method described above, cells were plated on 8 plates containing LS-based basal medium supplemented with 500 mg/L carbenicillin without selection agent. Expression of the reconstituted functional GFP gene became apparent as visible fluorescence approximately 5-8 days after transformation. The number of green fluorescent sites per field was counted by observing 5 "random" microscopic fields per plate (8 plates per construct in each experiment), and the average was calculated from 6 independent experiments.

As summarized in Table 7, an average of 9.50 and 7.57 green fluorescent sites per field were observed for two C3H zinc finger nucleases, pDAB4346 and pDAB4343, respectively. These two C3H designs of IL-1-FokI performed better than their C2H2 controls, pDAB4341 (6.37 sites per field) and pDAB4323 (5.53 sites per field). In contrast, the function of the other two C3H variants of the IL-1-FokI fusion protein, pDAB4344 (4.39 sites per field) and pDAB4342 (0.25 sites per field), was significantly impaired compared to the C2H2 controls. This was most pronounced for pDAB4342, in which the C3H conversion was made solely by replacing the second cysteine with histidine in the 4th finger. No detectable fluorescence above a slight background was observed in the negative control transformed with Agrobacterium tumefaciens strain LBA4404.

TABLE 7

Construction of functional GFP via intrachromosomal homologous recombination stimulated by IL-1-Fok1 zinc finger fusion protein

Vector	ZFP type	GFP expression (green fluorescent sites per field)	Tukey test^**
pDAB4346	C3H	9.50	A
pDAB4343	C3H	7.57	B
pDAB4341	C2H2	6.37	C
pDAB4323^*	C2H2	5.53	D
pDAB4344	C3H	4.39	E
pDAB4342	C3H	0.25	F

^*Comprising non-plant codon-preferred FokI domains

^**Mean values not related by the same letter are significantly different at the 0.05 level
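As a reading aid for Table 7, the short sketch below ranks the vectors and expresses each mean relative to a C2H2 control. It is illustrative only: the variable names and the choice of pDAB4341 as the reference are assumptions made here, not part of the described method, and the script does not reproduce the Tukey test.

```python
# Mean green fluorescent sites per field, copied from Table 7.
table7 = {
    "pDAB4346": ("C3H", 9.50),
    "pDAB4343": ("C3H", 7.57),
    "pDAB4341": ("C2H2", 6.37),
    "pDAB4323": ("C2H2", 5.53),
    "pDAB4344": ("C3H", 4.39),
    "pDAB4342": ("C3H", 0.25),
}

# Assumption for illustration: pDAB4341 is used as the C2H2 reference.
reference = "pDAB4341"
ref_mean = table7[reference][1]


def fold_vs_reference(vector):
    """Return the mean of `vector` divided by the mean of the reference vector."""
    return table7[vector][1] / ref_mean


# Rank vectors from highest to lowest mean and print the fold change.
ranked = sorted(table7, key=lambda v: table7[v][1], reverse=True)
for vector in ranked:
    zfp_type, mean = table7[vector]
    print(f"{vector}\t{zfp_type}\t{mean:.2f}\t{fold_vs_reference(vector):.2f}x")
```

Choosing pDAB4323 as the reference instead only changes the denominator; the ranking is unaffected.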

Example 12: Demonstration of homologous recombination between chromosomes by re-transforming target cell cultures with C3H zinc finger nuclease gene and donor DNA sequences

To demonstrate the functionality of the C3H zinc finger-Fok 1 fusion protein in stimulating homologous recombination between chromosomes in an exemplary tobacco system, two strategies were developed and tested.

In strategy 1, the middle of the target construct (FIG. 61) included a zinc finger-Fok1 fusion protein binding site (IL-1-L0-Fok1). In this strategy, the binding site is flanked on both sides by approximately 3 kb of non-homologous sequence, followed upstream and downstream by homologous sequence-1 (tobacco RB7 MAR) and homologous sequence-2 (Arabidopsis 4-CoAS intron-1), respectively. As previously demonstrated (e.g., U.S. patent publication No. 20050064474), in the presence of a C2H2 IL-1 zinc finger-Fok1 fusion protein, the IL-1-L0-Fok1 binding sequence is recognized and a double-stranded DNA break is induced at this specific site, which stimulates the endogenous DNA repair process. In the presence of donor DNA (which contains homologous sequences identical to those in the target sequence), the 5' portion of the PAT gene and its promoter replace the entire approximately 6 kb DNA fragment between the homologous sequences in the target via homologous recombination. In this way, the two partial PAT gene sequences (with the Arabidopsis thaliana 4-CoAS intron-1 between them) reconstitute a functional PAT gene, resulting in PAT expression and a herbicide resistance phenotype.

In strategy 2, two zinc finger-Fok1 binding sites (Scd27-L0-FokI) were included in the target vector: one located just downstream of the tobacco RB7 MAR, and the other located just upstream of the Arabidopsis thaliana 4-CoAS intron-1 (FIG. 62). Between the two zinc finger-Fok1 fusion protein binding sites was an approximately 6 kb sequence comprising a 5' GFP fragment, a GUS expression cassette and a 3' GFP fragment. As previously demonstrated (e.g., U.S. patent publication No. 20050064474), in the presence of a Scd27 zinc finger-Fok1 fusion protein, the two binding sequences are recognized and double-stranded DNA breaks are induced at both positions, which removes the approximately 6 kb DNA fragment between the two binding sequences and stimulates the endogenous DNA repair process. Similar to strategy 1, in the presence of the donor DNA (which contains the same homologous sequences as the target sequence), the 5' portion of the PAT gene and its promoter are inserted into the target sequence by homologous recombination at the site of the induced double-stranded DNA breaks. In this way, the two partial PAT gene sequences (with the Arabidopsis thaliana 4-CoAS intron-1 between them) reconstitute a functional PAT gene, resulting in PAT expression and a herbicide resistance phenotype.

Agrobacterium-mediated transformation of BY2-380 target cell cultures was performed as described above. For each experiment, 12 co-culture plates were generated as follows: 1 plate contained cells co-cultured with 50 μL of an Agrobacterium strain containing pDAB1600 (donor DNA) and 250 μL of Agrobacterium base strain LBA4404; 1 plate contained cells co-cultured with 50 μL of an Agrobacterium strain containing pDAB1601 (PAT selectable marker) and 250 μL of Agrobacterium base strain LBA4404; 2 plates contained cells co-cultured with 50 μL of an Agrobacterium strain containing pDAB1600 (donor DNA) and 250 μL of an Agrobacterium strain containing pDAB4351 (C2H2 IL-1 ZFP-Fok1); 3 plates contained cells co-cultured with 50 μL of an Agrobacterium strain containing pDAB1600 (donor DNA) and 250 μL of an Agrobacterium strain containing pDAB4356 (C3H IL-1 ZFP-Fok1); 2 plates contained cells co-cultured with 50 μL of an Agrobacterium strain containing pDAB1600 (donor DNA) and 250 μL of an Agrobacterium strain containing pDAB1598 (C2H2 Scd27a ZFP-Fok1); and 3 plates contained cells co-cultured with 50 μL of an Agrobacterium strain containing pDAB1600 (donor DNA) and 250 μL of an Agrobacterium strain containing pDAB4359 (C3H Scd27a ZFP-Fok1). After co-culturing using the method described above, the cells were plated on LS-based basal medium containing 500 mg/L carbenicillin and 15 mg/L of the selection agent. Individual resistant isolates appearing 2-4 weeks after plating were transferred to individual 60x20 mm plates (one isolate per plate), where they were maintained as callus on a 14-day subculture schedule until needed for analysis.
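The plate layout described above can be written out and checked with a few lines of code. This is a bookkeeping sketch only; the data structure and names are assumptions introduced here for illustration.

```python
# Co-culture plate layout for Example 12.
# Each entry: (number of plates, strain added at 50 uL, strain added at 250 uL).
plate_groups = [
    (1, "pDAB1600 (donor DNA)", "LBA4404 (base strain)"),
    (1, "pDAB1601 (PAT selectable marker)", "LBA4404 (base strain)"),
    (2, "pDAB1600 (donor DNA)", "pDAB4351 (C2H2 IL-1 ZFP-Fok1)"),
    (3, "pDAB1600 (donor DNA)", "pDAB4356 (C3H IL-1 ZFP-Fok1)"),
    (2, "pDAB1600 (donor DNA)", "pDAB1598 (C2H2 Scd27a ZFP-Fok1)"),
    (3, "pDAB1600 (donor DNA)", "pDAB4359 (C3H Scd27a ZFP-Fok1)"),
]

SMALL_VOLUME_UL = 50   # donor DNA or PAT marker strain
LARGE_VOLUME_UL = 250  # ZFN strain or base strain

total_plates = sum(count for count, _, _ in plate_groups)
volume_per_plate_ul = SMALL_VOLUME_UL + LARGE_VOLUME_UL
zfn_to_donor_ratio = LARGE_VOLUME_UL / SMALL_VOLUME_UL

print(f"plates per experiment: {total_plates}")
print(f"Agrobacterium volume per plate: {volume_per_plate_ul} uL")
print(f"volume ratio (250 uL strain : 50 uL strain): {zfn_to_donor_ratio:.0f}:1")
```

The totals agree with the text: 12 plates per experiment, each receiving 300 μL of Agrobacterium suspension, the same per-plate volume used in Example 11.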

Multiple resistant isolates were obtained from both the C3H IL-1 zinc finger nuclease (pDAB4356) and the C3H Scd27 zinc finger nuclease (pDAB4359). These isolates were analyzed by PCR using primer pair P24/P25, which amplifies a DNA fragment spanning the reconstituted PAT gene. Primer P24 is homologous to the 5' end of the PAT coding sequence in the donor DNA, while primer P25 is homologous to the 3' end of the PAT coding sequence in the target DNA. If the two partial PAT coding sequences are joined by homologous recombination, a 2.3 kb PCR fragment will be generated. As shown in FIG. 63, 2.3 kb PCR products were obtained from multiple isolates analyzed. These isolates were obtained from the co-transformation of both the C3H IL-1 zinc finger-Fok1 fusion protein gene/donor DNA and the C3H Scd27 zinc finger-Fok1 fusion protein gene/donor DNA. The 2.3 kb PCR products from multiple independent isolates (representing isolates derived from both C3H IL-1 zinc finger-Fok1 and C3H Scd27 zinc finger-Fok1 fusion protein gene transformations) were purified from agarose gels and cloned into the pCR2.1 TOPO vector (Invitrogen, Carlsbad, California). The 2.3 kb PCR product inserted in the TOPO vector was then sequenced using the dye terminator cycle sequencing kit (Beckman Coulter). Sequencing results confirmed that all PCR products cloned in TOPO vectors contained the predicted recombined sequences, including the 5' and 3' partial PAT gene sequences and the intervening Arabidopsis 4-CoAS intron-1. These results confirm that the predicted interchromosomal recombination occurred for both strategies tested and exemplify gene targeting via expression of the C3H zinc finger-Fok1 fusion protein gene.

Example 13: Identification of target gene sequences in maize cell cultures

A. Sequence identification

In this example, the DNA sequence of an endogenous maize gene of known function was selected as a target for genome editing using an engineered zinc finger nuclease. The genomic structure and sequence of this gene (designated IPP2-K, derived from the proprietary maize inbred line 5XH751) has been described in WO2006/029296, the disclosure of which is incorporated by reference.

In particular, the TIGR maize genomic database (available over the Internet at http://www.tigr.org/tdb/tgi/mail) was queried with the BLAST algorithm, using the IPP2-K genomic sequence as the query. Several individual genomic fragments were identified with overlapping homology to IPP2-K, including but not limited to accession numbers AZM515213 and TC311535. Based on the sequences of these accession numbers and the IPP2-K sequence, a number of short oligonucleotides were designed for use as PCR primers using the Primer3 program (Rozen, S. and Skaletsky, H.J. (2000) Primer3 on the WWW for general users and for biologist programmers. In: Krawetz S, Misener S (eds.) Bioinformatics Methods and Protocols: Methods in Molecular Biology. Humana Press, Totowa, NJ, pp 365-). These primers include, but are not limited to, the following forward oligonucleotides:

1. 5'-ATGGAGATGGATGGGGTTCTGCAAGCCGC-3' (SEQ ID NO: 104)

2. 5'-CTTGGCAAGGTACTGCGGCTCAAGAAGATTC-3' (SEQ ID NO: 161)

3. 5'-ATGAAGAAAGACAGGGAATGAAGGAC-3' (SEQ ID NO: 162)

4. 5'-ATGAAGAAAGACAGGGAATGAAGGACCGCCAC-3' (SEQ ID NO: 163)

5. 5'-CATGGAGGGCGACGAGCCGGTGTAGCTG-3' (SEQ ID NO: 164)

6. 5'-ATCGACATGATTGGCACCCAGGTGTTG-3' (SEQ ID NO: 165)

In addition, primers include, but are not limited to, the following reverse oligonucleotides:

7. 5'-TTTCGACAAGCTCCAGAAAATCCCTAGAAAC-3' (SEQ ID NO: 166)

8. 5'-ACAAGCTCCAGAAAATCCCTAGAAACAC-3' (SEQ ID NO: 167)

9. 5'-TTCGACAAGCTCCAGAAAATCCCTAGAAACAC-3' (SEQ ID NO: 168)

10. 5'-TGCTAAGAACATTCTTTTCGACAAGCTCC-3' (SEQ ID NO: 169)

11. 5'-GAACATTCTTTTCGACAAGCTCCAGAAAATCC-3' (SEQ ID NO: 170)

All oligonucleotide primers were synthesized by and purchased from Integrated DNA Technologies (IDT, Coralville, IA).
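For orientation, basic properties of the listed primers (length, GC content, reverse complement) can be tabulated with a short script. This is a minimal sketch, not the Primer3 procedure used for the design; the helper names `gc_fraction` and `reverse_complement` are introduced here for illustration.

```python
# Primer sequences keyed by SEQ ID NO, copied from the lists above.
forward = {
    104: "ATGGAGATGGATGGGGTTCTGCAAGCCGC",
    161: "CTTGGCAAGGTACTGCGGCTCAAGAAGATTC",
    162: "ATGAAGAAAGACAGGGAATGAAGGAC",
    163: "ATGAAGAAAGACAGGGAATGAAGGACCGCCAC",
    164: "CATGGAGGGCGACGAGCCGGTGTAGCTG",
    165: "ATCGACATGATTGGCACCCAGGTGTTG",
}
reverse = {
    166: "TTTCGACAAGCTCCAGAAAATCCCTAGAAAC",
    167: "ACAAGCTCCAGAAAATCCCTAGAAACAC",
    168: "TTCGACAAGCTCCAGAAAATCCCTAGAAACAC",
    169: "TGCTAAGAACATTCTTTTCGACAAGCTCC",
    170: "GAACATTCTTTTCGACAAGCTCCAGAAAATCC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of G and C bases in an uppercase DNA sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


for direction, primers in (("forward", forward), ("reverse", reverse)):
    for seq_id, seq in primers.items():
        print(f"SEQ ID NO {seq_id}\t{direction}\t{len(seq)} nt\tGC {gc_fraction(seq):.0%}")
```

The listing also makes the nesting among primers visible: SEQ ID NO: 163 extends SEQ ID NO: 162 at its 3' end, and SEQ ID NO: 168 extends SEQ ID NO: 167 at its 5' end.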

B. HiII maize cell culture

To obtain immature embryos for callus culture initiation, F₁ crosses were performed between the Hi-II parents A and B (Armstrong, C., Green, C., and Phillips, R. (1991) Maize Genet. Coop. News Lett. 65: 92-93) grown in the greenhouse. Embryos of about 1.0-1.2 mm in size (about 9-10 days after pollination) were harvested from healthy ears that had been surface-sterilized by scrubbing with Liqui- soap, immersion in 70% ethanol for 2-3 minutes, and then immersion in 20% commercial bleach (0.1% sodium hypochlorite) for 30 minutes.

The ears were rinsed in sterile distilled water, and immature zygotic embryos were excised under sterile conditions and cultured on 15Ag10 medium (N6 medium (Chu C.C., Wang C.C., Sun C.S., Hsu C., Yin K.C., Chu C.Y., and Bi F.Y. (1975) Sci. Sinica 18: 659-668), 1.0 mg/L 2,4-D, 20 g/L sucrose, 100 mg/L casein hydrolysate (enzymatic digest), 25 mM L-proline, 10 mg/L AgNO₃, 2.5 g/L Gelrite, pH 5.8) for 2-3 weeks with the scutellum facing away from the medium. Tissue showing the expected morphology (Welter, ME, Clayton, DS, Miller, MA, Petolino, JF. (1995) Plant Cell Rep. 14: 725-.

To initiate embryogenic suspension cultures, approximately 3 ml packed cell volume (PCV) of callus tissue derived from a single embryo was added to approximately 30 ml of H9CP+ liquid medium (MS basal salt mixture (Murashige T. and Skoog F. (1962) Physiol. Plant. 15: 473-497), modified MS vitamins containing 10-fold less niacin and 5-fold more thiamine hydrochloride, 2.0 mg/L 2,4-D, 2.0 mg/L alpha-naphthaleneacetic acid (NAA), 30 g/L sucrose, 200 mg/L casein hydrolysate (acid digest), 100 mg/L myo-inositol, 6 mM L-proline, 5% v/v coconut water (added just before subculture), pH 6.0). Suspension cultures were maintained in 125 ml Erlenmeyer flasks in a temperature-controlled shaker set at 125 rpm and 28 °C in the dark. During cell line establishment (2-3 months), the suspension was subcultured every 3.5 days by adding 3 ml PCV of cells and 7 ml of conditioned medium to 20 ml of fresh H9CP+ liquid medium using a wide-bore pipette. After maturity was reached (as demonstrated by growth doubling), the suspension was scaled up and maintained in 500 ml flasks, whereby 12 ml PCV of cells and 28 ml of conditioned medium were transferred to 80 ml of H9CP+ medium. Once the suspension culture was fully established, an aliquot was cryopreserved for future use. See WO 2005/107437.
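The two subculture recipes above differ only in scale. The following sketch, with names chosen here for illustration, confirms that the 500 ml flask recipe is the 125 ml flask recipe multiplied by a single factor and that the inoculum density is unchanged.

```python
from fractions import Fraction

# Subculture recipes from the text, in ml.
small_flask = {"pcv_cells": 3, "conditioned_medium": 7, "fresh_medium": 20}
large_flask = {"pcv_cells": 12, "conditioned_medium": 28, "fresh_medium": 80}


def total_volume(recipe):
    """Total culture volume (ml) after subculture."""
    return sum(recipe.values())


def uniform_scale_factor(base, scaled):
    """Return the common scale factor, or None if the components scale unevenly."""
    factors = {Fraction(scaled[key], base[key]) for key in base}
    return factors.pop() if len(factors) == 1 else None


scale = uniform_scale_factor(small_flask, large_flask)
pcv_fraction_small = Fraction(small_flask["pcv_cells"], total_volume(small_flask))
pcv_fraction_large = Fraction(large_flask["pcv_cells"], total_volume(large_flask))

print(f"125 ml flask total: {total_volume(small_flask)} ml")
print(f"500 ml flask total: {total_volume(large_flask)} ml")
print(f"scale factor: {scale}")
print(f"PCV fraction: {float(pcv_fraction_small):.0%} in both recipes")
```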

C. DNA isolation and amplification

The maize HiII cell cultures described above were grown in standard GN6 medium (N6 medium, 2.0 mg/L 2,4-D, 30 g/L sucrose, 2.5 g/L Gelrite, pH 5.8) in 250 ml flasks, and genomic DNA was extracted using the Qiagen (Valencia, CA) plant DNeasy extraction kit according to the manufacturer's recommendations. PCR amplification reactions using the primers described above in all possible combinations were performed as follows: a 25 μl reaction volume containing 20 ng gDNA template, 20 pmol of each primer, 1% DMSO and 10 units of Accuprime™ Pf polymerase (Invitrogen, Carlsbad, CA) in the enzyme manufacturer's buffer. Amplification products ranging in size from 500 bp to 2 kb were obtained from the following amplification program: 95 °C for 1 min; 30 cycles of (95 °C for 30 s, 57-62 °C for 30 s, 72 °C for 1 min); 72 °C for 5 min; hold at 4 °C. The amplified fragments were cloned directly into the vector pCR2.1 (Invitrogen, Carlsbad, CA) using the TA cloning kit (Invitrogen, Carlsbad, CA) according to the manufacturer's recommendations.
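The number of reactions implied by "all possible combinations" and the approximate cycling time can be estimated as follows. This sketch rests on two assumptions that are not stated explicitly in the text: each combination pairs one forward primer with one reverse primer, and the step durations are read as 1 min, 30 s, 30 s, 1 min and 5 min.

```python
from itertools import product

# Primers identified by SEQ ID NO (forward: 6 primers, reverse: 5 primers).
forward_seq_ids = [104, 161, 162, 163, 164, 165]
reverse_seq_ids = [166, 167, 168, 169, 170]

# Assumption: a "combination" is one forward primer paired with one reverse primer.
primer_pairs = list(product(forward_seq_ids, reverse_seq_ids))

# Assumed reading of the cycling program, in seconds.
initial_denaturation_s = 60
cycle_steps_s = [30, 30, 60]  # denature, anneal, extend
n_cycles = 30
final_extension_s = 300

# Hold times only; ramping between temperatures is not included.
total_hold_time_s = (
    initial_denaturation_s + n_cycles * sum(cycle_steps_s) + final_extension_s
)

print(f"forward x reverse primer pairs: {len(primer_pairs)}")
print(f"programmed hold time per run: {total_hold_time_s / 60:.0f} min")
```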

D. Sequence analysis

Previous analyses of the IPP2-K gene in maize inbred 5XH751 and HiII cell cultures have indicated the presence of 2-3 different genes constituting a small gene family (Sun et al., in press, Plant Physiology; WO2006/029296). Thus, the isolated cloned fragments were sequenced using the CEQ Dye Terminator Cycle Sequencing Kit (Beckman Coulter, Fullerton, CA) according to the manufacturer's recommendations. Sequence analysis of multiple clones revealed that 2 different gene fragments, derived from 2 different and previously characterized loci in the maize genome, had been isolated from HiII cells.

Comparison of the 2 sequences isolated from HiII cultured cells indicated that there were minor differences (such as single nucleotide polymorphisms, SNPs) between the 2 paralogs in the predicted coding regions, whereas introns and noncoding regions differ significantly at the nucleotide level. These differences between the 2 paralogs were noted because they highlighted regions of the sequence that could be recognized by sequence-dependent DNA binding proteins such as zinc finger domains. One skilled in the art can design a zinc finger DNA binding domain that binds to one gene sequence but not another highly similar gene sequence. The 1.2kb partial gene sequence corresponding to the paralog of interest (FIG. 66) was selected as a template for zinc finger nuclease protein design, followed by zinc finger DNA binding domain analysis as described above.

Example 14: Design of IPP2-K zinc finger DNA binding domain

Using the target site identified for IPP2-K, recognition helices were selected for the IPP2-K zinc finger. The zinc finger design is shown in table 8 below.

Table 8: IPP2-K zinc finger design

Name of ZFN	F1	F2	F3	F4	F5	F6
IPP2-K-1072a1	DRSALSR (SEQ ID NO: 105)	RNDDRKK (SEQ ID NO: 106)	RSDNLST (SEQ ID NO: 107)	HSHARIK (SEQ ID NO: 108)	RSDVLSE (SEQ ID NO: 109)	QSGNLAR (SEQ ID NO: 110)
IPP2-K-1072b1	DRSALSR (SEQ ID NO: 105)	RNDDRKK (SEQ ID NO: 106)	RSDNLAR (SEQ ID NO: 111)	TSGSLTR (SEQ ID NO: 112)	RSDVLSE (SEQ ID NO: 109)	QSGNLAR (SEQ ID NO: 110)
IPP2-K-1072c1	DRSALSR (SEQ ID NO: 105)	RNDDRKK (SEQ ID NO: 106)	TSGNLTR (SEQ ID NO: 113)	TSGSLTR (SEQ ID NO: 112)	RSDVLSE (SEQ ID NO: 109)	QSGNLAR (SEQ ID NO: 110)
IPP2-K-r1065a1	RSDHLSE (SEQ ID NO: 114)	QSATRKK (SEQ ID NO: 115)	ERGTLAR (SEQ ID NO: 116)	RSDALTQ (SEQ ID NO: 117)	None	None
IPP2-K-r1149a2	RSDSLSA (SEQ ID NO: 118)	RSAALAR (SEQ ID NO: 119)	RSDNLSE (SEQ ID NO: 120)	ASKTRTN (SEQ ID NO: 121)	DRSHLAR (SEQ ID NO: 122)	None
IPP2-K-1156a2	RSDHLST (SEQ ID NO: 123)	QSGSLTR (SEQ ID NO: 124)	RSDHLSE (SEQ ID NO: 114)	QNHHRIN (SEQ ID NO: 125)	TGSNLTR (SEQ ID NO: 126)	DRSALAR (SEQ ID NO: 127)

Target sites for zinc finger design are shown in table 9 below.

Table 9: target site of IPP2-K zinc finger

Name of ZFN	Target site (5' to 3')
IPP2-K-1072a1	GAACTGGTTGAGTCGGTC (SEQ ID NO: 128)
IPP2-K-1072b1	GAACTGGTTGAGTCGGTC (SEQ ID NO: 129)
IPP2-K-1072c1	GAACTGGTTGAGTCGGTC (SEQ ID NO: 129)
IPP2-K-r1065a1	ATGGCCCCACAG (SEQ ID NO: 130)
IPP2-K-r1149a2	GGCACCCAGGTGTTG (SEQ ID NO: 131)
IPP2-K-1156a2	GTCGATGGTGGGGTATGG (SEQ ID NO: 132)
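Tables 8 and 9 can be cross-checked against each other: if each finger contacts a 3-nucleotide subsite, as is typical for zinc finger designs, the target site length should be three times the number of recognition helices. The sketch below performs that check; the 3 bp per finger rule is applied here as a working assumption, and the names are illustrative.

```python
BP_PER_FINGER = 3  # working assumption: one 3-nucleotide subsite per finger

# name: (recognition helices F1..Fn from Table 8, target site from Table 9)
designs = {
    "IPP2-K-1072a1": (
        ["DRSALSR", "RNDDRKK", "RSDNLST", "HSHARIK", "RSDVLSE", "QSGNLAR"],
        "GAACTGGTTGAGTCGGTC",
    ),
    "IPP2-K-1072b1": (
        ["DRSALSR", "RNDDRKK", "RSDNLAR", "TSGSLTR", "RSDVLSE", "QSGNLAR"],
        "GAACTGGTTGAGTCGGTC",
    ),
    "IPP2-K-1072c1": (
        ["DRSALSR", "RNDDRKK", "TSGNLTR", "TSGSLTR", "RSDVLSE", "QSGNLAR"],
        "GAACTGGTTGAGTCGGTC",
    ),
    "IPP2-K-r1065a1": (
        ["RSDHLSE", "QSATRKK", "ERGTLAR", "RSDALTQ"],
        "ATGGCCCCACAG",
    ),
    "IPP2-K-r1149a2": (
        ["RSDSLSA", "RSAALAR", "RSDNLSE", "ASKTRTN", "DRSHLAR"],
        "GGCACCCAGGTGTTG",
    ),
    "IPP2-K-1156a2": (
        ["RSDHLST", "QSGSLTR", "RSDHLSE", "QNHHRIN", "TGSNLTR", "DRSALAR"],
        "GTCGATGGTGGGGTATGG",
    ),
}

# True when the site length equals 3 bp times the number of fingers.
consistent = {
    name: len(site) == BP_PER_FINGER * len(helices)
    for name, (helices, site) in designs.items()
}

for name, (helices, site) in designs.items():
    status = "ok" if consistent[name] else "MISMATCH"
    print(f"{name}\t{len(helices)} fingers\t{len(site)} bp\t{status}")
```

Under this assumption the 4-, 5- and 6-finger designs pair with 12, 15 and 18 bp sites respectively, which is what the tables show.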

The IPP2-K designs were incorporated into zinc finger expression vectors encoding a protein with a CCHC structure. See Tables 1 to 4 above. The non-canonical zinc finger coding sequence was then fused to the nuclease domain of the type IIS restriction enzyme FokI (amino acid 384-.

Example 15: Gene correction using IPP2-K zinc finger nuclease

The ability of the IPP2-K ZFNs described herein to promote homologous recombination was tested in the GFP system described in Urnov (2005) Nature 435 (7042): 646-51 and U.S. patent publication No. 20050064474 (e.g., Examples 6-11). Briefly, 50 ng of each ZFN and 500 ng of a promoterless GFP donor (Urnov (2005) Nature) were transfected into 500,000 reporter cells using 2 μL of Lipofectamine 2000 per sample according to the Invitrogen Lipofectamine 2000 protocol.

GFP expression was determined 5 days post-transfection by measuring 40,000 cells per transfection on a Guava benchtop FACS analyzer. The results are shown in FIG. 69.

Example 16: Expression of C3H1 ZFN in maize HiII cells

A. Vector design

Plasmid vectors were constructed for expression of ZFN proteins in maize cells. To optimize the expression and relative stoichiometry of the 2 different proteins required to form functional zinc finger nuclease heterodimers, an expression strategy was employed in which the open reading frames for both ZFN monomers were inserted on a single vector, driven by a single promoter. This strategy utilizes the function of the 2A sequence derived from the Thosea asigna virus (Mattion, N.M., Harnish, E.C., Crowley, J.C. and Reilly, P.A. (1996) J. Virol. 70, 8124-8127), a maize nuclear localization signal (NLS) from the opaque-2 gene (op-2) (Maddaloni, M., Di Fonzo, N., Hartings, H., Lazzaroni, N., Salamini, F., Thompson, R., and Motto M. (1989) Nucleic Acids Research Vol. 17 (18): 7532), and a promoter derived from the maize ubiquitin-1 gene (Christensen A.H., Sharrock R.A., and Quail P.H. (1992) Plant Mol. Biol. 18(4): 675-89). A stepwise modular cloning scheme was designed to develop these expression vectors for any given pair of ZFN-encoding genes selected from library archives or synthesized de novo.

First, the pVAX vector (see, e.g., U.S. patent publication No. 2005-0267061, the disclosure of which is incorporated by reference) was modified to include the N-terminal expression domains shown in Panels A-E of FIG. 65. The features of this modified plasmid (pVAX-N2A-NLSop2-EGFP-FokMono) (FIG. 65A) included a segment that was redesigned and synthesized to encode a maize op-2 derived NLS (RKRKESNRESARRSRYRK, SEQ ID NO: 133) and a segment that was redesigned and synthesized to encode a FokI nuclease domain using maize codon preferences. In addition, a single nucleotide insertion (C) downstream of the unique XhoI site created an additional SacI site to facilitate cloning.

Next, the pVAX vector (see, e.g., U.S. patent publication 2005-0267061) was also modified to include a C-terminal expression domain. The characteristics of this modified plasmid (pVAX-C2A-NLSop2-EGFP-FokMono) (FIG. 65B) included a segment that was redesigned and synthesized to encode a maize op-2 derived NLS (RKRKESNRESARRSRYRK, SEQ ID NO: 133) and a segment that was redesigned and synthesized to encode a FokI nuclease domain that utilizes maize codon preference. In addition, for the subsequent ligation of the 2 protein coding domains, the 2A sequence from Thosea asigna virus (EGRGSLLTCGDVEENPGP, SEQ ID NO: 134) was introduced at the N-terminus of the ZFN ORF.

The gene cassettes encoding the ORFs of the respective zinc finger proteins were cloned into N2A or C2A vectors via ligation using restriction enzymes KpnI and BamHI to generate compatible termini. Next, the BglII/XhoI fragment from the C2A vector was inserted into the N2A vector via the same restriction sites, creating an intermediate construct comprising a cassette comprising 2 ZFN coding domains flanked by NcoI and SacI restriction sites.

Finally, the NcoI/SacI cassette containing the two ZFN genes was excised from this intermediate construct via restriction digestion using those enzymes (fig. 65C) and ligated into the plasmid backbone pDAB3872 (fig. 65D). The resulting plasmid comprises the ZFN gene plus associated promoter and terminator sequences, plus a selectable marker for plasmid maintenance.

In the final construct (an example of which is shown in FIG. 65E), the ZFN expression cassette (including promoter and terminator elements) is flanked by attL sites to facilitate manipulation using the Gateway system (Invitrogen, Carlsbad, CA). Each ZFN construct generated using this cloning protocol was transformed into E. coli DH5α cells (Invitrogen, Carlsbad, CA) and subsequently maintained under appropriate selection.

B. DNA delivery and transient expression

Plasmid preparations of ZFN expression vectors constructed as described in FIG. 65E were generated from 2 L of E. coli cell cultures grown in LB medium supplemented with antibiotics using the endonuclease-free Gigaprep kit (Qiagen, Valencia, CA) according to the manufacturer's recommendations. Plasmid DNA was delivered directly to maize HiII cultured cells using a variety of methods.

In one example, DNA was delivered to maize cells via Whiskers™. Approximately 24 hours before DNA delivery, 3 ml PCV of HiII maize suspension cells plus 7 ml of conditioned medium were subcultured into 20 ml of GN6 liquid medium (GN6 medium lacking Gelrite) in a 125 ml Erlenmeyer flask and placed on a shaker (125 rpm, 28 °C) for 24 hours. 2 mL of PCV was removed and added to 12 mL of GN6 S/M osmotic medium (N6 medium, 2.0 mg/L 2,4-D, 30 g/L sucrose, 45.5 g/L sorbitol, 45.5 g/L mannitol, 100 mg/L myo-inositol, pH 6.0) in a 125 mL Erlenmeyer flask. The flasks were incubated at 28 °C with moderate agitation (125 rpm) in the dark for 30-35 minutes. During this time, a 50 mg/ml suspension of silicon carbide whiskers (Advanced Composite Materials, Inc., Eureka Springs, AK) was prepared by adding an appropriate volume of GN6 S/M liquid medium to pre-weighed sterile whiskers. After incubation in GN6 S/M, the contents of each flask were poured into a 15 mL conical-bottomed centrifuge tube.

After the cells settled, all but approximately 1 mL of the GN6 S/M liquid was aspirated and collected in a 125 mL flask for future use. The pre-wetted whisker suspension was vortexed at the highest speed for 60 seconds, 160 μL was added to the centrifuge tube using a wide-bore, filtered pipette tip, and 20 μg of DNA was added. The tube was "finger vortexed" and immediately placed in a Caulk "Vari-Mix II" dental amalgam mixer (modified to accommodate a 17x100 mm culture tube) and then agitated at medium speed for 60 seconds. After agitation, the mixture of cells, medium, whiskers and DNA was returned to the Erlenmeyer flask along with 18 ml of additional GN6 liquid medium. Cells were allowed to recover in the dark for 2 hours on a shaker (125 rpm, 28 °C).
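The whisker dose per tube follows directly from the stated suspension concentration and pipetted volume. The calculation is sketched below; the variable names are illustrative, and the DNA-to-whisker ratio is a derived figure, not one reported in the text.

```python
# Values stated in the protocol.
whisker_conc_mg_per_ml = 50.0  # silicon carbide whisker suspension
whisker_volume_ul = 160.0      # volume added per centrifuge tube
dna_ug = 20.0                  # plasmid DNA added per tube

# Mass of whiskers delivered per tube (mg): concentration x volume.
whisker_mass_mg = whisker_conc_mg_per_ml * whisker_volume_ul / 1000.0

# Derived ratio of DNA to whiskers (ug DNA per mg whiskers).
dna_per_mg_whisker = dna_ug / whisker_mass_mg

print(f"whiskers per tube: {whisker_mass_mg:.1f} mg")
print(f"DNA per mg whiskers: {dna_per_mg_whisker:.1f} ug")
```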

Approximately 5-6 mL of the dispersed suspension was filtered onto Whatman #4 filter paper (5.5 cm) using a glass cell collector unit connected to an in-house vacuum line, so that 5-6 filters were obtained for each sample. The filters were placed onto 60x20 mm plates of GN6 medium and incubated at 28 °C in the dark. After 24, 48, or 72 hours, cells from 2-5 filters were scraped off, collected into tubes, placed on dry ice, and then frozen at -80 °C.

In another example of DNA delivery, purified endonuclease-free plasmid preparations were delivered directly to maize cells using microprojectile bombardment techniques adapted from the instrument manufacturer's protocol. All bombardments were performed with the Biolistic PDS-1000/He™ system (Bio-Rad Laboratories, Hercules, CA). For particle coating, 3 mg of 1.0 micron diameter gold particles were washed once with 100% ethanol and twice with sterile distilled water, and resuspended in 50 μl of water in siliconized Eppendorf tubes. Microgram quantities of plasmid DNA, 20 μl of spermidine (0.1 M) and 50 μl of calcium chloride (2.5 M) were added to the gold suspension. The mixture was incubated at room temperature for 10 minutes, pelleted at 10,000 rpm for 10 seconds, and resuspended in 60 μl of cold 100% ethanol, and 8-9 μl was dispensed onto each macrocarrier. To prepare the cells for bombardment, cell clusters were removed from the liquid culture 3 days after subculture and placed in a circle 2.5 cm in diameter on osmotic medium, consisting of growth medium supplemented with 0.256 M each of mannitol and sorbitol, in a petri dish. Cells were incubated on the osmotic medium for 4 hours prior to bombardment. Bombardment was carried out in the instrument described above by placing the tissue on the middle shelf, at 1100 psi under a vacuum of 27 inches of Hg, following the operating manual. At 24 hours post-treatment, the bombarded cell clusters were harvested, frozen in liquid nitrogen, and stored at -80 °C.

Another example of DNA delivery and ZFN transient expression in maize cells involves the use of protoplast preparations. Protoplasts were prepared from HiII maize cell cultures using a method modified from Mitchell and Petolino (1991) J. 530-536 and Lyznik et al. (1995) Plant J. 8 (2): 177-186. Suspension cultures were harvested 48 hours after subculture (mid-log growth) by centrifugation at 1000 rpm for 5 minutes. The medium was removed and 5 ml of PCV was washed in 10 ml of W5 medium (154 mM NaCl; 125 mM CaCl₂·H₂O; 5 mM KCl; 5 mM glucose; pH 5.8).

The washed cells were collected by centrifugation at 100 rpm for 5 minutes, followed by incubation in an enzyme mixture containing 3% cellulase Y-C + 0.3% pectolyase Y23 (Karlan Research Products Corp., Cottonwood, AZ) in 25 ml of filter-sterilized K3 medium (2.5 g KNO₃; 250 mg NH₄NO₃; 900 mg CaCl₂ (dihydrate); 250 mg MgSO₄; 250 mg (NH₄)₂SO₄; 150 mg NaPO₄ (monobasic); 250 mg xylose; 10 ml ferrous sulfate/chelate stock solution (F318); 1 ml B5 micronutrients (1000X stock: 750 mg potassium iodide; 250 mg molybdic acid (sodium salt) dihydrate; 25 mg cobalt chloride; 25 mg copper sulfate); 10 ml K3 vitamins (100X stock: 1 g myo-inositol; 10 mg pyridoxine hydrochloride; 100 mg thiamine hydrochloride; 10 mg nicotinic acid); + 0.6 M mannitol; pH 5.8). Cells were incubated at 25 °C for 5-6 hours with gentle agitation (50 rpm) to digest the secondary plant cell walls.

After cell wall degradation, the enzyme-cell mixture was filtered through a 100 micron cell strainer, and the flow-through containing protoplasts and cell debris was washed with an equal volume of K3 + 0.6 M mannitol medium. Protoplasts were centrifuged at 800 rpm for 5 minutes, the supernatant was discarded, and the wash was repeated. The protoplast pellet was washed and resuspended in 20 ml of K3 + 0.6 M mannitol + 9% Ficoll 400 solution. 10 ml of this solution was dispensed into each of 2 sterile plastic tubes, and 2 ml of TM medium (19.52 g MES; 36.45 g mannitol; 40 ml 2 M CaCl₂·H₂O stock; pH 5.5) was gently overlaid on the suspension, forming a discontinuous gradient.

Viable protoplasts were separated from non-viable protoplasts, cell debris and intact suspension cells by centrifugation at 800 rpm for 5 minutes. The distinct protoplast band formed at the gradient interface was removed with a pipette and washed with 10 ml of fresh TM solution, followed by centrifugation at 800 rpm for 5 minutes. The resulting protoplast pellet was resuspended in 1 ml of TM medium, and the number of viable protoplasts was quantified in a hemocytometer using 25mg/mg fluorescein diacetate (FDA) for staining. The protoplast solution was adjusted to a final concentration of 1x10⁷ protoplasts per ml in TM medium.

About 1x10⁶ protoplasts (100 μl) were transferred to a 2 ml Eppendorf tube containing 10-80 μg of purified plasmid DNA. 100 μl of a 40% PEG-3350 (Sigma Chemical Co., St. Louis, MO) solution was added dropwise and the suspension was gently mixed. The protoplast/DNA mixture was incubated for 30 minutes at room temperature, followed by dropwise dilution with 1 ml of GN6 growth medium. The diluted protoplasts were incubated in this medium for 24 hours at 25 °C, then harvested, frozen in liquid nitrogen, and stored at -80 °C.
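The density adjustment and aliquot size described above reduce to simple dilution arithmetic. The sketch below shows one way to compute the volume of TM medium to add after a hemocytometer count; the function name and the example count of 2.5x10⁷ protoplasts per ml are hypothetical, and the PEG figure ignores the unstated volume of the plasmid DNA.

```python
TARGET_DENSITY_PER_ML = 1e7  # final density stated in the protocol
ALIQUOT_ML = 0.1             # 100 ul of protoplasts per transfection


def tm_medium_to_add_ml(counted_per_ml, current_volume_ml,
                        target_per_ml=TARGET_DENSITY_PER_ML):
    """Volume of TM medium (ml) to add so the density falls to the target."""
    if counted_per_ml < target_per_ml:
        raise ValueError("suspension is already below the target density")
    total_protoplasts = counted_per_ml * current_volume_ml
    return total_protoplasts / target_per_ml - current_volume_ml


# Hypothetical count: 2.5e7 viable protoplasts per ml in the 1 ml resuspension.
added_ml = tm_medium_to_add_ml(2.5e7, 1.0)

# Protoplasts per 100 ul aliquot at the target density.
protoplasts_per_aliquot = TARGET_DENSITY_PER_ML * ALIQUOT_ML

# Approximate PEG concentration after adding 100 ul of 40% PEG to 100 ul of
# protoplasts (DNA volume not included, because it is not stated).
peg_final_percent = 40.0 * 100.0 / (100.0 + 100.0)

print(f"TM medium to add: {added_ml:.2f} ml")
print(f"protoplasts per aliquot: {protoplasts_per_aliquot:.0f}")
print(f"approximate final PEG concentration: {peg_final_percent:.0f}%")
```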

Example 17: In vivo ZFN functionality

The functionality of a ZFN is understood in this example to include, but not be limited to, the ability of the ZFN to be expressed in cells of a crop species and the ability of the ZFN to mediate a double-strand break in the endogenous genome of that crop by recognizing, binding and cleaving its desired target. It is also understood that, in this example, the target of the ZFN is a gene in its endogenous locus and conformation within the crop genome.

To assess whether the engineered ZFNs are functional against the predicted target gene in a genomic context, DNA sequence-based assays were developed. ZFN-induced double-stranded DNA breaks are predicted to induce repair mechanisms such as non-homologous end joining (NHEJ) (reviewed in Cahill et al. (2006) Mechanisms. Front Biosci. 1 (11): 1958-76). One consequence of NHEJ is that a proportion of broken DNA strands may be repaired imperfectly, resulting in small deletions, insertions or substitutions at the cleavage site. One skilled in the art can detect these changes in DNA sequence by a variety of methods.

A. PCR-based cloning and sequencing

In one example, maize HiII cultured cells expressing ZFN proteins were isolated 24 hours after transformation and frozen, and genomic DNA was extracted using the Qiagen (Valencia, CA) Plant DNeasy extraction kit according to the manufacturer's recommendations. PCR amplification was performed using oligonucleotide primers specific for the target gene and flanking the predicted ZFN cleavage site. The purified genomic DNA was amplified using a combination of a forward PCR primer (5'-GGA AGC ATT ATT CCA ATT TGA TGA TAA TGG-3') (SEQ ID NO: 135) and a reverse PCR primer (5'-CCC AAG TGT CGA GGT TGT CAA TAT GTT AC-3') (SEQ ID NO: 136) specific for the targeted paralog of the IPP2-K gene under the following conditions: a 25 μl reaction volume containing 20 ng of gDNA template, 20 pmol of each primer, 1% DMSO and 10 units of Accuprime Pf polymerase (Invitrogen, Carlsbad, Calif.) in the enzyme manufacturer's buffer. Amplification products of the expected size resulted from amplification cycles consisting of 95 ℃ for 1 minute; 30 cycles of (95 ℃ for 30 seconds, 61 ℃ for 30 seconds, 72 ℃ for 1 minute); 72 ℃ for 5 minutes; 4 ℃ hold.

The amplified fragments were cloned directly into the vector pCR2.1 (Invitrogen, Carlsbad, CA) using the TA cloning kit (Invitrogen, Carlsbad, CA). The isolated cloned fragments were sequenced in 96-well format using a CEQ dye terminator cycle sequencing kit (Beckman Coulter, Fullerton, Calif.) according to the manufacturer's recommendations. In this experiment, the ZFN proteins were predicted to bind to 2 short IPP2-K gene-specific sequences to generate a heterodimeric nuclease that cleaves the ds-DNA as shown in FIG. 66.

Analysis of the sequencing results from multiple clones revealed that clone #127 contained a small deletion just at the predicted ZFN cleavage site, indicating that the NHEJ mechanism has mediated imperfect repair of the DNA sequence at this site (fig. 67).

These results demonstrate the ability of these engineered ZFNs to induce targeted double-strand breaks at endogenous loci within a crop species in a specific manner.

B. Massively parallel sequencing analysis

In another example, a combination of PCR and massively parallel pyrosequencing was applied to interrogate the genomes of multiple cell samples expressing different ZFN proteins targeting this same sequence. Three variants of a forward PCR primer (5'-XXX CAC CAA GTT GTA TTG CCT TCT CA-3') (SEQ ID NO: 137) (where XXX = GGG, CCC, or GGC) and three variants of a reverse PCR primer (5'-XXX ATA GGC TTG AGC CAA GCA ATC TT-3') (SEQ ID NO: 138) (where XXX = GCC, CCG, or CGG) were synthesized (IDT, Coralville, IA). The 3 bp tag at the 5' end of each primer serves as an identifier key and indicates from which cell sample the amplicon originated. Primer pairs with matching identifier tags (keys) were used in combination to amplify purified genomic DNA derived from maize cell samples under the following conditions: a 50 μl reaction volume containing 40 ng of gDNA template, 20 pmol of each primer, 1% DMSO and 10 units of Accuprime Pf polymerase (Invitrogen, Carlsbad, Calif.) in the enzyme manufacturer's buffer. Amplification products of the expected size resulted from amplification cycles consisting of 95 ℃ for 1 minute; 30 cycles of (95 ℃ for 30 seconds, 61 ℃ for 30 seconds, 72 ℃ for 1 minute); 4 ℃ hold, and were purified using the MinElute PCR purification kit (Qiagen, Valencia, Calif.) according to the manufacturer's recommendations.

The PCR products were subjected directly to massively parallel pyrosequencing reactions (also known as 454 sequencing) by 454 Life Sciences (Branford, CT) as described in Margulies et al. (2005) Nature 437: 376-380. The 454 sequencing results were analyzed by identifying sequence reads that contain deletions of the expected size and position within the DNA molecule.
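The use of the 3 bp identifier keys can be illustrated with a minimal Python sketch that assigns a read to a cell sample by the keyed primer at its 5' end. The primer cores and tags are taken from SEQ ID NO: 137 and 138 above; the sample names, the pairing of forward and reverse tags in the order listed, and the function names are assumptions made only for illustration and are not part of the disclosure.

```python
# Primer cores from SEQ ID NO: 137 and 138 (identifier tag removed).
FORWARD_CORE = "CACCAAGTTGTATTGCCTTCTCA"
REVERSE_CORE = "ATAGGCTTGAGCCAAGCAATCTT"

# Assumption: tags are paired in the order listed in the text.
# Sample names are hypothetical placeholders.
SAMPLES = {
    "sample_1": ("GGG", "GCC"),
    "sample_2": ("CCC", "CCG"),
    "sample_3": ("GGC", "CGG"),
}


def build_key_index(samples):
    """Map each keyed primer (tag + core) to the sample it identifies."""
    index = {}
    for name, (fwd_tag, rev_tag) in samples.items():
        index[fwd_tag + FORWARD_CORE] = name
        index[rev_tag + REVERSE_CORE] = name
    return index


def assign_sample(read, index):
    """Return the sample whose keyed primer starts the read, or None."""
    read = read.upper()
    for keyed_primer, name in index.items():
        if read.startswith(keyed_primer):
            return name
    return None


if __name__ == "__main__":
    idx = build_key_index(SAMPLES)
    print(assign_sample("GGC" + FORWARD_CORE + "ACGTACGT", idx))
```

Requiring the full keyed primer, rather than the 3 bp tag alone, is a design choice in this sketch: it avoids assigning reads that merely happen to begin with one of the tag triplets.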
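As an illustration of this kind of read analysis, the following Python sketch finds a single contiguous deletion in a read relative to a reference amplicon and checks whether it lies near an expected cleavage position. It is a simplified stand-in and not the analysis actually used; the reference and read sequences, the size limit, and the position window are hypothetical values chosen only for the example.

```python
def find_single_deletion(reference, read):
    """Return (start, length) of one contiguous deletion in `read`
    relative to `reference`, or None if the read is not explained by
    exactly one deletion. When the placement is ambiguous, the
    rightmost equivalent start is reported."""
    if len(read) >= len(reference):
        return None
    # Length of the shared prefix.
    prefix = 0
    while prefix < len(read) and reference[prefix] == read[prefix]:
        prefix += 1
    # Length of the shared suffix, not overlapping the prefix in the read.
    suffix = 0
    while (suffix < len(read) - prefix
           and reference[-1 - suffix] == read[-1 - suffix]):
        suffix += 1
    if prefix + suffix != len(read):
        return None  # substitutions or multiple events are present
    return prefix, len(reference) - len(read)


def is_expected_deletion(deletion, cut_site, max_length=20, window=5):
    """True if the deletion is short and within `window` bp of the cut site.
    Both thresholds are illustrative assumptions."""
    if deletion is None:
        return False
    start, length = deletion
    end = start + length
    return length <= max_length and start - window <= cut_site <= end + window


if __name__ == "__main__":
    reference = "AAAACCCCGGGGTTTT"   # hypothetical amplicon
    read = "AAAACCCGGGTTTT"          # hypothetical read lacking 2 bp
    deletion = find_single_deletion(reference, read)
    print(deletion, is_expected_deletion(deletion, cut_site=8))
```

A real pipeline would normally use a gapped aligner so that sequencing errors elsewhere in the read do not mask a deletion; the prefix/suffix comparison here only conveys the idea.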

The results of these analyses indicated that there were multiple small deletions at the expected cut sites of these ZFNs, as shown in fig. 68. These deletions mapped precisely to the ZFN target sites and indicated the creation of double-stranded breaks in the genome induced by the ZFNs, followed by repair by NHEJ. These results further demonstrate the ability of these engineered ZFNs to induce targeted double-strand breaks at endogenous loci within a crop species in a specific manner.

Example 18: donor DNA design for targeted integration

In this example, donor DNA is understood to include double stranded DNA molecules that are delivered into plant cells and incorporated into the nuclear genome. The mechanism by which this incorporation occurs may be via homology-independent non-homologous end joining at the site of the double-strand break in the nuclear DNA (NHEJ; for review see Cahill et al, (2006) Mechanisms Front biosci.1: 1958-76) or another similar mechanism. Such NHEJ driven, ligation-like incorporation of donor DNA into the genome is referred to as random integration, since the integration position of the donor DNA is mainly determined by the presence of double stranded DNA breaks. In this mechanism, integration of the donor DNA into the genome is independent of the nucleotide sequence at the site of the disruption in the genome or the nucleotide sequence of the donor itself. Thus, during random integration, the "address" at which the donor DNA is incorporated into the genome is not specified or predicted based on the donor DNA sequence. Random integration is the primary mechanism by which donor DNA gene transfer occurs during standard plant transformation via agrobacterium-mediated or biolistic-mediated delivery of DNA into living plant cells.

In contrast to random integration, donor DNA can also be incorporated into the genome via targeted integration. Targeted integration is understood to occur at the site (position) of the double strand break via homology-dependent mechanisms such as homology-dependent single-strand annealing or homologous recombination (reviewed in van den Bosch et al (2002) Biol chem.383 (6): 873-892). In the case of homology-dependent DNA break repair, donor DNA comprising a nucleotide sequence that has identity or similarity to the DNA at the break site may be incorporated at this site. Thus, the "address" at which the donor DNA is integrated into the genome depends on the nucleotide sequence identity or sequence similarity between the genome and the donor DNA molecule. In plant systems, repair of double-stranded breaks in DNA is known to utilize both NHEJ and homology-dependent pathways (for review see Puchta (2005) j. exp. bot.56: 1-14).

In this example, we describe the design and construction of donor DNA molecules to be integrated into the genome via targeted integration at the site of a double strand break induced by a sequence-specific ZFN protein. Different ZFN proteins can induce double strand breaks at different nucleotides in the target gene sequence; the specific site of the induced double strand break is called the position.

As described in example 13, we have characterized the nucleotide sequence of the target gene IPP2K from maize. Subsequently, we designed ZFN proteins that bind to specific bases of the target gene (example 14) and demonstrated their binding/cleavage activity at this sequence within the target gene in two heterologous systems and against endogenous genes in maize cells (examples 15-17). Here, we describe the construction of various donor molecules designed for incorporation into the maize genome via targeted integration at the ZFN-mediated double-strand break position in the IPP2K gene. For any location in any genome where the nucleotide sequence is known and predicted to comprise a double-strand break, one skilled in the art can construct donor DNA molecules designed to be incorporated into ZFN-induced double-strand breaks via homology-driven targeted integration.

In one embodiment described herein, the donor DNA molecule comprises an autonomous herbicide-tolerance gene expression cassette bounded by segments of nucleotide sequence identical to that of the target gene IPP2K at the targeted position. In this embodiment, the autonomous herbicide tolerance cassette is understood to include a complete promoter-transcription unit (PTU) comprising a promoter, an herbicide tolerance gene, and a terminator sequence known to be functional in plant cells. One skilled in the art can select any combination of promoter, gene, and terminator to constitute the autonomous PTU. Also included on the plasmid construct are DNA fragments with sequence identity to the target gene (IPP2K) at the indicated position in maize. These fragments serve as the "homology flanks" of the donor DNA and direct incorporation of the donor into the target gene at the specified position via targeted integration. The homology flanks are placed both upstream and downstream of the PTU in the correct 5'-3' orientation relative to the PTU. One skilled in the art can envision homology flanks of varying size and orientation in a donor DNA construct.

In another embodiment described herein, the donor DNA molecule comprises a plasmid construct containing a non-autonomous herbicide-tolerance gene expression cassette bounded by segments of nucleotide sequence identical to that of IPP2K at the targeted position. In this embodiment, the non-autonomous herbicide tolerance cassette is understood to include an incomplete promoter-transcription unit (PTU) that lacks a functional promoter. The non-autonomous PTU does contain an herbicide tolerance gene and a terminator sequence known to be functional in plant cells. One skilled in the art can select any combination of gene and terminator to constitute a non-autonomous PTU. In this example of a non-autonomous donor, expression of the herbicide tolerance gene depends on incorporation of the donor segment into a genomic location proximal to a functional promoter that can drive expression of that gene. One can envision the relatively rare situation in which the donor incorporates, via random integration, into a genetic locus where a promoter happens to reside and is available to drive expression of the herbicide tolerance gene. Alternatively, because the donor DNA construct carries homology flanks of suitable length with sequence identity to the target gene at a specified position in maize, precise targeted integration of the donor DNA into the target gene at that position can occur (as described for the autonomous donor) and thereby exploit the endogenous promoter of the target gene. In this embodiment, the homology flanks are placed both upstream and downstream of the PTU in the correct 5'-3' orientation relative to the PTU. One skilled in the art can envision homology flanks of varying size and orientation in a donor DNA construct.

In both embodiments described herein (autonomous and non-autonomous donor designs), the plasmid constructs typically contain additional elements to enable cloning, expression of the herbicide tolerance gene, and subsequent analysis. Such elements include bacterial origins of replication, engineered restriction sites, and the like, and are described below. One skilled in the art can envision the use of different elements comprising a donor DNA molecule.

A. Bacterial strains and culture conditions

Escherichia coli strains (One Shot Top10 chemically competent cells; MAX Efficiency DH5α chemically competent cells; Invitrogen Life Technologies, Carlsbad, CA) were grown at 37 ℃ for 16 hours using Luria-Bertani broth (LB: 10 g/L Bacto tryptone, 10 g/L NaCl, 5 g/L Bacto yeast extract), LB agar (LB broth plus 15 g/L Bacto agar), or Terrific broth (TB: 12 g/L Bacto tryptone, 24 g/L Bacto yeast extract, 0.4% v/v glycerol, 17 mM KH₂PO₄, 72 mM K₂HPO₄). Liquid cultures were shaken at 200 rpm. Chloramphenicol (50 μg/ml), kanamycin (50 μg/ml), or ampicillin (100 μg/ml) was added to the media as required. All antibiotics, culture media and buffer reagents used in this study were purchased from Sigma-Aldrich Corporation (St. Louis, MO) or Difco Laboratories (Detroit, MI).

B. Plasmid backbone position-1

The plasmid backbone containing the homologous flank of position-1 of IPP2K was engineered to allow integration of any donor DNA sequence into the corresponding target site of the IPP2K gene. One skilled in the art is aware of plasmid backbones that use various cloning sites, modular design elements, and sequences that are homologous to any target sequence within the genome of interest. The plasmid backbone exemplified here starts from the basic plasmid vector pBC SK (-) phagemid (3.4Kbp) (Stratagene, La Jolla, Calif.). The position-1 plasmid backbone was constructed using the 4-step synthesis described below.

In step #1, the base plasmid was prepared. μg of pBC SK (-) was linearized for 1 hour at 37 ℃ using 10 units of Spe I and 10 units of Not I (New England Biolabs, Beverly, Mass.) restriction endonucleases. Restricted DNA was electrophoresed at 100 V for 1 hour in a 1.0% TAE (0.04 M Tris-acetate, 0.002 M EDTA) agarose gel supplemented with 0.5% ethidium bromide (Sigma-Aldrich Corporation, St. Louis, Mo.). The DNA fragments were visualized with UV light and the fragment size was estimated by comparison with a 1 Kbp DNA ladder (Invitrogen Life Technologies, Carlsbad, Calif.). The 3.4 Kbp Spe I/Not I-digested subcloning vector pBC SK (-) was gel-excised and purified using the QIAquick Gel Extraction Kit (QIAGEN Inc., Valencia, Calif.) according to the manufacturer's instructions.

In step #2, the 5' and 3' homology flanks from position-1 of IPP2K were isolated. The following oligonucleotide primers were synthesized by Integrated DNA Technologies, Inc. (Coralville, IA) under standard desalting conditions and diluted with water to a concentration of 0.125 μg/μl:

5'-GCGGCCGCGTCTCACCGCGGCTTGGGGATTGGATACGGAGCT-3' (SEQ ID NO: 143)

5'-ACTAGTGATATGGCCCCACAGGAGTTGCTCATGACTTG-3' (SEQ ID NO: 144)

5'-ACTAGTCCAGAACTGGTTGAGTCGGTCAAACAAGATTGCT-3' (SEQ ID NO: 145)

5'-GTCGACCTTGATGCTACCCATTGGGCTGTTGT-3' (SEQ ID NO: 146).
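Each of these primers begins with a restriction enzyme recognition sequence that matches the enzymes used in the later digests (Not I, Spe I and Sal I). The following Python sketch, offered only as an illustration, reports which site sits at the 5' end of each primer. The recognition sequences are the standard ones for these enzymes; the function and variable names are hypothetical.

```python
# Standard recognition sequences for the enzymes used in this example.
RECOGNITION_SITES = {
    "Not I": "GCGGCCGC",
    "Spe I": "ACTAGT",
    "Sal I": "GTCGAC",
}

# Position-1 primers, keyed by SEQ ID NO, as listed in the text.
PRIMERS = {
    143: "GCGGCCGCGTCTCACCGCGGCTTGGGGATTGGATACGGAGCT",
    144: "ACTAGTGATATGGCCCCACAGGAGTTGCTCATGACTTG",
    145: "ACTAGTCCAGAACTGGTTGAGTCGGTCAAACAAGATTGCT",
    146: "GTCGACCTTGATGCTACCCATTGGGCTGTTGT",
}


def five_prime_site(primer):
    """Return the enzyme whose recognition site starts the primer, if any."""
    primer = primer.upper()
    for enzyme, site in RECOGNITION_SITES.items():
        if primer.startswith(site):
            return enzyme
    return None


if __name__ == "__main__":
    for seq_id, primer in PRIMERS.items():
        print(f"SEQ ID NO: {seq_id} -> {five_prime_site(primer)}")
```

The result (Not I and Spe I on SEQ ID NO: 143 and 144, Spe I and Sal I on SEQ ID NO: 145 and 146) is consistent with the Spe I/Not I digest described for the 5' homology flank clone and the Spe I/Sal I digest described for the 3' homology flank clone, although the text does not state the primer pairing explicitly.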

PCR amplification reactions were carried out using reagents supplied by TaKaRa Biotechnology Inc. (Seta 3-4-1, Otsu, Shiga, 520-2193, Japan) and consisted of the following: 5 μl of 10× LA PCR Buffer II (Mg²⁺), 20 ng of double-stranded gDNA template (maize HiII), 10 pmol of forward oligonucleotide primer, 10 pmol of reverse oligonucleotide primer, 8 μl of dNTP mix (2.5 mM each), 33.5 μl of H₂O, 0.5 μl (2.5 units) of TaKaRa LA Taq DNA polymerase, and 1 drop of mineral oil. PCR reactions were performed using a Perkin-Elmer Cetus 48-sample DNA thermal cycler (Norwalk, CT) under the following cycling conditions: 94 ℃ for 4 minutes/1 cycle; 98 ℃ for 20 seconds, 65 ℃ for 1 minute, 68 ℃ for 1 minute/30 cycles; 72 ℃ for 5 minutes/1 cycle; 4 ℃/hold. 15 μl of each PCR reaction was electrophoresed at 100 V for 1 hour in a 1.0% TAE agarose gel supplemented with 0.5% ethidium bromide. The amplified fragments were visualized with UV light and the fragment size was estimated by comparison to a 1 Kbp DNA ladder. The expected amplification products were identified by the presence of a DNA fragment of 0.821 Kbp (5' homology flank) or 0.821 Kbp (3' homology flank).

These fragments were gel-excised and purified using the QIAquick Gel Extraction Kit (QIAGEN Inc., Valencia, CA) according to the manufacturer's instructions. The purified fragments were then cloned into the pCR2.1 plasmid using the TOPO TA Cloning Kit (with the pCR2.1 vector) and One Shot TOP10 chemically competent E. coli cells (Invitrogen Life Technologies, Carlsbad, Calif.) according to the manufacturer's protocol.

Individual colonies were inoculated into 14 ml Falcon tubes (Becton-Dickinson, Franklin Lakes, N.J.) containing 2 ml of TB supplemented with 50 μl/ml kanamycin and incubated at 37 ℃ for 16 hours with shaking at 200 rpm. After incubation, 1.5 ml of cells were transferred to a 1.7 ml Costar microcentrifuge tube (Fisher Scientific, Pittsburgh, Pa.) and pelleted at 16,000 × g for 1 minute. The supernatant was removed and plasmid DNA was isolated as described above using the NucleoSpin Plasmid Kit (BD Biosciences/Clontech/Macherey-Nagel, Palo Alto, Calif.). μg of plasmid isolated from the 5' homology flank clone was digested with 10 units of Spe I and Not I. The 3' homology flank clone plasmid was digested with 10 units of Spe I and 20 units of Sal I (New England Biolabs, Beverly, Mass.). All plasmid digests were incubated at 37 ℃ for 1 hour. Restricted DNA was electrophoresed for 1 hour at 100 V in a 1.0% TAE agarose gel supplemented with 0.5% ethidium bromide. The fragments were visualized with UV light and the fragment size was estimated by comparison to a 1 Kbp DNA ladder. The expected plasmid clones were identified by the presence of an inserted DNA fragment of 0.821 Kbp (5' homology flank) or 0.821 Kbp (3' homology flank) in addition to the 3.9 Kbp pCR2.1 vector.

Double-stranded sequencing reactions of plasmid clones were performed as described by the manufacturer using the CEQ DTCS-Quick Start Kit (Beckman-Coulter, Palo Alto, Calif.). The reactions were purified using Performa DTR Gel Filtration Cartridges (Edge BioSystems, Gaithersburg, Md.) as described in the manufacturer's protocol. Sequence reactions were analyzed on a Beckman-Coulter CEQ 2000XL DNA Analysis System, and nucleotide characterization was performed using Sequencher version 4.1.4 (Gene Codes Corporation, Ann Arbor, MI). The sequence of the 0.821 Kbp fragment corresponding to the position-1 5' homology flank derived from IPP2K is shown in FIG. 87 (SEQ ID NO: 171). The sequence of the 0.821 Kbp fragment corresponding to the position-1 3' homology flank derived from IPP2K is shown in FIG. 88 (SEQ ID NO: 172).
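As a quick arithmetic check on this protocol, the Python sketch below totals the programmed hold times of the cycling conditions and the reagent volumes that the text states explicitly. Ramp times between temperatures are ignored, and the template and primer volumes are not given in the text, so the volume figure covers only the listed components. All names are illustrative.

```python
# Cycling program for the position-1 homology flank PCR, as stated in the text.
INITIAL_DENATURATION_S = 4 * 60   # 94 C, 1 cycle
CYCLE_STEPS_S = [20, 60, 60]      # 98 C, 65 C, 68 C
CYCLES = 30
FINAL_EXTENSION_S = 5 * 60        # 72 C, 1 cycle

# Reagent volumes (microliters) stated explicitly in the text.
LISTED_VOLUMES_UL = {
    "10x LA PCR Buffer II": 5.0,
    "dNTP mix": 8.0,
    "H2O": 33.5,
    "LA Taq polymerase": 0.5,
}


def program_seconds(initial_s, cycle_steps_s, cycles, final_s):
    """Total programmed hold time, excluding ramping and the 4 C hold."""
    return initial_s + cycles * sum(cycle_steps_s) + final_s


def listed_volume_ul(volumes):
    """Sum of the reagent volumes that are given in the protocol."""
    return sum(volumes.values())


if __name__ == "__main__":
    total = program_seconds(INITIAL_DENATURATION_S, CYCLE_STEPS_S,
                            CYCLES, FINAL_EXTENSION_S)
    print(f"programmed time: {total} s ({total / 60:.0f} min)")
    print(f"listed reagent volume: {listed_volume_ul(LISTED_VOLUMES_UL)} ul")
```

Under these assumptions the program holds for 4740 seconds (79 minutes), and the listed reagents account for 47 μl before template and primers are added.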

In step #3, the position-1 5' homology flank was ligated into the base plasmid. Restriction fragments corresponding to clones containing the correct position-1 5' homology flank sequence were gel-excised and purified using the QIAquick Gel Extraction Kit (QIAGEN Inc., Valencia, CA) according to the manufacturer's instructions. The fragment corresponding to the position-1 5' homology flank (0.821 Kbp) was then ligated to the purified base plasmid digested with Spe I/Not I (step #1) using 500 units of T4 DNA ligase (Invitrogen Life Technologies, Carlsbad, Calif.) at a 1:5 vector:insert ratio in a 20 μl reaction volume incubated for 16 hours in a 16 ℃ water bath. Subsequently, 5 μl of the ligation reaction was transformed into E. coli One Shot Top10 chemically competent cells (Invitrogen Life Technologies, Carlsbad, Calif.) and plated under the selection conditions described by the manufacturer. Individual colonies were inoculated into 14 ml Falcon tubes (Becton-Dickinson, Franklin Lakes, N.J.) containing 2 ml of TB supplemented with 50 μl/ml kanamycin and incubated at 37 ℃ for 16 hours with shaking at 200 rpm.

After incubation, 1.5 ml of cells were transferred to a 1.7 ml Costar microcentrifuge tube (Fisher Scientific, Pittsburgh, Pa.) and pelleted at 16,000 × g for 1 minute. The supernatant was removed and plasmid DNA was isolated as described above using the NucleoSpin Plasmid Kit (BD Biosciences/Clontech/Macherey-Nagel, Palo Alto, Calif.). μg of the isolated plasmid DNA was digested with 10 units of Spe I and Not I (New England Biolabs, Beverly, Mass.) and incubated at 37 ℃ for 1 hour. Restricted DNA was electrophoresed for 1 hour at 100 V in a 1.0% TAE agarose gel supplemented with 0.5% ethidium bromide. The fragments were visualized with UV light and the fragment size was estimated by comparison to a 1 Kbp DNA ladder. The expected plasmid clones were identified by the presence of an inserted DNA fragment of 0.821 Kbp (5' homology flank) in addition to the 3.4 Kbp base plasmid.

In step #4, the position-1 3' homology flank was ligated into the step #3 product. μg of the engineered product from step #3 was linearized for 1 hour at 37 ℃ using 10 units of Spe I and 20 units of Sal I (New England Biolabs, Beverly, Mass.) restriction endonucleases. Restricted DNA was electrophoresed at 100 V for 1 hour in a 1.0% TAE (0.04 M Tris-acetate, 0.002 M EDTA) agarose gel supplemented with 0.5% ethidium bromide (Sigma-Aldrich Corporation, St. Louis, Mo.). The DNA fragments were visualized with UV light and the fragment size was estimated by comparison with a 1 Kbp DNA ladder (Invitrogen Life Technologies, Carlsbad, Calif.). The approximately 4.2 Kbp Spe I/Sal I-digested product from step #3 was gel-excised and purified using the QIAquick Gel Extraction Kit (QIAGEN Inc., Valencia, CA) according to the manufacturer's instructions.

The isolated fragment of the 3' homology flank donor generated in step #2 (0.821 Kbp) was then combined with the step #3 product that had been digested with Spe I/Sal I and purified as described above, in a 20 μl ligation reaction using a 1:5 vector:insert ratio and 500 units of T4 DNA ligase (Invitrogen Life Technologies, Carlsbad, Calif.). The ligation reaction was incubated for 16 hours in a 16 ℃ water bath. After ligation, 5 μl of the ligation reaction was transformed into MAX Efficiency DH5α chemically competent cells (Invitrogen Life Technologies, Carlsbad, Calif.) according to the manufacturer's recommendations. Individual colonies were inoculated into 14 ml Falcon tubes (Becton-Dickinson, Franklin Lakes, N.J.) containing 2 ml of TB supplemented with 50 μl/ml chloramphenicol.

The cultures were incubated at 37 ℃ for 16 hours with shaking at 200 rpm. After incubation, 1.5 ml of cells were transferred to a 1.7 ml Costar microcentrifuge tube (Fisher Scientific, Pittsburgh, Pa.) and pelleted at 16,000 × g for 1 minute. The supernatant was removed and plasmid DNA was isolated as described above using the NucleoSpin Plasmid Kit (BD Biosciences/Clontech/Macherey-Nagel, Palo Alto, Calif.). μg of the isolated plasmid was digested with 10 units of Sal I and Not I (New England Biolabs, Beverly, Mass.) and incubated for 1 hour at 37 ℃. Restricted DNA was electrophoresed for 1 hour at 100 V in a 1.0% TAE agarose gel supplemented with 0.5% ethidium bromide. The fragments were visualized with UV light and the fragment size was estimated by comparison to a 1 Kbp DNA ladder. The expected clones were identified by the presence of two DNA fragments of 1.64 Kbp (insert) and 3.33 Kbp (base plasmid). The resulting plasmid was designated pDAB7471 (FIG. 70).
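The 1:5 vector:insert ratio can be turned into DNA masses with the usual length-scaling relationship, sketched below in Python. This assumes the ratio is a molar ratio, which the text does not state, and it uses a hypothetical 100 ng of vector because the text gives no DNA masses for the ligation.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, insert_to_vector=5.0):
    """Mass of insert needed for a given molar insert:vector ratio.

    Moles of a DNA fragment scale with mass / length, so
    insert_ng = vector_ng * ratio * insert_bp / vector_bp.
    """
    return vector_ng * insert_to_vector * insert_bp / vector_bp


if __name__ == "__main__":
    # 3.4 Kbp base plasmid and 0.821 Kbp flank are from the text;
    # the 100 ng vector mass is a hypothetical value.
    mass = insert_mass_ng(vector_ng=100, vector_bp=3400, insert_bp=821)
    print(f"{mass:.1f} ng of insert per 100 ng of vector")
```

Because the insert is roughly a quarter of the length of the vector, a fivefold molar excess corresponds to only slightly more insert than vector by mass.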

C. Plasmid backbone position-2

The plasmid backbone containing the homologous flank of position-2 of IPP2K was engineered to allow integration of any donor DNA sequence into the corresponding target site of the IPP2K gene. One skilled in the art is aware of plasmid backbones that use various cloning sites, modular design elements, and sequences that are homologous to any target sequence within the genome of interest. The plasmid backbone exemplified here starts from the basic plasmid vector pBC SK (-) phagemid (3.4Kbp) (Stratagene, La Jolla, Calif.). The position-2 plasmid backbone was constructed using the 4-step synthesis described below.

In step #2, the 5' and 3' homology flanks from position-2 of IPP2K were isolated. The following oligonucleotide primers were synthesized by Integrated DNA Technologies, Inc. (Coralville, IA) under standard desalting conditions and diluted with water to a concentration of 0.125 μg/μl:

5'-GCGGCCGCTAGATAGCAGATGCAGATTGCT-3' (SEQ ID NO: 147)

5'-ACTAGTATTGGCACCCAGGTGTTGGCTCA-3' (SEQ ID NO: 148)

5'-ACTAGTCATGTCGATGGTGGGGTATGGTTCAGATTCAG-3' (SEQ ID NO: 149)

5'-GTCGACGTACAATGATTTCAGGTTACGGCCTCAGGAC-3' (SEQ ID NO: 150).

PCR amplification reactions were carried out using reagents supplied by TaKaRa Biotechnology Inc. (Seta 3-4-1, Otsu, Shiga, 520-2193, Japan) and consisted of the following: 5 μl of 10× LA PCR Buffer II (Mg²⁺), 20 ng of double-stranded gDNA template (maize HiII), 10 pmol of forward oligonucleotide primer, 10 pmol of reverse oligonucleotide primer, 8 μl of dNTP mix (2.5 mM each), 33.5 μl of H₂O, 0.5 μl (2.5 units) of TaKaRa LA Taq DNA polymerase, and 1 drop of mineral oil. PCR reactions were performed using a Perkin-Elmer Cetus 48-sample DNA thermal cycler (Norwalk, CT) under the following cycling conditions: 94 ℃ for 4 minutes/1 cycle; 98 ℃ for 20 seconds, 55 ℃ for 1 minute, 68 ℃ for 1 minute/30 cycles; 72 ℃ for 5 minutes/1 cycle; 4 ℃/hold. 15 μl of each PCR reaction was electrophoresed at 100 V for 1 hour in a 1.0% TAE agarose gel supplemented with 0.5% ethidium bromide. The amplified fragments were visualized with UV light and the fragment size was estimated by comparison to a 1 Kbp DNA ladder. The expected amplification products were identified by the presence of a DNA fragment of 0.855 Kbp (5' homology flank) or 0.845 Kbp (3' homology flank). These fragments were gel-excised and purified using the QIAquick Gel Extraction Kit (QIAGEN Inc., Valencia, CA) according to the manufacturer's instructions. The purified fragments were then cloned into the pCR2.1 plasmid using the TOPO TA Cloning Kit (with the pCR2.1 vector) and One Shot TOP10 chemically competent E. coli cells (Invitrogen Life Technologies, Carlsbad, Calif.) according to the manufacturer's protocol.

Individual colonies were inoculated into 14 ml Falcon tubes (Becton-Dickinson, Franklin Lakes, N.J.) containing 2 ml of TB supplemented with 50 μl/ml kanamycin and incubated at 37 ℃ for 16 hours with shaking at 200 rpm. After incubation, 1.5 ml of cells were transferred to a 1.7 ml Costar microcentrifuge tube (Fisher Scientific, Pittsburgh, Pa.) and pelleted at 16,000 × g for 1 minute. The supernatant was removed and plasmid DNA was isolated as described above using the NucleoSpin Plasmid Kit (BD Biosciences/Clontech/Macherey-Nagel, Palo Alto, Calif.). μg of plasmid isolated from the 5' homology flank clone was digested with 10 units of Spe I and Not I. The 3' homology flank clone plasmid was digested with 10 units of Spe I and 20 units of Sal I (New England Biolabs, Beverly, Mass.). All plasmid digests were incubated at 37 ℃ for 1 hour.

Restricted DNA was electrophoresed for 1 hour at 100 V in a 1.0% TAE agarose gel supplemented with 0.5% ethidium bromide. The fragments were visualized with UV light and the fragment size was estimated by comparison to a 1 Kbp DNA ladder. The expected plasmid clones were identified by the presence of an inserted DNA fragment of 0.855 Kbp (5' homology flank) or 0.845 Kbp (3' homology flank) in addition to the 3.9 Kbp pCR2.1 vector.

Double-stranded sequencing reactions of plasmid clones were performed as described by the manufacturer using the CEQ DTCS-Quick Start Kit (Beckman-Coulter, Palo Alto, Calif.). The reactions were purified using Performa DTR Gel Filtration Cartridges (Edge BioSystems, Gaithersburg, Md.) as described in the manufacturer's protocol. Sequence reactions were analyzed on a Beckman-Coulter CEQ 2000XL DNA Analysis System, and nucleotide characterization was performed using Sequencher version 4.1.4 (Gene Codes Corporation, Ann Arbor, MI). The sequence of the 0.855 Kbp fragment corresponding to the position-2 5' homology flank derived from IPP2K is shown in FIG. 89 (SEQ ID NO: 139). The sequence of the 0.845 Kbp fragment corresponding to the position-2 3' homology flank derived from IPP2K is shown in FIG. 90 (SEQ ID NO: 140).

In step #3, the position-2 5' homology flank was ligated into the base plasmid. Restriction fragments corresponding to clones containing the correct position-2 5' homology flank sequence were gel-excised and purified using the QIAquick Gel Extraction Kit (QIAGEN Inc., Valencia, CA) according to the manufacturer's instructions. The fragment corresponding to the position-2 5' homology flank (0.855 Kbp) was then ligated to the purified base plasmid digested with Spe I/Not I (step #1) using 500 units of T4 DNA ligase (Invitrogen Life Technologies, Carlsbad, Calif.) at a 1:5 vector:insert ratio in a 20 μl reaction volume incubated for 16 hours in a 16 ℃ water bath.

Subsequently, 5 μl of the ligation reaction was transformed into E. coli One Shot Top10 chemically competent cells (Invitrogen Life Technologies, Carlsbad, Calif.) and plated under the selection conditions described by the manufacturer. Individual colonies were inoculated into 14 ml Falcon tubes (Becton-Dickinson, Franklin Lakes, N.J.) containing 2 ml of TB supplemented with 50 μl/ml kanamycin and incubated at 37 ℃ for 16 hours with shaking at 200 rpm. After incubation, 1.5 ml of cells were transferred to a 1.7 ml Costar microcentrifuge tube (Fisher Scientific, Pittsburgh, Pa.) and pelleted at 16,000 × g for 1 minute. The supernatant was removed and plasmid DNA was isolated as described above using the NucleoSpin Plasmid Kit (BD Biosciences/Clontech/Macherey-Nagel, Palo Alto, Calif.). μg of the isolated plasmid DNA was digested with 10 units of Spe I and Not I (New England Biolabs, Beverly, Mass.) and incubated at 37 ℃ for 1 hour. Restricted DNA was electrophoresed for 1 hour at 100 V in a 1.0% TAE agarose gel supplemented with 0.5% ethidium bromide. The fragments were visualized with UV light and the fragment size was estimated by comparison to a 1 Kbp DNA ladder. The expected plasmid clones were identified by the presence of an inserted DNA fragment of 0.855 Kbp (5' homology flank) in addition to the 3.4 Kbp base plasmid.

In step #4, the position-2 3' homology flank was ligated into the step #3 product. μg of the engineered product from step #3 was linearized for 1 hour at 37 ℃ using 10 units of Spe I and 20 units of Sal I (New England Biolabs, Beverly, Mass.) restriction endonucleases. Restricted DNA was electrophoresed at 100 V for 1 hour in a 1.0% TAE (0.04 M Tris-acetate, 0.002 M EDTA) agarose gel supplemented with 0.5% ethidium bromide (Sigma-Aldrich Corporation, St. Louis, Mo.). The DNA fragments were visualized with UV light and the fragment size was estimated by comparison with a 1 Kbp DNA ladder (Invitrogen Life Technologies, Carlsbad, Calif.). The 4.25 Kbp Spe I/Sal I-digested product from step #3 was gel-excised and purified using the QIAquick Gel Extraction Kit (QIAGEN Inc., Valencia, CA) according to the manufacturer's instructions.

The isolated fragment of the 3' homology flank donor generated in step #2 (0.845 Kbp) was then combined with the step #3 product that had been digested with Spe I/Sal I and purified as described above, in a 20 μl ligation reaction using a 1:5 vector:insert ratio and 500 units of T4 DNA ligase (Invitrogen Life Technologies, Carlsbad, Calif.). The ligation reaction was incubated for 16 hours in a 16 ℃ water bath. After ligation, 5 μl of the ligation reaction was transformed into MAX Efficiency DH5α chemically competent cells (Invitrogen Life Technologies, Carlsbad, Calif.) according to the manufacturer's recommendations. Individual colonies were inoculated into 14 ml Falcon tubes (Becton-Dickinson, Franklin Lakes, N.J.) containing 2 ml of TB supplemented with 50 μl/ml chloramphenicol. The cultures were incubated at 37 ℃ for 16 hours with shaking at 200 rpm. After incubation, 1.5 ml of cells were transferred to a 1.7 ml Costar microcentrifuge tube (Fisher Scientific, Pittsburgh, Pa.) and pelleted at 16,000 × g for 1 minute. The supernatant was removed and plasmid DNA was isolated as described above using the NucleoSpin Plasmid Kit (BD Biosciences/Clontech/Macherey-Nagel, Palo Alto, Calif.). μg of the isolated plasmid was digested with 10 units of Sal I and Not I (New England Biolabs, Beverly, Mass.) and incubated at 37 ℃ for 1 hour. Restricted DNA was electrophoresed for 1 hour at 100 V in a 1.0% TAE agarose gel supplemented with 0.5% ethidium bromide. The fragments were visualized with UV light and the fragment size was estimated by comparison to a 1 Kbp DNA ladder. The expected clones were identified by the presence of two DNA fragments of approximately 1.7 Kbp (insert) and 3.33 Kbp (base plasmid). The resulting plasmid was designated pDAB7451 (FIG. 71).
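The fragment sizes used to judge the clones in sections B and C follow from simple addition of the component lengths, as the Python sketch below illustrates. The lengths in base pairs are the rounded Kbp values given in the text; the function and key names are hypothetical.

```python
BASE_PLASMID_BP = 3400  # pBC SK (-), 3.4 Kbp as stated in the text

# (5' flank, 3' flank) lengths in bp, from the Kbp values in the text.
FLANKS_BP = {
    "position-1": (821, 821),
    "position-2": (855, 845),
}


def expected_sizes(five_prime_bp, three_prime_bp, base_bp=BASE_PLASMID_BP):
    """Expected sizes (bp) of the step #3 product and of the combined
    flank insert released from the final plasmid by Sal I/Not I."""
    return {
        "step3_product": base_bp + five_prime_bp,
        "combined_insert": five_prime_bp + three_prime_bp,
    }


if __name__ == "__main__":
    for position, (five, three) in FLANKS_BP.items():
        print(position, expected_sizes(five, three))
```

The sums (4221 bp and 1642 bp for position-1; 4255 bp and 1700 bp for position-2) agree with the approximately 4.2 Kbp and 1.64 Kbp, and the 4.25 Kbp and approximately 1.7 Kbp, fragments reported above.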

D. Construction of autonomous herbicide-tolerant gene expression cassette

An autonomous herbicide tolerant gene expression cassette comprising an intact promoter-transcription unit (PTU) comprising a promoter, an herbicide tolerant gene, and a polyadenylation (polya) termination sequence was constructed (fig. 72). In this embodiment, the promoter sequence is derived from rice (O.sativa) actin 1[ McElroy et al (plantaCell 2, 163-171; 1990); GenBank accession number S44221 and GenBank accession number X63830 ]. The herbicide tolerance Gene comprises the PAT (phosphinothricin acetyltransferase) Gene, which confers resistance to the herbicide Bialaphos (a modified version of the PAT coding region (GenBank accession M22827; Wohlleben et al Gene 70, 25-37; 1988), originally derived from Streptomyces viridochromogenes). Modifications to the initial sequence of the longest open reading frame of M22827 are substantial and include changes in codon usage patterns to optimize expression in plants. The protein encoded by the PAT open reading frame of pDAB3014 is identical to the protein encoded by the longest open reading frame of accession number M22827, except that valine is replaced with methionine as the first encoding amino acid and alanine is added as the second encoding amino acid. A rebuilt (rebuilt) version of PAT is found at GenBank accession No. I43995. The terminator sequence was derived from the maize (z. mays L.) lipase [ maize lipase cDNA clone of GenBank accession No. L35913, except that C at position 1093 of L35913 was replaced with G at position 2468 in pDAB 3014. The maize sequence contains a 3' untranslated region/transcription terminator region for the PAT gene ].
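The two N-terminal differences between the pDAB3014 PAT protein and the M22827 protein can be expressed as a simple string transformation, sketched below in Python. The input sequence in the example is a hypothetical placeholder and not the PAT sequence, and the function name is invented for illustration.

```python
def apply_n_terminal_changes(native_protein):
    """Replace the leading valine with methionine and add alanine as
    the second residue; the rest of the sequence is left unchanged."""
    if not native_protein.startswith("V"):
        raise ValueError("expected a sequence whose first residue is valine (V)")
    return "MA" + native_protein[1:]


if __name__ == "__main__":
    # Hypothetical placeholder sequence, used only to show the transformation.
    print(apply_n_terminal_changes("VKLTE"))
```

The modified protein is therefore one residue longer than the native protein and differs from it only at the first two positions.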

The following oligonucleotide primers were synthesized by Integrated DNA Technologies, Inc. (Coralville, IA) under standard desalting conditions and diluted with water to a concentration of 0.125 μg/μl:

5’-ACTAGTTAACTGACCTCACTCGAGGTCATTCATATGCTTGA-3’ (SEQ ID NO: 151)

5’-ACTAGTGTGAATTCAGCACTTAAAGATCT-3’ (SEQ ID NO: 152).

PCR amplification reactions were carried out using reagents supplied by TaKaRa Biotechnology Inc. (Seta 3-4-1, Otsu, Shiga, 520-2193, Japan) and consisted of the following: 5 μl of 10X LA PCR™ Buffer II (Mg²⁺), 20 ng of double-stranded template (pDAB3014 plasmid DNA), 10 pmol of forward oligonucleotide primer, 10 pmol of reverse oligonucleotide primer, 8 μl of dNTP mix (2.5 mM each), 33.5 μl of H₂O, 0.5 μl (2.5 units) of TaKaRa LA Taq™ DNA polymerase, and 1 drop of mineral oil. PCR reactions were performed using a Perkin-Elmer Cetus 48-sample DNA thermal cycler (Norwalk, CT) under the following cycling conditions: 94 ℃ for 4 minutes (1 cycle); 98 ℃ for 20 seconds, 55 ℃ for 1 minute, and 68 ℃ for 3 minutes (30 cycles); 72 ℃ for 5 minutes (1 cycle); 4 ℃ hold. 15 μl of each PCR reaction was electrophoresed at 100 V for 1 hour in a 1.0% TAE agarose gel supplemented with 0.5% ethidium bromide.
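The cycling conditions above can be written as a small data structure, which makes the programmed run time easy to check. The following is a minimal sketch, not part of the disclosed method; the names `PROGRAM` and `programmed_seconds` are illustrative, and the figure excludes ramp times and the open-ended 4 ℃ hold.

```python
# Thermal-cycling program from the text: each stage is (repeat count, steps),
# and each step is (temperature in deg C, hold time in seconds).
# The final 4 deg C hold is omitted because it has no fixed duration.
PROGRAM = [
    (1, [(94, 4 * 60)]),
    (30, [(98, 20), (55, 60), (68, 3 * 60)]),
    (1, [(72, 5 * 60)]),
]


def programmed_seconds(program):
    """Sum the programmed hold times; instrument ramp times are not included."""
    return sum(cycles * sum(sec for _, sec in steps) for cycles, steps in program)


total = programmed_seconds(PROGRAM)
print(f"{total} s of programmed hold time ({total / 60:.0f} min)")
```

Under these assumptions the program holds for roughly 2.3 hours before the 4 ℃ hold, most of it in the 3-minute extension step.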

The amplified fragments were visualized with UV light and fragment size was estimated by comparison with a 1 Kbp DNA ladder. The expected amplification products were identified by the presence of a 2.3 Kbp DNA fragment. This fragment was gel-excised and purified using the QIAquick Gel Extraction Kit (QIAGEN Inc., Valencia, CA) according to the manufacturer's instructions. The purified fragment was then cloned into the pCR2.1 plasmid using the TOPO TA kit and transformed into OneTOP10 chemically competent E. coli cells (Invitrogen Life Technologies, Carlsbad, Calif.) according to the manufacturer's protocol.

Individual colonies were inoculated into 14 ml Falcon tubes (Becton-Dickinson, Franklin Lakes, N.J.) containing 2 ml TB supplemented with 50 μl/ml kanamycin and incubated at 37 ℃ for 16 hours with shaking at 200 rpm. After incubation, 1.5 ml of cells were transferred to a 1.7 ml Costar microcentrifuge tube (Fisher Scientific, Pittsburgh, Pa.) and pelleted at 16,000 x g for 1 minute. The supernatant was removed and plasmid DNA was isolated as described above using the Plasmid kit (BD Biosciences/Clontech/Macherey-Nagel, Palo Alto, Calif.). μg of the isolated plasmid was digested with 10 units of Spe I and Not I. All plasmid digests were incubated at 37 ℃ for 1 hour.

The restricted DNA was electrophoresed at 100 V for 1 hour in a 1.0% TAE agarose gel supplemented with 0.5% ethidium bromide. The fragments were visualized with UV light and fragment size was estimated by comparison with a 1 Kbp DNA ladder. Expected plasmid clones were identified by the presence of an inserted DNA fragment of 2.325 Kbp in addition to the 3.9 Kbp pCR2.1 vector. Double-stranded sequencing reactions of the plasmid clones were performed as described by the manufacturer using the CEQ™ DTCS-Quick Start Kit (Beckman-Coulter, Palo Alto, Calif.). The reactions were purified using Performa DTR gel filtration cartridges (Edge BioSystems, Gaithersburg, Md.) as described in the manufacturer's protocol. Sequence reactions were analyzed on a Beckman-Coulter CEQ™ 2000 XL DNA analysis system, and nucleotide characterization was performed using Sequencher™ version 4.1.4 (Gene Codes Corporation, Ann Arbor, MI).

E. Insertion of the autonomous herbicide-tolerant gene cassette into the plasmid backbone: autonomous donor

To create the donor plasmids, the autonomous herbicide tolerance gene cassette described in example 18D was inserted into the plasmid backbone constructs described in examples 18B and 18C. Restriction fragments derived from clones containing the expected 2.325 Kbp sequence described above (FIG. 72) were gel-excised and purified using the QIAquick Gel Extraction Kit (QIAGEN Inc., Valencia, CA) according to the manufacturer's instructions.

This fragment was then combined in a ligation reaction with purified pDAB7471 (position-1 plasmid backbone, FIG. 70) or pDAB7451 (position-2 plasmid backbone, FIG. 71) that had been digested with the restriction enzyme Spe I and subsequently dephosphorylated. The ligation was carried out under the following conditions: vector:insert ratio and 500 units of T4 DNA ligase (Invitrogen Life Technologies, Carlsbad, Calif.) in a 20 μl reaction volume, incubated for 16 hours in a water bath at 16 ℃. Subsequently, 5 μl of the ligation reaction was transformed into 50 μl of E. coli MAXDH5α™ chemically competent cells (Invitrogen Life Technologies, Carlsbad, CA) and plated under the selection conditions described by the manufacturer.

Individual colonies were inoculated into 14 ml Falcon tubes (Becton-Dickinson, Franklin Lakes, NJ) containing 2 ml TB supplemented with 50 μl/ml chloramphenicol and incubated at 37 ℃ for 16 hours with shaking at 200 rpm. After incubation, 1.5 ml of cells were transferred to a 1.7 ml Costar microcentrifuge tube (Fisher Scientific, Pittsburgh, Pa.) and pelleted at 16,000 x g for 1 minute. The supernatant was removed and plasmid DNA was isolated as described above using the Plasmid kit (BD Biosciences/Clontech/Macherey-Nagel, Palo Alto, Calif.). μg of isolated plasmid DNA was digested with 10 units of Spe I (New England Biolabs, Beverly, Mass.) and incubated at 37 ℃ for 1 hour. The restricted DNA was electrophoresed at 100 V for 1 hour in a 1.0% TAE agarose gel supplemented with 0.5% ethidium bromide. The fragments were visualized with UV light and fragment size was estimated by comparison with a 1 Kbp DNA ladder. Expected plasmid clones were identified by the presence of DNA fragments of 2.325 Kbp and about 4.9 Kbp (pDAB7471 vector) or 2.325 Kbp and about 5.0 Kbp (pDAB7451 vector).
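The clone-screening logic used here (compare the bands seen on the gel with the expected fragment sizes) can be sketched as a simple tolerance check. This is an illustrative sketch only: the 10% tolerance, the function name `matches_expected` and the observed band sizes are assumptions, not values from the disclosure.

```python
def matches_expected(observed_kbp, expected_kbp, tolerance=0.10):
    """True if every expected band pairs with one observed band.

    Bands are paired in size order and must agree within +/- `tolerance`
    (a fraction of the expected size, assumed here to be 10%).
    """
    if len(observed_kbp) != len(expected_kbp):
        return False
    pairs = zip(sorted(observed_kbp), sorted(expected_kbp))
    return all(abs(obs - exp) <= tolerance * exp for obs, exp in pairs)


# Expected Spe I fragments stated in the text for each backbone.
EXPECTED = {
    "pDAB7471 backbone": [2.325, 4.9],
    "pDAB7451 backbone": [2.325, 5.0],
}

observed = [2.3, 5.0]  # hypothetical sizes read off a gel against the ladder
for name, expected in EXPECTED.items():
    print(name, "consistent" if matches_expected(observed, expected) else "not consistent")
```

Because the two backbones differ by only about 0.1 Kbp, a gel read at this resolution confirms the presence of the insert but cannot by itself tell the two backbones apart; the identity of the backbone follows from which ligation the colony came from.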

The resulting plasmids were designated pDAB7422 (position-1 autonomous donor) (FIG. 73) and pDAB7452 (position-2 autonomous donor) (FIG. 74), respectively.

F. Construction of non-autonomous herbicide-tolerant gene expression cassette

A non-autonomous herbicide-tolerant gene expression cassette comprising an incomplete promoter-transcription unit (PTU) was constructed (FIG. 75). In this embodiment, a strategy is used that exploits the functionality of the 2A sequence derived from the Thosea asigna virus (Mattion, N.M., Harnish, E.C., Crowley, J.C., and Reilly, P.A. (1996) J. Virol. 70, 8124-). In this embodiment, the 2A translation termination signal sequence has been engineered to be translated in frame with the herbicide tolerance gene. In addition, the 2A/herbicide coding sequence has been engineered to be in frame with the translational reading frame of the IPP2K gene target. The herbicide tolerance gene comprises the PAT (phosphinothricin acetyltransferase) gene, which confers resistance to the herbicide bialaphos; it is a modified version of the PAT coding region originally derived from S. viridochromogenes (GenBank accession M22827; Wohlleben et al., Gene 70, 25-37; 1988). Modifications to the original sequence of the longest open reading frame of M22827 are substantial and include changes in the codon usage pattern to optimize expression in plants. The protein encoded by the PAT open reading frame of pDAB3014 is identical to the protein encoded by the longest open reading frame of M22827 (which begins with GTG at position 244 of M22827), except that valine is replaced with methionine as the first encoded amino acid and alanine is added as the second amino acid. A rebuilt version of PAT is found under GenBank accession No. I43995. The terminator sequence is derived from maize lipase [maize lipase cDNA clone of GenBank accession number L35913, except that the C at position 1093 of L35913 is replaced with a G at position 2468 in pDAB3014]. This maize sequence contains the 3' untranslated region/transcription terminator region for the PAT gene.

5’-ACTAGTGGCGGCGGAGAGGGCAGAGGAAGTCTTCTAACATGCGGTGACGTGGAGGAGAATCCCGGCCCTAGGATGGCTTCTCCGGAGAGGAGACCAGTTGA-3’ (SEQ ID NO: 153)

5’-ACTAGTATGCATGTGAATTCAGCACTTAAAGATCT-3’ (SEQ ID NO: 154).
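The in-frame design described for the 2A/PAT fusion can be checked directly on the forward primer. The sketch below translates the primer with the standard genetic code, starting immediately after the leading Spe I site (ACTAGT). That starting frame is an assumption made for illustration; it was chosen because it gives an uninterrupted reading frame. The names `CODON_TABLE` and `translate` are likewise illustrative.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Forward primer exactly as listed in the text for the non-autonomous cassette.
FORWARD_PRIMER = "ACTAGTGGCGGCGGAGAGGGCAGAGGAAGTCTTCTAACATGCGGTGACGTGGAGGAGAATCCCGGCCCTAGGATGGCTTCTCCGGAGAGGAGACCAGTTGA"


def translate(dna):
    """Translate complete codons from the first base; trailing bases are ignored."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


SPE_I_SITE = "ACTAGT"
assert FORWARD_PRIMER.startswith(SPE_I_SITE)
peptide = translate(FORWARD_PRIMER[len(SPE_I_SITE):])
print(peptide)
```

In this frame the primer encodes a peptide with no stop codon that contains the stretch ending in NPGP, followed by Met-Ala-Ser-Pro. This is consistent with the statement above that the PAT reading frame begins with methionine and carries an added alanine as its second amino acid.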

PCR amplification reactions were carried out using reagents supplied by TaKaRa Biotechnology Inc. (Seta 3-4-1, Otsu, Shiga, 520-2193, Japan) and consisted of the following: 5 μl of 10X LA PCR™ Buffer II (Mg²⁺), 20 ng of double-stranded template (pDAB3014 plasmid DNA), 10 pmol of forward oligonucleotide primer, 10 pmol of reverse oligonucleotide primer, 8 μl of dNTP mix (2.5 mM each), 33.5 μl of H₂O, 0.5 μl (2.5 units) of TaKaRa LA Taq™ DNA polymerase, and 1 drop of mineral oil. PCR reactions were performed using a Perkin-Elmer Cetus 48-sample DNA thermal cycler (Norwalk, CT) under the following cycling conditions: 94 ℃ for 4 minutes (1 cycle); 98 ℃ for 20 seconds, 55 ℃ for 1 minute, and 68 ℃ for 2 minutes (30 cycles); 72 ℃ for 5 minutes (1 cycle); 4 ℃ hold. 15 μl of each PCR reaction was electrophoresed at 100 V for 1 hour in a 1.0% TAE agarose gel supplemented with 0.5% ethidium bromide. The amplified fragments were visualized with UV light and fragment size was estimated by comparison with a 1 Kbp DNA ladder. The expected amplification products were identified by the presence of a DNA fragment of about 1 Kbp. This fragment was gel-excised and purified using the QIAquick Gel Extraction Kit (QIAGEN Inc., Valencia, CA) according to the manufacturer's instructions. The purified fragment was then cloned into the pCR2.1 plasmid using the TOPO TA kit and transformed into OneTOP10 chemically competent E. coli cells (Invitrogen Life Technologies, Carlsbad, Calif.) according to the manufacturer's protocol.

Individual colonies were inoculated into 14 ml Falcon tubes (Becton-Dickinson, Franklin Lakes, N.J.) containing 2 ml TB supplemented with 50 μl/ml kanamycin and incubated at 37 ℃ for 16 hours with shaking at 200 rpm. After incubation, 1.5 ml of cells were transferred to a 1.7 ml Costar microcentrifuge tube (Fisher Scientific, Pittsburgh, Pa.) and pelleted at 16,000 x g for 1 minute. The supernatant was removed and plasmid DNA was isolated as described above using the Plasmid kit (BD Biosciences/Clontech/Macherey-Nagel, Palo Alto, Calif.). μg of the isolated plasmid was digested with 10 units of Spe I. All plasmid digests were incubated at 37 ℃ for 1 hour. The restricted DNA was electrophoresed at 100 V for 1 hour in a 1.0% TAE agarose gel supplemented with 0.5% ethidium bromide. The fragments were visualized with UV light and fragment size was estimated by comparison with a 1 Kbp DNA ladder. Expected plasmid clones were identified by the presence of an inserted DNA fragment of about 1.0 Kbp in addition to the 3.9 Kbp pCR2.1 vector. Double-stranded sequencing reactions of the plasmid clones were performed as described by the manufacturer using the CEQ™ DTCS-Quick Start Kit (Beckman-Coulter, Palo Alto, Calif.). The reactions were purified using Performa DTR gel filtration cartridges (Edge BioSystems, Gaithersburg, MD) as described in the manufacturer's protocol. Sequence reactions were analyzed on a Beckman-Coulter CEQ™ 2000 XL DNA analysis system, and nucleotide characterization was performed using Sequencher™ version 4.1.4 (Gene Codes Corporation, Ann Arbor, MI).

G. Insertion of the non-autonomous herbicide-tolerant gene cassette into the plasmid backbone: non-autonomous donor

To create the donor plasmids, the non-autonomous herbicide-tolerant gene cassette described in example 18F was inserted into the plasmid backbone constructs described in examples 18B and 18C. Restriction fragments corresponding to clones containing the correct 1 Kbp sequence were gel-excised and purified using the QIAquick Gel Extraction Kit (QIAGEN Inc., Valencia, CA) according to the manufacturer's instructions. This fragment was then combined in a ligation reaction with purified pDAB7471 (position-1 plasmid backbone) (FIG. 70) or pDAB7451 (position-2 plasmid backbone) (FIG. 71) that had been digested with the restriction enzyme Spe I and subsequently dephosphorylated. The ligation was carried out under the following conditions: a 1:5 vector:insert ratio and 500 units of T4 DNA ligase (Invitrogen Life Technologies, Carlsbad, Calif.) in a 20 μl reaction volume, incubated in a water bath at 16 ℃ for 16 hours. Subsequently, 5 μl of the ligation reaction was transformed into 50 μl of E. coli MAXDH5α™ chemically competent cells (Invitrogen Life Technologies, Carlsbad, Calif.) and plated under the selection conditions described by the manufacturer.
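If the 1:5 vector:insert ratio is read as a molar ratio (an assumption; the text does not say whether the ratio is by mass or by moles), the insert mass needed for a given amount of vector follows from the fragment lengths. The sketch below uses the 1.0 Kbp insert with the backbone sizes of 4.96 Kbp and about 5.0 Kbp reported for the diagnostic digest in this section. The 50 ng vector input and the function name `insert_mass_ng` are hypothetical.

```python
def insert_mass_ng(vector_ng, vector_kbp, insert_kbp, insert_per_vector=5.0):
    """Insert mass that gives `insert_per_vector` insert molecules per vector molecule.

    Moles scale as mass / length, so the required mass ratio equals the
    molar ratio multiplied by the length ratio.
    """
    return vector_ng * insert_per_vector * insert_kbp / vector_kbp


VECTOR_NG = 50.0  # hypothetical vector input; not stated in the text
for name, vector_kbp in [("pDAB7471", 4.96), ("pDAB7451", 5.0)]:
    mass = insert_mass_ng(VECTOR_NG, vector_kbp, insert_kbp=1.0)
    print(f"{name}: {mass:.1f} ng of the 1.0 Kbp insert per {VECTOR_NG:.0f} ng vector")
```

Because the insert is about one fifth the length of either backbone, a fivefold molar excess of insert works out to roughly equal masses of insert and vector.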

Individual colonies were inoculated into 14 ml Falcon tubes (Becton-Dickinson, Franklin Lakes, NJ) containing 2 ml TB supplemented with 50 μl/ml chloramphenicol and incubated at 37 ℃ for 16 hours with shaking at 200 rpm. After incubation, 1.5 ml of cells were transferred to a 1.7 ml Costar microcentrifuge tube (Fisher Scientific, Pittsburgh, Pa.) and pelleted at 16,000 x g for 1 minute. The supernatant was removed and plasmid DNA was isolated as described above using the Plasmid kit (BD Biosciences/Clontech/Macherey-Nagel, Palo Alto, Calif.). μg of isolated plasmid DNA was digested with 10 units of Spe I (New England Biolabs, Beverly, Mass.) and incubated at 37 ℃ for 1 hour. The restricted DNA was electrophoresed at 100 V for 1 hour in a 1.0% TAE agarose gel supplemented with 0.5% ethidium bromide. The fragments were visualized with UV light and fragment size was estimated by comparison with a 1 Kbp DNA ladder. Expected plasmid clones were identified by the presence of DNA fragments of 1.0 Kbp and 4.96 Kbp (pDAB7471 vector) or 1.0 Kbp and about 5.0 Kbp (pDAB7451 vector). The resulting plasmids were designated pDAB7423 (position-1 non-autonomous donor) (FIG. 76) and pDAB7454 (position-2 non-autonomous donor) (FIG. 77), respectively.

H. Position-1 ZFN + HR donor sequence: combination plasmid

As an alternative to delivering two separate plasmids into a plant cell (e.g., one plasmid containing the ZFN elements and another containing the herbicide-tolerant donor sequence), single plasmids were engineered to contain all of the necessary elements exemplified in this patent. The combination plasmids described in this example comprise the ZFNs designed to target a particular IPP2K locus and generate a double-strand break there, together with the autonomous PAT PTU and/or the non-autonomous 2A/PAT PTU and the donor flanks designed to integrate into those break sites.

Using a technology based on lambda phage site-specific recombination (Landy, A. (1989) Ann. Rev. Biochem. 58: 913), the vectors pDAB7422 and pDAB7423 (described in examples 6E and 6G) were converted into destination vectors. Once converted, plasmids containing the ZFN expression cassette (housed in an entry vector) can be mobilized into the destination vector, creating a ZFN/donor combination plasmid. μg of each of these plasmids was digested with 10 units of Not I (New England Biolabs, Beverly, Mass.) at 37 ℃ for 1 hour. The Not I restriction endonuclease was heat-inactivated at 65 ℃ for 15 minutes, and the fragment ends were then dephosphorylated at 37 ℃ for 1 hour using 3 units of shrimp alkaline phosphatase (SAP) (Roche Diagnostics GmbH, Mannheim, Germany). The restricted DNA was electrophoresed at 100 V for 1 hour in a 1.0% TAE agarose gel supplemented with 0.5% ethidium bromide. The vector fragments (pDAB7422 = 7.317 Kbp, pDAB7423 = 5.971 Kbp) were visualized with UV light, sized by comparison with a 1 Kbp DNA ladder, gel-excised, and subsequently purified using the QIAquick Gel Extraction Kit (QIAGEN Inc., Valencia, CA) according to the manufacturer's instructions.

The vector fragments were then combined with a fragment comprising the recombination elements attR1, ccdB, CmR, and attR2 in a ligation reaction carried out under the following conditions: a 1:5 vector:insert ratio and 500 units of T4 DNA ligase (Invitrogen Life Technologies, Carlsbad, Calif.) in a 20 μl reaction volume, incubated in a water bath at 16 ℃ for 16 hours. Subsequently, 5 μl of the ligation reaction was transformed into 50 μl of E. coli OneccdB Survival™ chemically competent cells (Invitrogen Life Technologies, Carlsbad, Calif.) and plated under the selection conditions described by the manufacturer.

Individual colonies were inoculated into 14 ml Falcon tubes (Becton-Dickinson, Franklin Lakes, NJ) containing 2 ml TB supplemented with 50 μl/ml chloramphenicol and incubated at 37 ℃ for 16 hours with shaking at 200 rpm. After incubation, 1.5 ml of cells were transferred to a 1.7 ml Costar microcentrifuge tube (Fisher Scientific, Pittsburgh, Pa.) and pelleted at 16,000 x g for 1 minute. The supernatant was removed and plasmid DNA was isolated as described above using the Plasmid kit (BD Biosciences/Clontech/Macherey-Nagel, Palo Alto, Calif.). μg of the isolated plasmid DNA was digested with 10 units of EcoRII (New England BioLabs, Inc., Beverly, Mass.) and incubated at 37 ℃ for 1 hour. The restricted DNA was electrophoresed at 100 V for 1 hour in a 1.0% TAE agarose gel supplemented with 0.5% ethidium bromide. The fragments were visualized with UV light and fragment size was estimated by comparison with a 1 Kbp DNA ladder. Expected plasmid clones were identified by the presence of DNA fragments of 1.448 Kbp, 1.946 Kbp, and 6.197 Kbp (autonomous PAT PTU position-1 HR donor) or 5.807 Kbp and 2.438 Kbp (non-autonomous PAT position-1 HR donor). The resulting plasmids were designated pDAB7424 (adapted position-1 autonomous donor) (FIG. 78) and pDAB7425 (adapted position-1 non-autonomous donor) (FIG. 79).

As a result of these cloning procedures, plasmids pDAB7424 and pDAB7425 were designated as destination vectors. pDAB7412 functions as the entry vector and contains the following elements: ZmUbi1v.Z/ZFN12/Zm Per5 3' UTR. To transfer the ZFN expression cassette (entry vector) into either the autonomous or the non-autonomous donor molecule (destination vector), an LR Clonase™ II (Invitrogen Life Technologies, Carlsbad, CA) reaction was performed at a ratio of 50 ng (entry vector) : 150 ng/μl (destination vector), as outlined by the manufacturer. The resulting positive combination vectors were named pDAB7426 (position-1 autonomous HR donor/ZFN12) (FIG. 80) and pDAB7427 (non-autonomous HR donor/ZFN12) (FIG. 81).

Example 19: delivery of ZFNs and Donor DNA into plant cells

To achieve ZFN-mediated integration of the donor DNA into the plant genome via targeted integration, it will be appreciated that DNA encoding the ZFN needs to be delivered, followed by expression of a functional ZFN protein in the plant cell. Also needed is concomitant delivery of the donor DNA to the plant cell such that the functional ZFN protein can induce a double-strand break on the target DNA, which is then repaired via homology-driven integration of the donor DNA into the target locus. The skilled artisan understands that expression of a functional ZFN protein can be achieved by several methods, including but not limited to gene transfer of the ZFN-encoding construct, or transient expression of the ZFN-encoding construct. In both cases, expression of functional ZFN protein and delivery of donor DNA are achieved simultaneously in plant cells to drive targeted integration.

In the examples cited herein, we demonstrate methods for the concomitant delivery of ZFN-encoding DNA and donor DNA into plant cells. One skilled in the art may use any of a variety of DNA delivery methods suitable for plant cells, including but not limited to Agrobacterium-mediated transformation, biolistic-based DNA delivery, or Whiskers™-mediated DNA delivery. In one embodiment described herein, Whiskers™-mediated DNA delivery experiments were carried out using various combinations of donor DNA and ZFN-encoding DNA constructs. These combinations include 1) a single plasmid containing both the ZFN coding sequence and the donor DNA and 2) two different plasmids, one containing the ZFN coding sequence and the other containing the donor DNA. In another embodiment, biolistic-based DNA delivery was carried out using various combinations of donor DNA and ZFN-encoding DNA constructs. Those skilled in the art will appreciate that these combinations can include 1) a single plasmid containing both the ZFN coding sequence and the donor DNA and 2) two different plasmids, one containing the ZFN coding sequence and the other containing the donor DNA.

A. Whiskers™-mediated DNA delivery

As described earlier herein, embryogenic Hi-II cell cultures of maize were produced and used as the source of living plant cells in which targeted integration is demonstrated. One skilled in the art can use cell cultures derived from a variety of plant species, or differentiated plant tissues derived from a variety of plant species, as the source of living plant cells in which targeted integration is demonstrated.

In this example, 12 ml of PCV from a previously cryopreserved cell line plus 28 ml of conditioned medium was subcultured into 80 ml of GN6 liquid medium (N6 medium (Chu et al., 1975), 2.0 mg/L 2,4-D, 30 g/L sucrose, pH 5.8) in a 500 ml Erlenmeyer flask and placed on a shaker (125 rpm, 28 ℃). This step was repeated twice using the same cell line, so that a total of 36 ml PCV was distributed across 3 flasks. After 24 hours, the GN6 liquid medium was removed and replaced with 72 ml of GN6S/M osmotic medium (N6 medium, 2.0 mg/L 2,4-D, 30 g/L sucrose, 45.5 g/L sorbitol, 45.5 g/L mannitol, 100 mg/L myo-inositol, pH 6.0). The flasks were incubated in the dark at 28 ℃ for 30-35 minutes with moderate agitation (125 rpm). During the incubation period, a 50 mg/ml suspension of silicon carbide whiskers was prepared by adding 8.1 ml of GN6S/M liquid medium to 405 mg of sterile silicon carbide whiskers (Advanced Composite Materials, LLC, Greer, SC).
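The quantities in this step are internally consistent, which can be confirmed with a short calculation. The snippet below is only an arithmetic check of the figures stated above; the function name `suspension_concentration` is illustrative.

```python
def suspension_concentration(mass_mg, volume_ml):
    """Concentration in mg/ml of a solid suspended in a given liquid volume."""
    return mass_mg / volume_ml


# 405 mg of whiskers suspended in 8.1 ml of GN6S/M liquid medium.
whisker_mg_per_ml = suspension_concentration(405.0, 8.1)

# Three flasks in total, each started from 12 ml PCV.
total_pcv_ml = 3 * 12

print(f"whisker suspension: {whisker_mg_per_ml:.1f} mg/ml")
print(f"total PCV across flasks: {total_pcv_ml} ml")
```

Both results agree with the stated 50 mg/ml suspension and the 36 ml total PCV.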

After incubation in GN6S/M osmotic medium, the contents of each flask were pooled into a 250 ml centrifuge bottle. Once all cells had settled to the bottom, the liquid in excess of approximately 14 ml of GN6S/M was drawn off and collected in a sterile 1 L flask for later use. The pre-wetted whisker suspension was vortexed at maximum speed for 60 seconds and then added to the centrifuge bottle.

In one example, in which a single plasmid containing both the ZFN coding sequence and the donor DNA was delivered into plant cells, 170 μg of purified circular plasmid DNA was added to the bottle. In an alternative example, in which two different plasmids were co-delivered (one containing the ZFN coding sequence and the other containing the donor DNA), several strategies for the amount of DNA were evaluated. One strategy used 85 μg of donor DNA and 85 μg of zinc finger-encoding DNA. Other variations used a molar ratio of 10, 5, or 1 parts donor DNA to 1 part zinc finger DNA, calculated from the size of each plasmid (in kilobase pairs), such that a total of 170 μg of DNA was added per bottle. In all cases of co-delivery, the DNA was pre-cooled in tubes before being added to the centrifuge bottle. Once the DNA was added, the bottle was immediately placed in a modified Red Devil 5400 commercial paint mixer (Red Devil Equipment Co., Plymouth, MN) and agitated for 10 seconds. After agitation, the mixture of cells, medium, whiskers and DNA was added to the contents of the 1 L flask along with 125 ml of fresh GN6 liquid medium to reduce the osmoticum. The cells were allowed to recover for 2 hours on a shaker set at 125 rpm. 6 ml of the dispersed suspension was filtered onto Whatman #4 filter paper (5.5 cm) using a glass cell collector unit connected to a house vacuum line, such that 60 filters were obtained per bottle. The filters were placed onto 60 x 20 mm plates of GN6 solid medium (identical to GN6 liquid medium except for the addition of 2.5 g/L Gelrite gelling agent) and cultured at 28 ℃ in the dark for 1 week.
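The size-based molar ratios described above can be turned into microgram amounts with a short calculation. The following is a minimal sketch: the plasmid sizes of 7.0 Kbp and 9.0 Kbp are hypothetical placeholders (the text does not give the sizes used), and `split_mass_by_molar_ratio` is an illustrative name.

```python
def split_mass_by_molar_ratio(total_ug, donor_kbp, zfn_kbp, donor_per_zfn):
    """Split a fixed DNA mass between donor and ZFN plasmids.

    For double-stranded DNA, mass is proportional to moles x length, so the
    donor's share of the mass is weighted by `donor_per_zfn * donor_kbp`.
    Returns (donor_ug, zfn_ug).
    """
    donor_weight = donor_per_zfn * donor_kbp
    donor_ug = total_ug * donor_weight / (donor_weight + zfn_kbp)
    return donor_ug, total_ug - donor_ug


TOTAL_UG = 170.0   # total DNA per bottle, from the text
DONOR_KBP = 7.0    # hypothetical donor plasmid size
ZFN_KBP = 9.0      # hypothetical ZFN plasmid size

for ratio in (10, 5, 1):
    donor_ug, zfn_ug = split_mass_by_molar_ratio(TOTAL_UG, DONOR_KBP, ZFN_KBP, ratio)
    print(f"{ratio}:1 donor:ZFN -> {donor_ug:.1f} ug donor + {zfn_ug:.1f} ug ZFN")
```

Note that the 85 μg/85 μg strategy is an equal-mass split, which corresponds to a 1:1 molar ratio only when the two plasmids are the same size.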

B. Biolistic-mediated DNA delivery

In the examples cited herein, embryogenic suspensions of maize were subcultured into GN6 liquid medium approximately 24 hours before experimentation, as described earlier herein. Excess liquid medium was removed and approximately 0.4 PCV of cells was spread thinly in a circle 2.5 cm in diameter at the center of a 100 x 15 mm Petri dish containing GN6S/M medium solidified with 2.5 g/L Gelrite. The cells were cultured in the dark for 4 hours. To coat the biolistic particles with DNA, 3 mg of 1.0 micron diameter gold particles were washed once with 100% ethanol and twice with sterile distilled water, and resuspended in 50 μl of water in a siliconized Eppendorf tube. A total of 5 μg of plasmid DNA, 20 μl of spermidine (0.1 M) and 50 μl of calcium chloride (2.5 M) were added separately to the gold suspension and mixed by vortexing. The mixture was incubated at room temperature for 10 minutes, pelleted at 10,000 rpm in a benchtop microcentrifuge for 10 seconds, and resuspended in 60 μl of cold 100% ethanol; 8-9 μl was then distributed onto each macrocarrier.

Bombardment was performed using the Biolistic PDS-1000/He™ system (Bio-Rad Laboratories, Hercules, Calif.). Plates containing the cells were placed on the middle shelf and bombarded at 1100 psi under 27 inches of Hg vacuum, following the operating manual. Sixteen hours after bombardment, the tissue was transferred in small clumps to GN6 (1H) medium and cultured at 28 ℃ in the dark for 2-3 weeks. Transfers continued every 2-4 weeks until putative transgenic isolates resulting from integration of the donor DNA appeared. The identification, isolation and regeneration of putative donor DNA integration events generated via biolistic-mediated DNA delivery is identical to the process used for putative donor DNA integration events generated via Whiskers™-mediated DNA delivery, and is described below.

C. Identification and isolation of putative Targeted integration transgenic events

One week after DNA delivery, the filters were transferred to 60 x 20 mm plates of GN6 (1H) selection medium (N6 medium, 2.0 mg/L 2,4-D, 30 g/L sucrose, 100 mg/L myo-inositol, 1.0 mg/L bialaphos from Herbiace (Meiji Seika, Japan), 2.5 g/L Gelrite, pH 5.8). These selection plates were incubated at 28 ℃ in the dark for 1 week.

After 1 week of selection in the dark, the tissue was embedded onto fresh medium by scraping half of the cells from each plate into a tube containing 3.0 ml of GN6 agarose medium held at 37-38 ℃ (N6 medium, 2.0 mg/L 2,4-D, 30 g/L sucrose, 100 mg/L myo-inositol, 7 g/L agarose, pH 5.8, autoclaved for only 10 minutes at 121 ℃) and 1 mg/L bialaphos from Herbiace.

The agarose/tissue mixture was broken up with a spatula, and 3 ml of the agarose/tissue mixture was then poured evenly onto the surface of a 100 x 15 mm Petri dish containing GN6 (1H) medium. This process was repeated for both halves of each plate. Once all the tissue was embedded, each plate was sealed and cultured at 28 ℃ in the dark for up to 10 weeks. Putatively transformed isolates that grew under these selection conditions were removed from the embedded plates and transferred to fresh selection medium in 60 x 20 mm plates. If sustained growth was evident after approximately 2 weeks, the event was deemed resistant to the applied herbicide (bialaphos), and an aliquot of cells was subsequently harvested into a 2 ml Eppendorf tube for genotype analysis.

One skilled in the art can utilize a gene in the donor DNA encoding any suitable selectable marker and apply comparable selection conditions to living cells. For example, an alternative selectable marker gene (such as AAD-1, as described in WO 2005/107437A 2) can be implemented as a donor for selection and recovery of integration events in maize cells as described herein.

Example 20: screening for targeted integration events via PCR genotyping

In this example, PCR genotyping is understood to include, but is not limited to, Polymerase Chain Reaction (PCR) amplification of genomic DNA derived from isolated corn callus (which is predicted to contain donor DNA embedded in the genome), followed by standard cloning and sequence analysis of the PCR amplification products. Methods for PCR genotyping are well described (e.g., Rios, G. et al (2002) Plant J.32: 243-253) and can be applied to genomic DNA derived from any Plant species or tissue type, including cell cultures.

One skilled in the art can design strategies for PCR genotyping that include, but are not limited to, amplification of a particular sequence in a plant genome, amplification of multiple particular sequences in a plant genome, amplification of non-particular sequences in a plant genome, or a combination thereof. Amplification may be followed by cloning and sequencing (as described in this example), or by direct sequence analysis of the amplified product. Those skilled in the art are aware of alternative methods that can be used to analyze the amplified fragments produced herein.

In one embodiment described herein, oligonucleotide primers specific for a gene target are employed in the PCR amplification. In another embodiment described herein, oligonucleotide primers specific for the donor DNA sequence are used in the PCR amplification. Another embodiment includes a combination of oligonucleotide primers that bind to both the gene target sequence and the donor DNA sequence. One skilled in the art can design additional primer combinations and amplification reactions for interrogating the genome.

A. Genomic DNA extraction

Genomic DNA (gDNA) was extracted from isolated, herbicide-tolerant corn cells as described in example 19 and used as a template for PCR genotyping experiments. According to the manufacturer's protocol detailed in the 96 plant kit (QIAGEN Inc., Valencia, CA), gDNA was extracted from approximately 100-. The genomic DNA was eluted in 100 μl of the elution buffer supplied with the kit, yielding a final concentration of 20-200 ng/μl, and subsequently analyzed via the PCR-based genotyping methods outlined below.

B. Primer design for PCR genotyping

One skilled in the art can use a variety of strategies for designing and performing PCR-based genotyping. Oligonucleotide primers designed to anneal to the gene target, to the donor DNA sequence, and/or to a combination of the two are feasible. To design oligonucleotide primers that anneal to the IPP2K gene target in regions not encompassed by the homology flanks built into the donor DNA molecule, plasmid clones containing additional gene target sequence data were characterized by DNA sequencing. Double-stranded sequencing reactions of the plasmid clones were performed as described by the manufacturer using the CEQ™ DTCS-Quick Start Kit (Beckman-Coulter, Palo Alto, Calif.). The reactions were purified using Performa DTR gel filtration cartridges (Edge BioSystems, Gaithersburg, Md.) as described in the manufacturer's protocol. Sequence reactions were analyzed on a Beckman-Coulter CEQ™ 2000 XL DNA analysis system, and nucleotide characterization was performed using Sequencher™ version 4.1.4 (Gene Codes Corporation, Ann Arbor, MI). These sequences correspond to regions of the IPP2K gene upstream (5'-) and downstream (3'-) of the ZFN-targeted region and are depicted in FIG. 91 (SEQ ID NO: 141) and FIG. 92 (SEQ ID NO: 142).

In the examples presented here, all oligonucleotide primers were synthesized by Integrated DNA Technologies, Inc. (Coralville, IA) under standard desalting conditions and diluted with water to a concentration of 100 μM. The following set of forward and reverse oligonucleotide primers was designed to anneal to gDNA sequences specific for the IPP2K gene target that lie outside the boundaries of the donor DNA sequence. These oligonucleotides are as follows:

5’-TGGACGGAGCGAGAGCCAGAATTCGACGCTG-3’ (SEQ ID NO: 153)

5’-GTGCAAGAATGTATTGGGAATCAACCTGATG-3’ (SEQ ID NO: 154).

A second set of forward and reverse oligonucleotide primers was also designed to anneal to gDNA sequences specific for the IPP2K gene target that lie outside the boundaries of the donor DNA sequence, but nested within the first pair:

5’-CTGTGGTACCAGTACTAGTACCAGCATC-3’ (SEQ ID NO: 155)

5’-TCTTGGATCAAGGCATCAAGCATTCCAATCT-3’ (SEQ ID NO: 156).

Forward and reverse oligonucleotide primers were additionally designed to anneal specifically to donor DNA corresponding to the coding region of the herbicide tolerance gene:

5’-TGGGTAACTGGCCTAACTGG-3’(SEQ ID NO：157)

5’-TGGAAGGCTAGGAACGCTTA-3’(SEQ ID NO：158)

5’-CCAGTTAGGCCAGTTACCCA-3’(SEQ ID NO：159)

5’TAAGCGTTCCTAGCCTTCCA-3’(SEQ ID NO：160)。
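The four donor-specific primers listed above appear to form two complementary pairs: SEQ ID NO: 159 reads as the reverse complement of SEQ ID NO: 157, and SEQ ID NO: 160 as the reverse complement of SEQ ID NO: 158, which would allow each donor site to be primed in either orientation. The following Python sketch checks that relationship. The helper names and the dictionary layout are illustrative assumptions and are not part of the disclosure.

```python
# Illustrative check only; function and variable names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Donor-specific primer sequences, keyed by SEQ ID NO, copied from the text.
DONOR_PRIMERS = {
    157: "TGGGTAACTGGCCTAACTGG",
    158: "TGGAAGGCTAGGAACGCTTA",
    159: "CCAGTTAGGCCAGTTACCCA",
    160: "TAAGCGTTCCTAGCCTTCCA",
}


def is_reverse_complement_pair(a: int, b: int) -> bool:
    """True if primer b is the reverse complement of primer a."""
    return reverse_complement(DONOR_PRIMERS[a]) == DONOR_PRIMERS[b]


if __name__ == "__main__":
    for a, b in [(157, 159), (158, 160)]:
        print(f"SEQ ID NO: {a} / {b}:", is_reverse_complement_pair(a, b))
```

The same helper can be applied to any primer pair to confirm orientation before a reaction is set up.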

C. Donor DNA-specific PCR amplification

A primary PCR amplification reaction was performed using reagents supplied by TaKaRa Biotechnology Inc. (Seta 3-4-1, Otsu, Shiga, 520-2193, Japan) and consisted of: 2.5 μl 10X Ex Taq PCR™ buffer, 40-200 ng double-stranded genomic DNA template, 10 μM forward oligonucleotide primer, 10 μM reverse oligonucleotide primer, 2 μl dNTP mix (2.5 mM each), 16 μl H₂O, and 0.5 μl (2.5 units) Ex Taq™ DNA polymerase. PCR reactions were performed using a Bio-Rad 96-sample DNA Engine Tetrad 2 Peltier thermal cycler (Hercules, Calif.) under the following cycling conditions: 94 °C for 3 minutes (1 cycle); 94 °C for 30 seconds, 64 °C for 30 seconds, and 72 °C for 5 minutes (35 cycles); 72 °C for 10 minutes (1 cycle); 4 °C hold.
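As a planning aid, the primary cycling program described above can be written down as data and its programmed hold time summed. The Python sketch below does only that arithmetic. The `Stage` class and function name are hypothetical, and the total excludes ramp times between temperatures and the final 4 °C hold, so the actual run time on an instrument would be longer.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Stage:
    """One block of a cycling program (hypothetical representation)."""
    steps: Tuple[Tuple[int, int], ...]  # (temperature in deg C, hold in seconds)
    cycles: int


# Primary PCR program as stated in the text (final 4 deg C hold omitted).
PRIMARY_PROGRAM = (
    Stage(steps=((94, 180),), cycles=1),
    Stage(steps=((94, 30), (64, 30), (72, 300)), cycles=35),
    Stage(steps=((72, 600),), cycles=1),
)


def programmed_hold_seconds(program) -> int:
    """Sum of all programmed hold times, ignoring ramping."""
    return sum(
        stage.cycles * sum(seconds for _, seconds in stage.steps)
        for stage in program
    )


if __name__ == "__main__":
    total = programmed_hold_seconds(PRIMARY_PROGRAM)
    print(f"{total} s = {total / 60:.0f} min of programmed holds")
```

For this program the sum is 180 + 35 × 360 + 600 = 13,380 seconds, or 223 minutes.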

The amplification product of the primary PCR reaction was then re-amplified in a secondary PCR reaction consisting of: 2.5 μl 10X Ex Taq PCR™ buffer, 2 μl template (1:100 dilution of the primary PCR reaction in H₂O), 10 μM forward oligonucleotide primer, 10 μM reverse oligonucleotide primer, 2 μl dNTP mix (2.5 mM each), 16 μl H₂O, and 0.5 μl (2.5 units) Ex Taq™ DNA polymerase. PCR reactions were performed using a Bio-Rad 96-sample DNA Engine Tetrad 2 Peltier thermal cycler (Hercules, Calif.) under the following cycling conditions: 95 °C for 1 minute (1 cycle); 94 °C for 15 seconds, 61 °C for 30 seconds, and 72 °C for 30 seconds (30 cycles); 72 °C for 1 minute (1 cycle); 4 °C hold. 10 μl of each amplification product was electrophoresed at 100 V for 1 hour in a 1.0% TAE agarose gel supplemented with 0.5% ethidium bromide. The amplified fragments were visualized with UV light, and fragment size was estimated by comparison with a 1 Kbp Plus DNA ladder (Invitrogen Life Technologies, Carlsbad, Calif.). PCR products containing the donor DNA fragment were identified by the presence of a 0.317 Kbp DNA fragment, as shown in FIG. 82.

Example 21: detection of targeted integration events

Among the herbicide-tolerant events containing an integrated donor DNA molecule encoding an herbicide tolerance gene cassette, a proportion are the product of targeted integration of the donor DNA into the site of the ZFN-induced double-strand break. To distinguish these targeted integration events from those derived from random integration of the herbicide tolerance gene cassette, a PCR-based genotyping strategy was used that combines genome-specific PCR primers with subsequent genome-specific plus donor-specific PCR primers.

A. Genome-specific amplification and subsequent genome/donor-specific amplification

In this embodiment, the primary PCR reaction utilizes oligonucleotide primers specific for the regions of the IPP2K gene target upstream and downstream of the donor integration region (e.g., FIGS. 92 and 93). A primary PCR amplification reaction was performed using reagents supplied by TaKaRa Biotechnology Inc. (Seta 3-4-1, Otsu, Shiga, 520-2193, Japan) and consisted of: 2.5 μl 10X Ex Taq PCR™ buffer, 40-200 ng double-stranded maize gDNA template, 10 μM forward oligonucleotide primer, 10 μM reverse oligonucleotide primer, 2 μl dNTP mix (2.5 mM each), 16 μl H₂O, and 0.5 μl (2.5 units) Ex Taq™ DNA polymerase. PCR reactions were performed using a Bio-Rad 96-sample DNA Engine Tetrad 2 Peltier thermal cycler (Hercules, Calif.) under the following cycling conditions: 94 °C for 3 minutes (1 cycle); 94 °C for 30 seconds, 64 °C for 30 seconds, and 72 °C for 5 minutes (35 cycles); 72 °C for 10 minutes (1 cycle); 4 °C hold.

The primary PCR reaction product was then diluted 1:100 in H₂O and used as template DNA for two different secondary PCR reactions. In this embodiment, the secondary reactions utilize primers that bind in the IPP2K genomic region and in the donor molecule, producing an amplicon that spans the integration boundary between the genome and the donor. The first reaction focused on the 5' boundary between the genome and the donor. The second reaction focused on the 3' boundary between the donor and the genome. Both reactions consisted of: 2.5 μl 10X Ex Taq PCR™ buffer, 2 μl template (1:100 dilution of the primary PCR reaction), 10 μM forward oligonucleotide primer, 10 μM reverse oligonucleotide primer, 2 μl dNTP mix (2.5 mM each), 16 μl H₂O, and 0.5 μl (2.5 units) Ex Taq™ DNA polymerase. PCR reactions were performed using a Bio-Rad 96-sample DNA Engine Tetrad 2 Peltier thermal cycler (Hercules, Calif.) under the following cycling conditions: 94 °C for 3 minutes (1 cycle); 94 °C for 30 seconds, 60 °C for 30 seconds, and 72 °C for 2 minutes (35 cycles); 72 °C for 10 minutes (1 cycle); 4 °C hold. 20 μl of each secondary PCR reaction was electrophoresed at 100 V for 1 hour in a 1.0% TAE agarose gel supplemented with 0.5% ethidium bromide.

The amplified fragments were visualized with UV light, and fragment size was estimated by comparison with a 1 Kbp Plus DNA ladder (Invitrogen Life Technologies, Carlsbad, Calif.). PCR products derived from targeted integration of the donor into the IPP2K gene were identified by the presence of a DNA fragment of 1.65 Kbp (5' boundary) (FIG. 83) or 1.99 Kbp (3' boundary) (FIG. 84). These fragments were excised from the gel and purified using a QIAquick gel extraction kit (QIAGEN Inc., Valencia, CA) according to the manufacturer's instructions. The purified fragments were then cloned into the pCR2.1 plasmid using the TOPO TA Cloning Kit (with pCR2.1 vector) and One Shot TOP10 chemically competent E. coli cells (Invitrogen Life Technologies, Carlsbad, Calif.) according to the manufacturer's protocol.

Individual colonies were inoculated into 14 ml Falcon tubes (Becton-Dickinson, Franklin Lakes, N.J.) containing 2 ml TB supplemented with 50 μl/ml kanamycin and incubated at 37 °C for 16 hours with shaking at 200 rpm. After incubation, 1.5 ml of cells were transferred to a 1.7 ml Costar microcentrifuge tube (Fisher Scientific, Pittsburgh, Pa.) and pelleted at 16,000 x g for 1 minute. The supernatant was removed, and plasmid DNA was isolated as described above using the NucleoSpin Plasmid kit (BD Biosciences/Clontech/Macherey-Nagel, Palo Alto, Calif.). A μg quantity of the isolated plasmid was digested with 10 units of EcoRI (New England BioLabs, Beverly, Mass.). All plasmid digests were incubated at 37 °C for 1 hour. Restricted DNA was electrophoresed at 100 V for 1 hour in a 1.0% TAE agarose gel supplemented with 0.5% ethidium bromide. Fragments were visualized with UV light, and fragment size was estimated by comparison with a 1 Kbp Plus DNA ladder (Invitrogen Life Technologies, Carlsbad, Calif.).

Expected plasmid clones were identified by the presence of an appropriately sized insert DNA fragment in addition to the 3.9 Kbp pCR2.1 vector. Double-stranded sequencing reactions of the plasmid clones were performed using the CEQ™ DTCS-Rapid Start kit (Beckman-Coulter, Palo Alto, Calif.) as described by the manufacturer. Reactions were purified using Performa DTR gel filtration cartridges (Edge BioSystems, Gaithersburg, Md.) as described in the manufacturer's protocol. Sequence reactions were analyzed on a Beckman-Coulter CEQ™ 2000XL DNA Analysis System, and nucleotide characterization was performed using Sequencher™ version 4.1.4 (Gene Codes Corporation, Ann Arbor, MI). Nucleotide alignments were performed using Vector NTI version 10.1 (Invitrogen Life Technologies, Carlsbad, Calif.).

Sequence data from a targeted integration event (event #073) were analyzed as follows. The primary PCR products spanning the entire genomic integration site were subjected to secondary amplification focused on either the 5' or the 3' boundary between the genome and the donor. Alignment of the cloned fragments corresponding to these secondary amplification products with the wild-type IPP2K genomic sequence and with the expected sequence of a targeted integration event clearly indicated that precise integration of the donor DNA had occurred at the target site.

The nucleotide sequence of the IPP2K genomic locus, the genome/donor boundary, the nucleotide sequence of the donor regions corresponding to the IPP2K homology flanks, and the nucleotide sequence of the herbicide tolerance cassette were all preserved in multiple cloned PCR products derived from this event. Thus, this event represents a genome in which homology-driven repair of a ZFN-mediated double-strand break and targeted integration of donor DNA occurred at a specific gene target. Additional transformation events representing unique targeted integration occurrences have been obtained, confirming that the methods taught herein are reproducible in maize callus. One skilled in the art can apply these methods to any gene target in any plant species for which targeted integration is deemed desirable.

B. Nested genome/donor specific amplification

In this embodiment, both the primary and the subsequent secondary PCR reactions utilize a combination of oligonucleotide primers specific for the IPP2K gene target region upstream or downstream of the donor integration region (appendix V and VI) and oligonucleotide primers specific for the donor sequence. In this example, a primary PCR amplification reaction was performed using reagents supplied by TaKaRa Biotechnology Inc. (Seta 3-4-1, Otsu, Shiga, 520-2193, Japan) and consisted of: 2.5 μl 10X Ex Taq PCR™ buffer, 40-200 ng double-stranded maize gDNA template, 10 μM forward oligonucleotide primer, 10 μM reverse oligonucleotide primer, 2 μl dNTP mix (2.5 mM each), 16 μl H₂O, and 0.5 μl (2.5 units) Ex Taq™ DNA polymerase. PCR reactions were performed using a Bio-Rad 96-sample DNA Engine Tetrad 2 Peltier thermal cycler (Hercules, Calif.) under the following cycling conditions: 94 °C for 3 minutes (1 cycle); 94 °C for 30 seconds, 52 °C or 64 °C for 30 seconds, and 72 °C for 2 minutes (35 cycles); 72 °C for 10 minutes (1 cycle); 4 °C hold.

The primary PCR reaction was then diluted 1:100 in H₂O and used as template DNA for the secondary PCR reaction. In this embodiment, the secondary reaction also utilizes primers that bind in the IPP2K genomic region and in the donor molecule, producing an amplicon that spans the integration boundary between the genome and the donor. The specific primers used determine whether the amplicon is focused on the 5' or the 3' boundary between the genome and the donor. The reagent composition for these reactions was as follows: 2.5 μl 10X Ex Taq PCR™ buffer, 2 μl template (1:100 dilution of the primary PCR reaction), 10 μM forward oligonucleotide primer, 10 μM reverse oligonucleotide primer, 2 μl dNTP mix (2.5 mM each), 16 μl H₂O, and 0.5 μl (2.5 units) Ex Taq™ DNA polymerase. PCR reactions were performed using a Bio-Rad 96-sample DNA Engine Tetrad 2 Peltier thermal cycler (Hercules, Calif.) under the following cycling conditions: 94 °C for 3 minutes (1 cycle); 94 °C for 30 seconds, 54 °C or 60 °C for 30 seconds, and 72 °C for 2 minutes (35 cycles); 72 °C for 10 minutes (1 cycle); 4 °C hold. 20 μl of each secondary PCR reaction was electrophoresed at 100 V for 1 hour in a 1.0% TAE agarose gel supplemented with 0.5% ethidium bromide.

The amplified fragments were visualized with UV light, and fragment size was estimated by comparison with a 1 Kbp Plus DNA ladder (Invitrogen Life Technologies, Carlsbad, Calif.). PCR products derived from targeted integration of the donor into the IPP2K gene were identified by the presence of a DNA fragment of 1.35 Kbp (5' boundary) (FIG. 85) or 1.66 Kbp (3' boundary) (FIG. 86). These fragments were excised from the gel and purified using a QIAquick gel extraction kit (QIAGEN Inc., Valencia, CA) according to the manufacturer's instructions. The purified fragments were then cloned into the pCR2.1 plasmid using the TOPO TA Cloning Kit (with pCR2.1 vector) and One Shot TOP10 chemically competent E. coli cells (Invitrogen Life Technologies, Carlsbad, Calif.) according to the manufacturer's protocol.
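The diagnostic fragment sizes reported for the two genotyping strategies (1.65 Kbp and 1.99 Kbp in Example 21A; 1.35 Kbp and 1.66 Kbp in Example 21B) lend themselves to a simple lookup when gel results are scored. The Python sketch below is an illustration only: the function name, the strategy labels, and the 0.1 Kbp tolerance are assumptions that do not appear in the disclosure. A band of the expected size is a preliminary indicator, since the examples confirm integration by cloning and sequencing.

```python
from typing import Optional

# Expected diagnostic fragment sizes (Kbp) taken from Examples 21A and 21B.
EXPECTED_KBP = {
    ("21A", "5' boundary"): 1.65,
    ("21A", "3' boundary"): 1.99,
    ("21B", "5' boundary"): 1.35,
    ("21B", "3' boundary"): 1.66,
}


def match_boundary(strategy: str, observed_kbp: float,
                   tolerance_kbp: float = 0.1) -> Optional[str]:
    """Return the boundary whose expected size lies within the tolerance.

    The default tolerance is a hypothetical value chosen for illustration;
    it should be set from the resolution of the gel actually used.
    """
    for (name, boundary), expected in EXPECTED_KBP.items():
        if name == strategy and abs(observed_kbp - expected) <= tolerance_kbp:
            return boundary
    return None


if __name__ == "__main__":
    print(match_boundary("21A", 1.6))   # band near 1.65 Kbp
    print(match_boundary("21B", 1.7))   # band near 1.66 Kbp
    print(match_boundary("21A", 1.0))   # no expected fragment of this size
```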

C. Nucleotide sequence analysis of genotyping PCR products

Individual colonies as described in Example 21B were inoculated into 14 ml Falcon tubes (Becton-Dickinson, Franklin Lakes, N.J.) containing 2 ml TB supplemented with 50 μl/ml kanamycin and incubated at 37 °C for 16 hours with shaking at 200 rpm. After incubation, 1.5 ml of cells were transferred to a 1.7 ml Costar microcentrifuge tube (Fisher Scientific, Pittsburgh, Pa.) and pelleted at 16,000 x g for 1 minute. The supernatant was removed, and plasmid DNA was isolated as described above using the NucleoSpin Plasmid kit (BD Biosciences/Clontech/Macherey-Nagel, Palo Alto, Calif.). A μg quantity of the isolated plasmid was digested with 10 units of EcoRI (New England BioLabs, Beverly, Mass.). All plasmid digests were incubated at 37 °C for 1 hour. Restricted DNA was electrophoresed at 100 V for 1 hour in a 1.0% TAE agarose gel supplemented with 0.5% ethidium bromide. Fragments were visualized with UV light, and fragment size was estimated by comparison with a 1 Kbp Plus DNA ladder (Invitrogen Life Technologies, Carlsbad, Calif.).

Plasmid clones were identified by the presence of an inserted DNA fragment in addition to the 3.9 Kbp pCR2.1 vector. Double-stranded sequencing reactions of the plasmid clones were performed using the CEQ™ DTCS-Rapid Start kit (Beckman-Coulter, Palo Alto, Calif.) as described by the manufacturer. The reactions were purified using Performa DTR gel filtration cartridges (Edge BioSystems, Gaithersburg, Md.) as described in the manufacturer's protocol. Sequence reactions were analyzed on a Beckman-Coulter CEQ™ 2000XL DNA Analysis System, and nucleotide characterization was performed using Sequencher™ version 4.1.4 (Gene Codes Corporation, Ann Arbor, MI). Nucleotide alignments were performed using Vector NTI version 10.1 (Invitrogen Life Technologies, Carlsbad, Calif.).

Sequence data encompassing the boundary between the upstream (5'-) IPP2K genomic sequence and the donor DNA, derived from multiple targeted integration events, were also obtained, as were sequence data encompassing the boundary between the donor DNA and the downstream (3'-) IPP2K genomic sequence derived from multiple targeted integration events, and sequence data including the upstream (5'-) boundary sequence derived from a single transformed callus event (#114). The transformed targeted integration event (#114) was the result of integration of an autonomous donor into the IPP2K gene target.

In these assays, both the primary and the secondary PCR amplification reactions were focused on either the 5' or the 3' boundary between the genome and the donor. Alignment of the cloned fragments corresponding to these secondary amplification products with the wild-type IPP2K genomic sequence and with the expected sequence of a targeted integration event revealed that integration of the donor DNA had occurred at the target site. The nucleotide sequence of the IPP2K genomic locus, the genome/donor boundary, the nucleotide sequence of the donor regions corresponding to the IPP2K homology flanks, and the nucleotide sequence of the herbicide tolerance cassette were all preserved in multiple cloned PCR products derived from this event.

Thus, this event represents a genome in which homology-driven repair of ZFN-mediated double-strand breaks occurs at a specific gene target. Additional transformation events representing the occurrence of unique targeted integration have been obtained, confirming that the methods taught herein are reproducible in maize callus. One skilled in the art can apply these methods to any gene target in any plant species for which targeted integration is deemed desirable.

Example 22: regeneration of fertile whole plants from maize callus tissue

Isolated callus from herbicide tolerant corn cells derived from HiII cell cultures can be regenerated into whole, fertile corn plants. One skilled in the art can regenerate a complete, fertile corn plant from a variety of embryogenic corn cell cultures.

In this example, regeneration of isolated, Bialaphos-resistant HiII callus was initiated by transferring isolated callus tissue to cytokinin-based induction medium 28(1H), which comprises MS salts and vitamins, 30.0 g/L sucrose, 5 mg/L benzylaminopurine, 0.25 mg/L 2,4-D, 1 mg/L Bialaphos, and 2.5 g/L Gelrite; pH 5.7. Cells were allowed to grow for 1 week in low light (13 μE m⁻² s⁻¹), followed by transfer to higher light conditions (40 μE m⁻² s⁻¹) for 1 week. The cells were then transferred to regeneration medium 36(1H), which is the same as the induction medium except that it lacks plant growth regulators. Small (3-5 cm) plantlets were excised with hand tools and placed in sterile 150 x 25 mm glass culture tubes containing SHGA medium (Schenk and Hildebrandt basal salts and vitamins, 1972, Can. J. Bot. 50:199-204; 1 g/L myo-inositol, 10 g/L sucrose, 2.0 g/L Gelrite, pH 5.8).
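The per-litre composition of induction medium 28(1H), and its relation to regeneration medium 36(1H), can be captured as a small recipe table that scales to a working volume. The Python sketch below is illustrative only. It lists just the components for which the text gives amounts (MS salts and vitamins are omitted because no quantity is stated), and it assumes that the plant growth regulators absent from medium 36(1H) are benzylaminopurine and 2,4-D. The function names are hypothetical.

```python
# Per-litre amounts for induction medium 28(1H) as stated in the text.
INDUCTION_28_1H = {
    "sucrose": (30.0, "g"),
    "benzylaminopurine": (5.0, "mg"),
    "2,4-D": (0.25, "mg"),
    "Bialaphos": (1.0, "mg"),
    "Gelrite": (2.5, "g"),
}

# Assumption: these two components are the plant growth regulators
# that regeneration medium 36(1H) lacks.
GROWTH_REGULATORS = {"benzylaminopurine", "2,4-D"}


def regeneration_medium(induction: dict) -> dict:
    """Derive the regeneration recipe by dropping growth regulators."""
    return {k: v for k, v in induction.items() if k not in GROWTH_REGULATORS}


def scale(recipe: dict, litres: float) -> dict:
    """Scale per-litre amounts to the requested volume."""
    return {k: (amount * litres, unit) for k, (amount, unit) in recipe.items()}


if __name__ == "__main__":
    for name, (amount, unit) in scale(INDUCTION_28_1H, 0.5).items():
        print(f"{name}: {amount:g} {unit}")
```

For 0.5 L of induction medium this gives 15 g sucrose, 2.5 mg benzylaminopurine, 0.125 mg 2,4-D, 0.5 mg Bialaphos, and 1.25 g Gelrite.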

Once the plantlets had developed sufficiently large and differentiated root and shoot systems, they were transplanted into 4-inch pots containing Metro-Mix 360 growth medium (Sun Gro Horticulture Canada Ltd.) and placed in a greenhouse. Plantlets were covered completely or partially with clear plastic cups for 2-7 days and then transplanted to 5-gallon pots containing a mixture of 95% Metro-Mix 360 growth medium and 5% clay/loam and grown to maturity. Plants may be self-pollinated or cross-pollinated with inbred lines to produce T1 or F1 seeds, respectively. One skilled in the art can self-pollinate regenerated plants or cross-pollinate regenerated plants with a variety of germplasm to achieve maize breeding.

Additional information relating to targeted cleavage, targeted recombination, and targeted integration can be found in U.S. Patent Application Publication Nos. US-2003-0232410; US-2005-0026157; US-2005-0064474; US-2005-0208489 and US-2006-0188987; and U.S. Patent Application Ser. No. 11/493,423, filed on July 26, 2006, the disclosures of which are incorporated by reference in their entirety for all purposes.

All patents, patent applications, and publications mentioned herein are hereby incorporated by reference in their entirety for all purposes.

Although the disclosure has been provided by way of illustration in some detail for purposes of clarity of understanding, it will be apparent to those skilled in the art that various changes and modifications can be made without departing from the spirit or scope of the disclosure. Accordingly, the above description and examples should not be construed as limiting.

Sequence listing

<110> DOW AGROSCIENCES LLC

SANGAMO BIOSCIENCES, INC.

<120> optimized non-canonical zinc finger proteins

<130>8325-4002.40

<140>PCT/US2007/025455

<141>2007-12-13

<150>60/874,911

<151>2006-12-14

<150>60/932,497

<151>2007-05-30

<160>199

<170>PatentIn version 3.5

<210>1

<211>25

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<220>

<221>MISC_FEATURE

<222>(2)..(3)

<223> Xaa = any amino acid

<220>

<221>MISC_FEATURE

<222>(4)..(5)

<223> Xaa = any amino acid

Xaa can be present or absent

<220>

<221>MISC_FEATURE

<222>(7)..(18)

<223> Xaa = any amino acid

<220>

<221>MISC_FEATURE

<222>(20)..(22)

<223> Xaa = any amino acid

<220>

<221>MISC_FEATURE

<222>(23)..(24)

<223> Xaa = any amino acid

Xaa can be present or absent

<400>1

Cys Xaa Xaa Xaa Xaa Cys Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa

1 5 10 15

Xaa Xaa His Xaa Xaa Xaa Xaa Xaa His

20 25

<210>2

<211>6

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<220>

<221>MISC_FEATURE

<222>(2)..(2)

<223> Xaa = any amino acid (preferably A, K or T)

<220>

<221>MISC_FEATURE

<222>(3)..(3)

<223> Xaa = any amino acid (preferably Q, E or R)

<220>

<221>MISC_FEATURE

<222>(6)..(6)

<223> Xaa = any amino acid (preferably G)

<400>2

His Xaa Xaa Arg Cys Xaa

1 5

<210>3

<211>35

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<220>

<221>MISC_FEATURE

<222>(2)..(3)

<223> Xaa = any amino acid

<220>

<221>MISC_FEATURE

<222>(4)..(5)

<223> Xaa = any amino acid

Xaa can be present or absent

<220>

<221>MISC_FEATURE

<222>(7)..(18)

<223> Xaa = any amino acid

<220>

<221>MISC_FEATURE

<222>(20)..(22)

<223> Xaa = any amino acid

<220>

<221>MISC_FEATURE

<222>(23)..(24)

<223> Xaa = any amino acid

Xaa can be present or absent

<220>

<221>MISC_FEATURE

<222>(26)..(26)

<223> Xaa = any amino acid

<220>

<221>MISC_FEATURE

<222>(27)..(35)

<223> Xaa = any amino acid

Xaa can be present or absent

<400>3

Cys Xaa Xaa Xaa Xaa Cys Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa

1 5 10 15

Xaa Xaa His Xaa Xaa Xaa Xaa Xaa Cys Xaa Xaa Xaa Xaa Xaa Xaa Xaa

20 25 30

Xaa Xaa Xaa

35

<210>4

<211>5

<212>PRT

<213> Artificial

<220>

<223> ZC finger

<400>4

Gly Leu Arg Gly Ser

1 5

<210>5

<211>6

<212>PRT

<213> Artificial

<220>

<223> ZC finger

<400>5

Gly Gly Leu Arg Gly Ser

1 5

<210>6

<211>1199

<212>DNA

<213> Zea mays

<400>6

atggagatgg atggggttct gcaagccgcg gatgccaagg actgggttta caagggggaa 60

ggcgccgcga atctcatcct cagctacacc ggctcgtcgc cctccatggt aagcgctgag 120

taggttctta ctgagcgtgc acgcatcgat cacttgactt taggggctca atgtgtgatt 180

cacgggtgcc gcggcgccat tcgagctcca gatccagtac cgctcgagca agtgataaaa 240

catggagcag ggacgatcac gtggtcactt gaaaattacg tgaggtccgg ggcgacgatg 300

tacggcgcgg cgaactctca aacactcaca caaccaaaac cgcttcgtgt tcgtctttgt 360

tccaagcgac tgtgtgagtg tttgagagtt cgccagcgcg acatcgcccg atctgacaaa 420

ttaagctttc gttgcttttc catgattgtg cattttgtga gcatgcactg aatactatga 480

tggatatgtt tggaggaagc attattccaa tttgatgata agggtgttat ttacacttgt 540

tttcagcttg gcaaggtact gcggctcaag aagattctaa aaaacaagtc gcagcgggca 600

ccgagttgta ttgtattctc aagtcatgag caactcctgt ggggccatat cccagaactg 660

gttgagtcgg tcaaacaaga ttgcttggct caagcctatg cagtgcatgt tatgagccaa 720

cacctgggtg ccaatcatgt cgatggtggg gtatggttca gattcagttc atttatgtcc 780

tgttattgtg attttgattg gtaacatatt gacaacctcg acacttggga tcagattcag 840

ttcacttatg gaagaaattg gagaattgtt ataatttatc tataatcacc cctactgaaa 900

tagaaataac atggcatcaa tgtgcatgct attggatttt gacacgaata tgctttattc 960

tatcatatgt tggtaattcc agcaggcagc aggcactact ctttggatcc acgtgacttg 1020

acaaagaaat catgccatct ttccacaatg caggtccgtg tacgtgtttc tagggatttt 1080

ctggagcttg tcgaaaagaa tgttcttagc agccgtcctg ctgggagagt aaatgcaagt 1140

tcaattgata acactgctga tgccgctctt ctaatagcag accactcttt attttctgg 1199

<210>7

<211>50

<212>DNA

<213> Zea mays

<400>7

caactcctgt ggggccatat cccagaactg gttgagtcgg tcaaacaaga 50

<210>8

<211>12

<212>DNA

<213> Zea mays

<400>8

ctgtggggcc at 12

<210>9

<211>18

<212>DNA

<213> Zea mays

<400>9

cttgaccaac tcagccag 18

<210>10

<211>59

<212>DNA

<213> Zea mays

<400>10

aagtcatgag caactcctgt ggggccatat cccagaactg gttgagtcgg tcaaacaag 59

<210>11

<211>53

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>11

aagtcatgag caactcctgt ggggccaaga actggttgag tcggtcaaac aag 53

<210>12

<211>12

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>12

His Thr Lys Ile His Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>13

<211>12

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>13

His Thr Lys Ile Cys Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>14

<211>12

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>14

His Thr Lys Gly Cys Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>15

<211>12

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>15

His Thr Lys Ala Cys Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>16

<211>12

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>16

His Thr Lys Val Cys Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>17

<211>12

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>17

His Thr Lys Leu Cys Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>18

<211>12

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>18

His Thr Lys Ser Cys Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>19

<211>12

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>19

His Thr Lys Asn Cys Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>20

<211>12

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>20

His Thr Lys Lys Cys Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>21

<211>12

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>21

His Thr Lys Arg Cys Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>22

<211>14

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>22

His Thr Lys Ile Gly Gly Cys Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>23

<211>13

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>23

His Thr Lys Ile Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>24

<211>14

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>24

His Thr Lys Ile Cys Gly Gly Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>25

<211>14

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>25

His Thr Lys Ile Gly Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>26

<211>15

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>26

His Thr Lys Ile Gly Cys Gly Gly Leu Arg Gly Ser Gln Leu Val

1 5 10 15

<210>27

<211>13

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>27

His Leu Lys Gly Asn Cys Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>28

<211>13

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>28

His Leu Lys Gly Asn Cys Pro Ala Gly Ser Gln Leu Val

1 5 10

<210>29

<211>13

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>29

His Ser Glu Gly Gly Cys Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>30

<211>13

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>30

His Ser Glu Gly Gly Cys Pro Gly Gly Ser Gln Leu Val

1 5 10

<210>31

<211>13

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>31

His Ser Ser Ser Asn Cys Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>32

<211>13

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>32

His Ser Ser Ser Asn Cys Thr Ile Gly Ser Gln Leu Val

1 5 10

<210>33

<211>15

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>33

His Thr Lys Ile Cys Gly Gly Gly Leu Arg Gly Ser Gln Leu Val

1 5 10 15

<210>34

<211>16

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>34

His Thr Lys Ile Gly Cys Gly Gly Gly Leu Arg Gly Ser Gln Leu Val

1 5 10 15

<210>35

<211>14

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>35

His Thr Lys Ile Gly Gly Cys Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>36

<211>15

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>36

His Thr Lys Ile Gly Gly Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10 15

<210>37

<211>16

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>37

His Thr Lys Ile Gly Gly Cys Gly Gly Leu Arg Gly Ser Gln Leu Val

1 5 10 15

<210>38

<211>13

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>38

His Thr Lys Arg Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>39

<211>14

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>39

His Thr Lys Arg Cys Gly Gly Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>40

<211>15

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>40

His Thr Lys Arg Cys Gly Gly Gly Leu Arg Gly Ser Gln Leu Val

1 5 10 15

<210>41

<211>13

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>41

His Thr Lys Arg Gly Cys Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>42

<211>14

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>42

His Thr Lys Arg Gly Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>43

<211>15

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>43

His Thr Lys Arg Gly Cys Gly Gly Leu Arg Gly Ser Gln Leu Val

1 5 10 15

<210>44

<211>16

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>44

His Thr Lys Arg Gly Cys Gly Gly Gly Leu Arg Gly Ser Gln Leu Val

1 5 10 15

<210>45

<211>14

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>45

His Thr Lys Arg Gly Gly Cys Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>46

<211>15

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>46

His Thr Lys Arg Gly Gly Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10 15

<210>47

<211>16

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>47

His Thr Lys Arg Gly Gly Cys Gly Gly Leu Arg Gly Ser Gln Leu Val

1 5 10 15

<210>48

<211>14

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>48

His Leu Lys Gly Asn Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>49

<211>15

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>49

His Leu Lys Gly Asn Cys Gly Gly Leu Arg Gly Ser Gln Leu Val

1 5 10 15

<210>50

<211>16

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>50

His Leu Lys Gly Asn Cys Gly Gly Gly Leu Arg Gly Ser Gln Leu Val

1 5 10 15

<210>51

<211>13

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>51

His Lys Glu Arg Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>52

<211>13

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>52

His Thr Arg Arg Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>53

<211>13

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>53

His Ala Gln Arg Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>54

<211>14

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>54

His Lys Lys Phe Tyr Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>55

<211>14

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>55

His Lys Lys His Tyr Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>56

<211>14

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>56

His Lys Lys Tyr Thr Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>57

<211>14

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>57

His Lys Lys Tyr Tyr Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>58

<211>14

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>58

His Lys Gln Tyr Tyr Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>59

<211>14

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>59

His Leu Leu Lys Lys Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>60

<211>14

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>60

His Gln Lys Phe Pro Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>61

<211>14

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>61

His Gln Lys Lys Leu Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>62

<211>14

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>62

His Gln Ile Arg Gly Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>63

<211>15

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>63

His Ile Lys Arg Gln Ser Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10 15

<210>64

<211>15

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>64

His Ile Arg Arg Tyr Thr Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10 15

<210>65

<211>15

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>65

His Ile Ser Ser Lys Lys Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10 15

<210>66

<211>15

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>66

His Lys Ile Gln Lys Ala Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10 15

<210>67

<211>15

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>67

His Lys Arg Ile Tyr Thr Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10 15

<210>68

<211>15

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>68

His Leu Lys Gly Gln Asn Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10 15

<210>69

<211>15

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>69

His Leu Lys Lys Asp Gly Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10 15

<210>70

<211>15

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>70

His Leu Lys Tyr Thr Pro Cys Gly Leu Arg Gly Ser Gln Leu Val

1 5 10 15

<210>71

<211>12

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>71

His Thr Lys Arg Cys Gly Arg Gly Ser Gln Leu Val

1 5 10

<210>72

<211>14

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>72

His Thr Lys Ile Gly Cys Gly Gly Arg Gly Ser Gln Leu Val

1 5 10

<210>73

<211>13

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>73

His Leu Lys Gly Asn Cys Gly Arg Gly Ser Gln Leu Val

1 5 10

<210>74

<211>13

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>74

His Leu Lys Gly Asn Cys Gly Gly Gly Ser Gln Leu Val

1 5 10

<210>75

<211>11

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>75

His Ile Arg Thr Cys Thr Gly Ser Gln Lys Pro

1 5 10

<210>76

<211>12

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>76

His Ile Arg Thr Cys Gly Thr Gly Ser Gln Lys Pro

1 5 10

<210>77

<211>12

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>77

His Ile Arg Thr Gly Cys Thr Gly Ser Gln Lys Pro

1 5 10

<210>78

<211>13

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>78

His Ile Arg Thr Gly Cys Gly Thr Gly Ser Gln Lys Pro

1 5 10

<210>79

<211>11

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>79

His Ile Arg Arg Cys Thr Gly Ser Gln Lys Pro

1 5 10

<210>80

<211>12

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>80

His Ile Arg Arg Gly Cys Thr Gly Ser Gln Lys Pro

1 5 10

<210>81

<211>11

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>81

His Thr Lys Ile His Thr Gly Ser Gln Lys Pro

1 5 10

<210>82

<211>11

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>82

His Thr Lys Ile Cys Thr Gly Ser Gln Lys Pro

1 5 10

<210>83

<211>11

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>83

His Thr Lys Arg Cys Thr Gly Ser Gln Lys Pro

1 5 10

<210>84

<211>11

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>84

His Ala Gln Arg Cys Thr Gly Ser Gln Lys Pro

1 5 10

<210>85

<211>12

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>85

His Thr Lys Ile Cys Gly Thr Gly Ser Gln Lys Pro

1 5 10

<210>86

<211>12

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>86

His Thr Lys Arg Cys Gly Thr Gly Ser Gln Lys Pro

1 5 10

<210>87

<211>12

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>87

His Ala Gln Arg Cys Gly Thr Gly Ser Gln Lys Pro

1 5 10

<210>88

<211>12

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>88

His Thr Lys Ile His Leu Arg Gly Ser Gln Leu Val

1 5 10

<210>89

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>89

His Ala Gln Arg Cys Gly Gly

1 5

<210>90

<211>8

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>90

His Ala Gln Arg Cys Gly Gly Gly

1 5

<210>91

<211>8

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>91

His Thr Lys Ile Cys Gly Gly Gly

1 5

<210>92

<211>8

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>92

His Thr Lys Arg Cys Gly Gly Gly

1 5

<210>93

<211>6

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>93

His Ala Gln Arg Cys Gly

1 5

<210>94

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>94

Cys Cys His Thr Lys Ile His

1 5

<210>95

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>95

Cys Cys His Thr Lys Ile His

1 5

<210>96

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>96

Cys Cys His Thr Lys Ile Cys

1 5

<210>97

<211>10

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>97

Cys Cys His Thr Lys Arg Cys Gly Gly Gly

1 5 10

<210>98

<211>8

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>98

Cys Cys His Ala Gln Arg Cys Gly

1 5

<210>99

<211>8

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>99

Cys Cys His Ile Arg Thr Gly Cys

1 5

<210>100

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>100

Cys Cys His Thr Lys Ile His

1 5

<210>101

<211>8

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>101

Cys Cys His Ile Arg Thr Gly Cys

1 5

<210>102

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>102

Cys Cys His Thr Lys Ile His

1 5

<210>103

<211>9

<212>PRT

<213> Artificial

<220>

<223> Zinc finger motifs

<400>103

Cys Cys His Ala Gln Arg Cys Gly Gly

1 5

<210>104

<211>29

<212>DNA

<213> Artificial

<220>

<223> primer

<400>104

atggagatgg atggggttct gcaagccgc 29

<210>105

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger binding region

<400>105

Asp Arg Ser Ala Leu Ser Arg

1 5

<210>106

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger binding region

<400>106

Arg Asn Asp Asp Arg Lys Lys

1 5

<210>107

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger binding region

<400>107

Arg Ser Asp Asn Leu Ser Thr

1 5

<210>108

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger binding region

<400>108

His Ser His Ala Arg Ile Lys

1 5

<210>109

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger binding region

<400>109

Arg Ser Asp Val Leu Ser Glu

1 5

<210>110

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger binding region

<400>110

Gln Ser Gly Asn Leu Ala Arg

1 5

<210>111

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger binding region

<400>111

Arg Ser Asp Asn Leu Ala Arg

1 5

<210>112

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger binding region

<400>112

Thr Ser Gly Ser Leu Thr Arg

1 5

<210>113

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger binding region

<400>113

Thr Ser Gly Asn Leu Thr Arg

1 5

<210>114

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger binding region

<400>114

Arg Ser Asp His Leu Ser Glu

1 5

<210>115

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger binding region

<400>115

Gln Ser Ala Thr Arg Lys Lys

1 5

<210>116

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger binding region

<400>116

Glu Arg Gly Thr Leu Ala Arg

1 5

<210>117

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger binding region

<400>117

Arg Ser Asp Ala Leu Thr Gln

1 5

<210>118

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger binding region

<400>118

Arg Ser Asp Ser Leu Ser Ala

1 5

<210>119

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger binding region

<400>119

Arg Ser Ala Ala Leu Ala Arg

1 5

<210>120

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger binding region

<400>120

Arg Ser Asp Asn Leu Ser Glu

1 5

<210>121

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger binding region

<400>121

Ala Ser Lys Thr Arg Thr Asn

1 5

<210>122

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger binding region

<400>122

Asp Arg Ser His Leu Ala Arg

1 5

<210>123

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger binding region

<400>123

Arg Ser Asp His Leu Ser Thr

1 5

<210>124

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger binding region

<400>124

Gln Ser Gly Ser Leu Thr Arg

1 5

<210>125

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger binding region

<400>125

Gln Asn His His Arg Ile Asn

1 5

<210>126

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger binding region

<400>126

Thr Gly Ser Asn Leu Thr Arg

1 5

<210>127

<211>7

<212>PRT

<213> Artificial

<220>

<223> Zinc finger binding region

<400>127

Asp Arg Ser Ala Leu Ala Arg

1 5

<210>128

<211>18

<212>DNA

<213> Artificial

<220>

<223> IPP2K zinc finger target sequence

<400>128

gaactggttg agtcggtc 18

<210>129

<211>18

<212>DNA

<213> Artificial

<220>

<223> IPP2K zinc finger target sequence

<400>129

gaactggttg agtcggtc 18

<210>130

<211>12

<212>DNA

<213> Artificial

<220>

<223> IPP2K zinc finger target sequence

<400>130

atggccccac ag 12

<210>131

<211>15

<212>DNA

<213> Artificial

<220>

<223> IPP2K zinc finger target sequence

<400>131

ggcacccagg tgttg 15

<210>132

<211>18

<212>DNA

<213> Artificial

<220>

<223> IPP2K zinc finger target sequence

<400>132

gtcgatggtg gggtatgg 18

<210>133

<211>18

<212>PRT

<213> Artificial

<220>

<223> NLS derived from maize op-2

<400>133

Arg Lys Arg Lys Glu Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Tyr

1 5 10 15

Arg Lys

<210>134

<211>18

<212>PRT

<213> Artificial

<220>

<223> 2A sequence from Thosea asigna virus

<400>134

Glu Gly Arg Gly Ser Leu Leu Thr Cys Gly Asp Val Glu Glu Asn Pro

1 5 10 15

Gly Pro

<210>135

<211>30

<212>DNA

<213> Artificial

<220>

<223> primer

<400>135

ggaagcatta ttccaatttg atgataatgg 30

<210>136

<211>29

<212>DNA

<213> Artificial

<220>

<223> primer

<400>136

cccaagtgtc gaggttgtca atatgttac 29

<210>137

<211>26

<212>DNA

<213> Artificial

<220>

<223> primer

<400>137

ssscaccaag ttgtattgcc ttctca 26

<210>138

<211>26

<212>DNA

<213> Artificial

<220>

<223> primer

<400>138

sssataggct tgagccaagc aatctt 26

<210>139

<211>855

<212>DNA

<213> Artificial

<220>

<223> position-1 3'-homologous flanking sequence

<400>139

gcggccgcta gatagcagat gcagattgct tgcttctctg gtttgatttt tggagtcacc 60

atttctgttt ggttcgtgtg cctcagtgtc tgacagcagc agatcctcga tggagatgga 120

tggggttctg caagccgcgg atgccaagga ctgggtttac aagggggaag gcgccgcgaa 180

tctcatcctc agctacaccg gcacgtcgcc ctccatggta agcgctgagt aggttcttac 240

tgagtgtgca cgcatcgatc acttgacttt aggggctcaa tgtgtgattc acgggtgccg 300

ccattcgagc tccagatcca gtatcgctcg agcaagtgat aaaacatgga gcagggacga 360

tcacgtggtc acttgaaaat tatgtgaggt ccggggcgac gatgtacggc gcggcgaact 420

ctcaaacact cacacagcca aaaccgcttc gtgttcgtct ttgttccaag cgaccgtgtg 480

gtgtgtttgt agtagttcgc cggcgccgca catcgtcgcc ccggatctga caaattaagc 540

tttcgttgct tttccacgat tgtgcatttt ctgagcatgc actgaatact atgatggata 600

tgtttggagg aagcattatt ccaatttgat gataatggtg ttatttacac ttgttttcag 660

cttggcaagg tactgcggct caagaagatt ctaaaaaaca agttgcagcg ggcaccaagt 720

tgtattgcct tctcaagtca tgagcaactc ctgtggggcc atatcccaga actggttgag 780

tcggtcaaac aagattgctt ggctcaagcc tatgcagtgc atgttatgag ccaacacctg 840

ggtgccaata ctagt 855

<210>140

<211>845

<212>DNA

<213> Artificial

<220>

<223> position-2 3'-homologous flanking sequence

<400>140

actagtcatg tcgatggtgg ggtatggttc agattcagtt catttatgtc ctgttattgt 60

gattttgatt ggtaacatat tgacaacctc gacacttggg atcagattca gttcacttat 120

ggaagaaatt ggagaattgt gataatttat ctataatcac ccctactgaa atagaaataa 180

catgacatca atgtgcatgc tattggattt tgacacgaat atgctttatt ctatcatatg 240

ttggtaattc cagcaggcag caggcactac tctttggatc cacgtgactt gacaaagaaa 300

tcatgccatc tttccacaat gcaggtccgt gtacgtgttt ctagggattt tctggagctt 360

gtcgaaaaga atgttcttag cagccgtcct gctgggagag taaatgcaag ttcaattgat 420

aacactgctg atgccgctct tctaatagca gaccactctt tattttctgg tacgtactct 480

atccctcttc ttaccataat ctgaatcttg ttaaggttta aaatatacga ttgattaagt 540

aaaatccaga gctctattca tatctcacgc actgatgttt tgatgaaacg cttgcagcaa 600

gacggttgcc tgttatttct atttgcatta gacaaacagt cacctttgtt tataaaggtc 660

tttgaatttg cagttcttat aggtttaagt ttgcaactgt tacttacaac agcccaatgg 720

gtagcatcaa gattgttttt ttcagtgatt cataacttaa ctcttggtta aaccgctaga 780

acatggttgg tgtcttaaaa tgcaactggt cctgaggccg taacctgaaa tcattgtacg 840

tcgac 845

<210>141

<211>484

<212>DNA

<213> Artificial

<220>

<223> upstream 5'-IPP2K gene region of ZFN targeting region

<400>141

tggacggagc gagagccaga attcgacgct ggcggcggcg cgtcgccaat acgcagcgcg 60

gatgtggagc cacatgcaaa cgtgtgtccg cccgcgtggc gtccactctc cctccacgtt 120

tcggcgtcct cgtcgccttc ctgggaaatc tccagctact gcccactgcc ccttcccttc 180

agtccctttc cccgggctgt ggtaccagta ctagtaccag catctcttca ggctccacca 240

agcgcagaca ccgcagcagc ggcagcagca cgatccggtg accccccgcc gcgtccagcc 300

tgctcctccg gtgatcgccg gactggcggg gtaggaacca gcggagcgca gcccgcctcc 360

ttccgctggt aagagtgacg cccgcccgct cctcccttcg ctcgcttcct tgctcttccg 420

attctggcgt accagtctca ccgcggcttg gggatttgat gcggagctag ttaaccagca 480

gagc 484

<210>142

<211>729

<212>DNA

<213> Artificial

<220>

<223> downstream 3'-IPP2K gene sequence of ZFN targeting region

<400>142

attgtttttt tcagtgattc ataacttaac tcttggttaa accgctagaa catggttggt 60

gtcttaaaat gcaactggtc ctgaggccgt aacctgaaat cattgtactt ttctctcatt 120

tctttagata tttccaaaac tctacattag atgatttatg tttgcttact tagtctttct 180

taatctcagg caatcctaag ggtagcagct gcatagctgt agagataaag gtactttgca 240

agcttcctct tttattctta tttttcattt cttatgtata tttctcctca accatttgac 300

ttcttttcgg catgctctac cttgcaggcc aaatgtgggt ttctgccatc atcagaatat 360

atatcagaag ataatactat caagaaacta gtaacgagat ataagatgca tcagcacctc 420

aaattttatc agggtgaggt gtgtagattg gaatgcttga tgccttgatc caagataaaa 480

ttccactctc ttttgcgcac ttaaaaaaca tccatcgatg atacaaactt gatcaaaata 540

ccttaaggct tgttatttac ggcactgttg taatattata ccgtctcttg ctttttgaca 600

tcaggttgat tcccaataca ttcttgcaca catttcagat atcgaagact agtgagtaca 660

atcctcttga tctattttct gggtcaaaag agagaatatg catggccatc aagtcccttt 720

tctcaactc 729

<210>143

<211>42

<212>DNA

<213> Artificial

<220>

<223> primer

<400>143

gcggccgcgt ctcaccgcgg cttggggatt ggatacggag ct 42

<210>144

<211>38

<212>DNA

<213> Artificial

<220>

<223> primer

<400>144

actagtgata tggccccaca ggagttgctc atgacttg 38

<210>145

<211>40

<212>DNA

<213> Artificial

<220>

<223> primer

<400>145

actagtccag aactggttga gtcggtcaaa caagattgct 40

<210>146

<211>32

<212>DNA

<213> Artificial

<220>

<223> primer

<400>146

gtcgaccttg atgctaccca ttgggctgtt gt 32

<210>147

<211>30

<212>DNA

<213> Artificial

<220>

<223> primer

<400>147

gcggccgcta gatagcagat gcagattgct 30

<210>148

<211>29

<212>DNA

<213> Artificial

<220>

<223> primer

<400>148

actagtattg gcacccaggt gttggctca 29

<210>149

<211>38

<212>DNA

<213> Artificial

<220>

<223> primer

<400>149

actagtcatg tcgatggtgg ggtatggttc agattcag 38

<210>150

<211>37

<212>DNA

<213> Artificial

<220>

<223> primer

<400>150

gtcgacgtac aatgatttca ggttacggcc tcaggac 37

<210>151

<211>41

<212>DNA

<213> Artificial

<220>

<223> primer

<400>151

actagttaac tgacctcact cgaggtcatt catatgcttg a 41

<210>152

<211>29

<212>DNA

<213> Artificial

<220>

<223> primer

<400>152

actagtgtga attcagcact taaagatct 29

<210>153

<211>101

<212>DNA

<213> Artificial

<220>

<223> primer

<400>153

actagtggcg gcggagaggg cagaggaagt cttctaacat gcggtgacgt ggaggagaat 60

cccggcccta ggatggcttc tccggagagg agaccagttg a 101

<210>154

<211>35

<212>DNA

<213> Artificial

<220>

<223> primer

<400>154

actagtatgc atgtgaattc agcacttaaa gatct 35

<210>155

<211>28

<212>DNA

<213> Artificial

<220>

<223> primer

<400>155

ctgtggtacc agtactagta ccagcatc 28

<210>156

<211>31

<212>DNA

<213> Artificial

<220>

<223> primer

<400>156

tcttggatca aggcatcaag cattccaatc t 31

<210>157

<211>20

<212>DNA

<213> Artificial

<220>

<223> primer

<400>157

tgggtaactg gcctaactgg 20

<210>158

<211>20

<212>DNA

<213> Artificial

<220>

<223> primer

<400>158

tggaaggcta ggaacgctta 20

<210>159

<211>20

<212>DNA

<213> Artificial

<220>

<223> primer

<400>159

ccagttaggc cagttaccca 20

<210>160

<211>20

<212>DNA

<213> Artificial

<220>

<223> primer

<400>160

taagcgttcc tagccttcca 20

<210>161

<211>31

<212>DNA

<213> Artificial

<220>

<223> primer

<400>161

cttggcaagg tactgcggct caagaagatt c 31

<210>162

<211>26

<212>DNA

<213> Artificial

<220>

<223> primer

<400>162

atgaagaaag acagggaatg aaggac 26

<210>163

<211>32

<212>DNA

<213> Artificial

<220>

<223> primer

<400>163

atgaagaaag acagggaatg aaggaccgcc ac 32

<210>164

<211>28

<212>DNA

<213> Artificial

<220>

<223> primer

<400>164

catggagggc gacgagccgg tgtagctg 28

<210>165

<211>27

<212>DNA

<213> Artificial

<220>

<223> primer

<400>165

atcgacatga ttggcaccca ggtgttg 27

<210>166

<211>31

<212>DNA

<213> Artificial

<220>

<223> primer

<400>166

tttcgacaag ctccagaaaa tccctagaaa c 31

<210>167

<211>28

<212>DNA

<213> Artificial

<220>

<223> primer

<400>167

acaagctcca gaaaatccct agaaacac 28

<210>168

<211>32

<212>DNA

<213> Artificial

<220>

<223> primer

<400>168

ttcgacaagc tccagaaaat ccctagaaac ac 32

<210>169

<211>29

<212>DNA

<213> Artificial

<220>

<223> primer

<400>169

tgctaagaac attcttttcg acaagctcc 29

<210>170

<211>32

<212>DNA

<213> Artificial

<220>

<223> primer

<400>170

gaacattctt ttcgacaagc tccagaaaat cc 32

<210>171

<211>30

<212>DNA

<213> Artificial

<220>

<223> primer

<400>171

tggacggagc gagagccaga attcgacgct 30

<210>172

<211>31

<212>DNA

<213> Artificial

<220>

<223> primer

<400>172

gtgcaagaat gtattgggaa tcaacctgat g 31

<210>173

<211>59

<212>DNA

<213> Zea mays

<400>173

aagtcatgag caactcctgt ggggccatat cccagaactg gttgagtcgg tcaaacaag 59

<210>174

<211>53

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>174

aagtcatgag caactcctgt ggggccaaga actggttgag tcggtcaaac aag 53

<210>175

<211>55

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>175

aagtcatgag caactcctgt ggggccataa gaactggttg agtcggtcaa acaag 55

<210>176

<211>54

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>176

aagtcatgag caactcctgt ggggccatag aactggttga gtcggtcaaa caag 54

<210>177

<211>56

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>177

aagtcatgag caactcctgt ggggccatac agaactggtt gagtcggtca aacaag 56

<210>178

<211>57

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>178

aagtcatgag caactcctgt ggggccatac cagaactggt tgagtcggtc aaacaag 57

<210>179

<211>52

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>179

aagtcatgag caactcctgt ggggccagaa ctggttgagt cggtcaaaca ag 52

<210>180

<211>57

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>180

aagtcatgag caactcctgt ggggccatat cagaactggt tgagtcggtc aaacaag 57

<210>181

<211>54

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>181

aagtcatgag caactcctgt ggggccatag aactggttga gtcggtcaaa caag 54

<210>182

<211>54

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>182

aagtcatgag caactcctgt ggggccatag aactggttga gtcggtcaaa caag 54

<210>183

<211>57

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>183

aagtcatgag caactcctgg tggggccata cagaactggt tgagtcggtc aaacaag 57

<210>184

<211>54

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>184

aagtcatgag caactcctgt ggggccatag aactggttga gtcggtcaaa caag 54

<210>185

<211>53

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>185

aagcatgagc aactcctgtg gggccataga actggttgag tcggtcaaac aag 53

<210>186

<211>56

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>186

aagtcatgag caactcctgt ggggccatac agaactggtt gagtcggtca aacaag 56

<210>187

<211>57

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>187

aagtcatgag caactcctgt ggggccatat cagaactggt tgagtcggtc aaacaag 57

<210>188

<211>54

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>188

aagtcatgag caactcctgt ggggccatag aactggttga gtcggtcaaa caag 54

<210>189

<211>54

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>189

aagtcatgag caactcctgt ggggccacag aactggttga gtcggtcaaa caag 54

<210>190

<211>54

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>190

aagtcatgag caactcctgt ggggccatag aactggttga gtcggtcaaa caag 54

<210>191

<211>57

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>191

aagtcatgag caactcctgt ggggccatac cagaactggt tgagtcggtc aaacaag 57

<210>192

<211>54

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>192

aagtcatgag caactcctgt ggggccatag aactggttga gtcggtcaaa caag 54

<210>193

<211>55

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>193

aagtcatgag caactcctgt ggggccataa gaactggttg agtcggtcaa acaag 55

<210>194

<211>56

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>194

aagtcatgag caactcctgt ggggccatac agaactggtt gagtcggtca aacaag 56

<210>195

<211>56

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>195

aagtcatgag caactcctgt ggggccatat agaactggtt gagtcggtca aacaag 56

<210>196

<211>56

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>196

aagtcatgag caactcctgt ggggccatac agaactggtt gagtcggtca aacaag 56

<210>197

<211>57

<212>DNA

<213> Artificial

<220>

<223> maize IPP2K gene sequence with ZFN-mediated mutations

<400>197

aagtcatgag caactcctgt ggggccatac cagaactggt tgagtcggtc aaacaag 57
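
The mutated entries above are shorter than the wild-type maize sequence of SEQ ID NO:173. The following is a minimal sketch of how the missing bases in one entry could be located, assuming the change is a single contiguous deletion. The function name `deletion_between` is hypothetical and is not part of the disclosure; the two sequences are copied from SEQ ID NO:173 and SEQ ID NO:174.

```python
# Sequences copied from the listing; the spaces are the 10-base grouping only.
WILD_TYPE = "aagtcatgag caactcctgt ggggccatat cccagaactg gttgagtcgg tcaaacaag".replace(" ", "")
MUTANT_174 = "aagtcatgag caactcctgt ggggccaaga actggttgag tcggtcaaac aag".replace(" ", "")


def deletion_between(wild_type, mutant):
    """Return (start, deleted_bases) for one contiguous deletion, else None.

    Hypothetical helper: it only handles the case where the mutant equals
    the wild type with a single block of bases removed.
    """
    if len(mutant) >= len(wild_type):
        return None
    # Walk along the longest common prefix of the two sequences.
    p = 0
    while p < len(mutant) and wild_type[p] == mutant[p]:
        p += 1
    n_deleted = len(wild_type) - len(mutant)
    # After skipping the deleted block, the remainders must be identical.
    if wild_type[p + n_deleted:] != mutant[p:]:
        return None
    return p, wild_type[p:p + n_deleted]


print(deletion_between(WILD_TYPE, MUTANT_174))
```

For SEQ ID NO:174 this reports a 6-nucleotide deletion. When repeated bases make the placement of a deletion ambiguous, the sketch reports the rightmost placement; entries that contain substitutions or insertions return `None`.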

<210>198

<211>821

<212>DNA

<213> Artificial

<220>

<223> position-1 5'-homologous flanking sequence

<400>198

gcggccgcgt ctcaccgcgg cttggggatt ggatacggag ctagttaacc agcagagcta 60

gatagcagac gcagatcgct tgcttctctg gtttgatttt tggagtcacc atttctgttt 120

ggttcgtgtg cctcagtgtc tgacagcagc agatcctcga tggagatgga tggggttctg 180

caagccgcgg atgccaagga ctgggtttac aagggggaag gcgccgcgaa tctcatcctc 240

agctacaccg gcacgtcgcc ctccatggta agcgctgagt aggttcttac tgagtgtgca 300

cgcatcgatc acttgacttt aggggctcaa tgtgtgattc acgggtgccg ccattcgagc 360

tccagatcca gtatcgctcg agcaagtgat aaaacatgga gcagggacga tcacgtggtc 420

acttgaaaat tatgtgaggt ccggggcgac gatgtacggc gcggcgaact ctcaaacact 480

cacacagcca aaaccgcttc gtgttcgtct ttgttccaag cgaccgtgtg gtgtgtttgt 540

agtagttcgc cggcgccgca catcgtcgcc ccggatctga caaattaagc tttcgttgct 600

tttccacgat tgtgcatttt ctgagcatgc actgaatact atgatggata tgtttggagg 660

aagcattatt ccaatttgat gataatggtg ttatttacac ttgttttcag cttggcaagg 720

tactgcggct caagaagatt ctaaaaaaca agttgcagcg ggcaccaagt tgtattgcct 780

tctcaagtca tgagcaactc ctgtggggcc atatcactag t 821

<210>199

<211>821

<212>DNA

<213> Artificial

<220>

<223> position-1 3'-homologous flanking sequence

<400>199

actagtccag aactggttga gtcggtcaaa caagattgct tggctcaagc ctatgcagtg 60

catgttatga gccaacacct gggtgccaat catgtcgatg gtggggtatg gttcagattc 120

agttcattta tgtcctgtta ttgtgatttt gattggtaac atattgacaa cctcgacact 180

tgggatcaga ttcagttcac ttatggaaga aattggagaa ttgtgataat ttatctataa 240

tcacccctac tgaaatagaa ataacatgac atcaatgtgc atgctattgg attttgacac 300

gaatatgctt tattctatca tatgttggta attccagcag gcagcaggca ctactctttg 360

gatccacgtg acttgacaaa gaaatcatgc catctttcca caatgcaggt ccgtgtacgt 420

gtttctaggg attttctgga gcttgtcgaa aagaatgttc ttagcagccg tcctgctggg 480

agagtaaatg caagttcaat tgataacact gctgatgccg ctcttctaat agcagaccac 540

tctttatttt ctggtacgta ctctatccct cttcttacca taatctgaat cttgttaagg 600

tttaaaatat atgattgatt aagtaaaatc cagagctcta ttcatatctc acgcactgat 660

gttttgatga aacgcttgca gcaagacggt tgcctgttat ttctatttgc attagacaaa 720

cagtcacctt tgtttataaa ggtctttgaa tttgcagttc ttataggttt aagtttgcaa 780

ctgttactta caacagccca atgggtagca tcaaggtcga c 821
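
Each entry in the listing declares its length under <211> and its molecule type under <212>. The following is a minimal sketch of a consistency check between the declared length and the residues actually shown, assuming the sequence lines of one entry have been joined into a single string. The function name `residue_count` is hypothetical; for protein entries, the position-marker lines (such as "1 5 10") are assumed to have been left out by the caller.

```python
def residue_count(sequence_text, mol_type):
    """Count residues in the sequence text of one listing entry.

    sequence_text: the sequence line(s) of the entry joined with spaces.
    mol_type: "DNA" or "PRT", as given under <212>.
    """
    tokens = sequence_text.split()
    if mol_type == "DNA":
        # Purely numeric tokens are running position counts (e.g. "60"),
        # not bases, so they are skipped.
        return sum(len(tok) for tok in tokens if not tok.isdigit())
    # PRT: every token is one three-letter residue code.
    return len(tokens)


# SEQ ID NO:104 (primer, <211>29) and SEQ ID NO:75 (<211>11)
print(residue_count("atggagatgg atggggttct gcaagccgc 29", "DNA"))
print(residue_count("His Ile Arg Thr Cys Thr Gly Ser Gln Lys Pro", "PRT"))
```

A mismatch between the returned count and the <211> value points to a transcription or grouping error in that entry.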

Claims

1. A zinc finger protein comprising a non-canonical zinc finger, wherein the non-canonical zinc finger has a helical portion involved in DNA binding and wherein at least one zinc finger comprises the sequence Cys-(X^A)_{2-4}-Cys-(X^B)_{12}-His-(X^C)_{3-5}-Cys-(X^D)_{1-10} (SEQ ID NO:3), wherein X^A and X^B can be any amino acid, and wherein the amino acid sequence C-terminal to and including the 3rd zinc-coordinating residue is selected from the group consisting of:

HIRTCTGSQKP (SEQ ID NO:75);

HIRRCTGSQKP (SEQ ID NO:79);

HTKICTGSQKP (SEQ ID NO:82);

HAQRCTGSQKP (SEQ ID NO:84);

HTKICGTGSQKP (SEQ ID NO:85);

HAQRCGTGSQKP (SEQ ID NO:87).
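
The residue pattern recited in claim 1 can be expressed as a regular expression over one-letter amino acid codes. The following is a minimal sketch, not a statement of how the claim would be construed: the function name `matches_claim_1` is hypothetical, and the example finger is a made-up string assembled for illustration (a generic C-X2-C framework, the helix of SEQ ID NO:107, and the tail of SEQ ID NO:75), not a sequence taken from the disclosure.

```python
import re

# N-terminal part of the pattern: Cys-(X)2-4-Cys-(X)12
HEAD = re.compile(r"C[A-Z]{2,4}C[A-Z]{12}")
# C-terminal part: His-(X)3-5-Cys-(X)1-10
TAIL = re.compile(r"H[A-Z]{3,5}C[A-Z]{1,10}")

# Tails listed in claim 1 (SEQ ID NOs: 75, 79, 82, 84, 85, 87).
CLAIMED_TAILS = {
    "HIRTCTGSQKP", "HIRRCTGSQKP", "HTKICTGSQKP",
    "HAQRCTGSQKP", "HTKICGTGSQKP", "HAQRCGTGSQKP",
}


def matches_claim_1(finger):
    """True if the finger has the Cys-Cys-His-Cys layout and a listed tail."""
    for tail in CLAIMED_TAILS:
        # The tail must start at the third zinc-coordinating residue (His),
        # so everything before it has to match the N-terminal pattern.
        if finger.endswith(tail) and HEAD.fullmatch(finger[:-len(tail)]):
            return True
    return False


# Hypothetical finger for illustration only.
example = "CRIC" + "MRNFSRSDNLST" + "HIRTCTGSQKP"
print(matches_claim_1(example))
```

Splitting the pattern at the histidine keeps the check deterministic: the listed tail is compared literally, and only the N-terminal portion is matched with variable-length gaps.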

2. A zinc finger protein comprising a plurality of zinc fingers, wherein at least one zinc finger comprises a non-canonical zinc finger according to claim 1.

3. The zinc finger protein of any of claims 1 to 2, wherein the zinc finger protein comprises any of the following sequences and is engineered to bind to a target sequence in the IPP2-K gene:

DRSALSR (SEQ ID NO:105);

RNDDRKK (SEQ ID NO:106);

RSDNLST (SEQ ID NO:107);

HSHARIK (SEQ ID NO:108);

RSDVLSE (SEQ ID NO:109);

QSGNLAR (SEQ ID NO:110);

RSDNLAR (SEQ ID NO:111);

TSGSLTR (SEQ ID NO:112);

TSGNLTR (SEQ ID NO:113);

RSDHLSE (SEQ ID NO:114);

QSATRKK (SEQ ID NO:115);

ERGTLAR (SEQ ID NO:116);

RSDALTQ (SEQ ID NO:117);

RSDSLSA (SEQ ID NO:118);

RSAALAR (SEQ ID NO:119);

RSDNLSE (SEQ ID NO:120);

ASKTRTN (SEQ ID NO:121);

DRSHLAR (SEQ ID NO:122);

RSDHLST (SEQ ID NO:123);

QSGSLTR (SEQ ID NO:124);

QNHHRIN (SEQ ID NO:125);

TGSNLTR (SEQ ID NO:126);

DRSALAR (SEQ ID NO:127).
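
The claims cite these binding regions in one-letter code, whereas the sequence listing gives the same entries in three-letter code. The following is a minimal sketch of the conversion, useful for cross-checking a claim against its listing entry. The names `THREE_TO_ONE` and `to_one_letter` are hypothetical; the mapping itself is the standard amino acid code table.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq):
    """Convert a space-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in three_letter_seq.split())


# Listing entry for SEQ ID NO:105, cited in claim 3 as DRSALSR.
print(to_one_letter("Asp Arg Ser Ala Leu Ser Arg"))
```

An unrecognized token, such as two residues fused by a missing space, raises a `KeyError`, which makes transcription errors in the listing easy to spot.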

4. A fusion protein comprising a zinc finger protein of any of claims 1 to 3 and one or more functional domains.

5. The fusion protein of claim 4, wherein the functional domain comprises a cleavage half-domain, and wherein the fusion protein comprises a linker interposed between the cleavage half-domain and the zinc finger protein.

6. The fusion protein of claim 5, wherein the linker is 5 amino acids in length.

7. The fusion protein of claim 6, wherein the amino acid sequence of the linker is GLRGS (SEQ ID NO: 4).

8. The fusion protein of claim 5, wherein the linker is 6 amino acids in length.

9. The fusion protein of claim 8, wherein the amino acid sequence of the linker is GGLRGS (SEQ ID NO: 5).

10. A polynucleotide encoding at least one zinc finger protein according to any one of claims 1 to 3 or at least one fusion protein according to any one of claims 4 to 9.

11. A method for targeted cleavage of cellular chromatin in a plant cell, the method comprising expressing in the cell a pair of fusion proteins according to any one of claims 5 to 9 or at least one polynucleotide encoding the pair of fusion proteins, wherein:

(a) the target sequences of the fusion proteins are within 10 nucleotides of each other; and
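
The spacing condition in element (a) can be checked by locating both target sequences on a locus and measuring the gap between them. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the locus is the maize sequence of SEQ ID NO:173, and the pairing of SEQ ID NO:128 with SEQ ID NO:130 as the two half-sites is assumed here purely for illustration.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(dna):
    """Reverse complement of a lowercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def site_span(locus, target):
    """Return (start, end, strand) of target in locus, or None if absent."""
    i = locus.find(target)
    if i >= 0:
        return i, i + len(target), "+"
    # Not found on the shown strand, so try the opposite strand.
    rc = reverse_complement(target)
    i = locus.find(rc)
    if i >= 0:
        return i, i + len(rc), "-"
    return None


def gap_between(locus, target_a, target_b):
    """Number of nucleotides separating the two target sites, or None."""
    a = site_span(locus, target_a)
    b = site_span(locus, target_b)
    if a is None or b is None:
        return None
    left, right = sorted([a, b])
    return max(0, right[0] - left[1])


LOCUS = "aagtcatgag caactcctgt ggggccatat cccagaactg gttgagtcgg tcaaacaag".replace(" ", "")
SITE_128 = "gaactggttg agtcggtc".replace(" ", "")  # SEQ ID NO:128
SITE_130 = "atggccccac ag".replace(" ", "")        # SEQ ID NO:130

print(gap_between(LOCUS, SITE_128, SITE_130))
```

On this locus the two sites lie on opposite strands and are separated by 6 nucleotides, which is within the 10-nucleotide limit of element (a).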

12. A method of targeted genetic recombination in a host plant cell, the method comprising:

(a) expressing in said host cell a pair of fusion proteins according to any one of claims 5 to 9 or at least one polynucleotide encoding said pair of fusion proteins, wherein the target sequence of said fusion protein is present in a selected host target locus; and

13. The method of claim 12, further comprising introducing an exogenous polynucleotide into the host cell.

14. The method of claim 13, wherein the exogenous polynucleotide is integrated into the genome of the host plant cell.

15. The method of any one of claims 12 to 14, wherein the sequence alteration is a mutation selected from the group consisting of: deletion of genetic material, insertion of genetic material, replacement of genetic material, and any combination thereof.

16. The method of any one of claims 11 to 14, wherein the target sequence is in the IPP2-K gene.

17. The method of claim 15, wherein the target sequence is in the IPP2-K gene.

18. A method for reducing the level of phytic acid in plant seeds comprising inactivating or altering the IPP2-K gene by a method according to claim 16 or 17.

19. A method for making phosphorus more metabolically available in plant seeds, the method comprising inactivating or altering the IPP2-K gene by a method according to claim 16 or 17.