HK1134495B

HK1134495B - Polynucleotides encoding isoprenoid modifying enzymes and methods of use thereof

Info

Publication number: HK1134495B
Application number: HK10100169.3A
Authority: HK
Inventors: D.-K. Ro; K. Newman; E. M. Paradise; J. D. Keasling; M. Ouellet; R. Eachus; K. Ho; T. Ham
Original assignee: The Regents of the University of California
Priority date: 2005-07-05
Filing date: 2006-06-29
Publication date: 2014-05-16

Description

Polynucleotides encoding isoprenoid-modifying enzymes and methods of use thereof

Cross-Reference

This application claims priority to U.S. Provisional Patent Application No. 60/697,067, filed July 5, 2005, which is incorporated herein by reference in its entirety.

Technical Field

The invention relates to the field of production of isoprenoid compounds, in particular to an enzyme for modifying isoprenoid compounds.

Background

Isoprenoids constitute an extremely large and diverse group of natural products that share a common biosynthetic origin, namely a single metabolic precursor, isopentenyl diphosphate (IPP). At least 20,000 isoprenoids have been described. By definition, isoprenoids are made up of so-called isoprene (C5) units. The number of carbon atoms present in an isoprenoid is typically divisible by five (C5, C10, C15, C20, C25, C30, and C40), although irregular isoprenoids and polyterpenes have also been reported. Isoprenoid compounds are also referred to as "terpenes" or "terpenoids". Important members of the isoprenoids include the carotenoids, sesquiterpenes, diterpenes, and hemiterpenes. Carotenoids include, for example, lycopene, beta-carotene, and many other substances used as antioxidants. Sesquiterpenes include, for example, artemisinin, a compound with antimalarial activity. Diterpenes include, for example, taxol, a cancer chemotherapeutic drug.

Isoprenoids comprise a structurally diverse family of natural products. Within this family, terpenoids isolated from plants and other natural sources are used as commercial flavor and fragrance compounds, as well as pharmaceutical compounds such as antimalarial, antiviral, and anticancer drugs. Most terpenoids in use today are natural products or derivatives thereof. Many of the source organisms of these natural products (e.g., trees, marine invertebrates) are amenable neither to the large-scale cultivation needed to produce commercially viable quantities nor to genetic manipulation to increase the yield of, or to derivatize, these compounds. Thus, these natural products must be produced semi-synthetically from analogs or synthesized using conventional chemical synthetic methods. Moreover, many natural products have complex structures, so their synthesis is at present uneconomical or impossible. Such natural products must be extracted from natural sources such as trees, sponges, corals, and marine microorganisms, or produced synthetically or semi-synthetically from more abundant precursors. Extraction from natural sources is limited by the availability of the source, and synthetic or semi-synthetic production may suffer from low yields and/or high costs. These production problems and the limited availability of natural sources can restrict the commercial and clinical development of such products.

An important example of a sesquiterpene compound is artemisinin. Artemisinin is a highly potent antimalarial drug that is currently extracted from the plant Artemisia annua and used in combination therapies. Plant-derived artemisinin is expensive, and its availability depends on the weather and the political environment of the countries in which the plant is grown. Artemisinic acid is a key intermediate in artemisinin biosynthesis. Conversion of amorpha-4,11-diene to arteannuin by traditional chemical methods, an important step in the preparation of artemisinin, is difficult and costly.

There is a need in the art for a process for producing isoprenoid compounds that avoids some of the above-mentioned disadvantages. The present invention fills this need by providing polynucleotides encoding enzymes that modify isoprenoid compounds and host cells that are genetically modified to produce such enzymes.

Literature

Bertea et al. (2005) Planta Med. 71:40-47; de Kraker et al. (2003) Tetrahedron 59:409-418; Martin et al. (2003) Nat. Biotechnol. 21:796-802; WO 03/025193; U.S. Patent Publication No. 20050019882; U.S. Patent Publication No. 20030148479; U.S. Patent Publication No. 20040005678; U.S. Patent Publication No. 20030166255.

Summary of the Invention

The invention provides isolated nucleic acids comprising nucleotide sequences encoding isoprenoid-modifying enzymes, and recombinant vectors containing the same. The invention also provides genetically modified host cells comprising a nucleic acid or recombinant vector of the invention. The invention also provides transgenic plants comprising a nucleic acid of the invention. The invention also provides methods of producing an isoprenoid compound, the methods generally comprising culturing a subject genetically modified host cell under conditions that permit synthesis of an enzyme encoded by a subject nucleic acid, the enzyme capable of modifying an isoprenoid compound.

Brief Description of the Drawings

FIG. 1 depicts the nucleotide sequence of the cDNA coding sequence for CYP71D-A4 (SEQ ID NO: 1).

FIG. 2 depicts the amorphadiene 12-oxidase amino acid sequence (SEQ ID NO: 2).

FIG. 3 depicts the nucleotide sequence of the Artemisia annua cytochrome P450 reductase cDNA coding region (SEQ ID NO: 3).

FIG. 4 depicts the amino acid sequence of Artemisia annua cytochrome P450 reductase (SEQ ID NO: 4).

FIGS. 5A-C depict the results of in vivo substrate feeding experiments.

FIGS. 6A and 6B depict product validation by GC-MS.

FIGS. 7A-C depict de novo production of artemisinic acid in yeast.

FIGS. 8A-C depict in vitro amorphadiene oxidase assays.

FIG. 9 depicts the nucleotide sequence of a cDNA encoding an isoprenoid-modifying enzyme (clone 71D-B1) (SEQ ID NO: 5).

FIG. 10 depicts the amino acid sequence of an isoprenoid-modifying enzyme (71D-B1; SEQ ID NO: 6).

FIGS. 11A-C depict the hydroxylation activity of enzyme 71D-B1.

FIG. 12 depicts the nucleotide sequence of genomic DNA encoding an isoprenoid-modifying enzyme (SEQ ID NO: 7).

FIG. 13 is a schematic diagram of isoprenoid metabolic pathways that produce the isoprenoid biosynthetic pathway intermediates, the polyprenyl diphosphates geranyl diphosphate (GPP), farnesyl diphosphate (FPP), and geranylgeranyl diphosphate (GGPP), from isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP).

FIG. 14 is a schematic of the Mevalonate (MEV) pathway for the production of IPP.

FIG. 15 is a schematic diagram of the DXP pathway for producing IPP and dimethylallyl pyrophosphate (DMAPP).

Definitions

The terms "isoprenoid", "isoprenoid compound", "terpene", "terpene compound", "terpenoid", and "terpenoid compound" are used interchangeably herein. Isoprenoid compounds are made up of varying numbers of so-called isoprene (C5) units. The number of carbon atoms present in an isoprenoid is typically divisible by five (e.g., C5, C10, C15, C20, C25, C30, and C40). Irregular isoprenoids and polyterpenes have been reported and are also included in the definition of "isoprenoid". Isoprenoid compounds include, but are not limited to, monoterpenes, sesquiterpenes, diterpenes, triterpenes, and polyterpenes.
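The C5 rule in this definition can be illustrated with a short sketch. The class names attached to each carbon count (hemiterpene for C5 through tetraterpene for C40) follow conventional terpene nomenclature and are an assumption added here, as are the function name `classify_isoprenoid` and the labels it returns for other carbon counts; the text itself states only that carbon counts are typically divisible by five and that irregular isoprenoids exist.

```python
# Conventional class names keyed by carbon count. These names are standard
# nomenclature added for illustration; the definition above lists only the
# carbon counts C5 through C40.
TERPENE_CLASSES = {
    5: "hemiterpene",
    10: "monoterpene",
    15: "sesquiterpene",
    20: "diterpene",
    25: "sesterterpene",
    30: "triterpene",
    40: "tetraterpene",
}


def classify_isoprenoid(carbon_count: int) -> str:
    """Label an isoprenoid by its carbon count using the C5-unit rule."""
    if carbon_count <= 0:
        raise ValueError("carbon_count must be positive")
    if carbon_count in TERPENE_CLASSES:
        return TERPENE_CLASSES[carbon_count]
    if carbon_count % 5 == 0:
        # Divisible by five but without a conventional name in the table.
        return f"regular isoprenoid ({carbon_count // 5} C5 units)"
    # Not a whole number of C5 units: an "irregular" isoprenoid.
    return "irregular isoprenoid"
```

For instance, `classify_isoprenoid(15)` labels a C15 compound such as amorpha-4,11-diene as a sesquiterpene, while a C14 compound falls into the irregular category.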

As used herein, the terms "prenyl diphosphate" and "prenyl pyrophosphate" are used interchangeably and include monoprenyl diphosphates containing a single prenyl group (e.g., IPP and DMAPP), as well as polyprenyl diphosphates containing two or more prenyl groups. Monoprenyl diphosphates include isopentenyl pyrophosphate (IPP) and its isomer dimethylallyl pyrophosphate (DMAPP).

The term "terpene synthase" as used herein refers to any enzyme that enzymatically modifies IPP, DMAPP, or a polyprenyl pyrophosphate to produce a terpenoid compound. The term "terpene synthase" includes enzymes that catalyze the conversion of a prenyl diphosphate to an isoprenoid.

The term "pyrophosphate" is used interchangeably herein with "diphosphate". Thus, for example, the terms "prenyl diphosphate" and "prenyl pyrophosphate" are interchangeable; the terms "isopentenyl pyrophosphate" and "isopentenyl diphosphate" are interchangeable; the terms "farnesyl diphosphate" and "farnesyl pyrophosphate" are interchangeable; and so on.

The term "mevalonate pathway" or "MEV pathway" as used herein refers to the biosynthetic pathway that converts acetyl-CoA to IPP. The mevalonate pathway comprises enzymes that catalyze the following steps: (a) condensing two molecules of acetyl-CoA to acetoacetyl-CoA; (b) condensing acetoacetyl-CoA with acetyl-CoA to form HMG-CoA; (c) converting HMG-CoA to mevalonate; (d) phosphorylating mevalonate to mevalonate 5-phosphate; (e) converting mevalonate 5-phosphate to mevalonate 5-pyrophosphate; and (f) converting mevalonate 5-pyrophosphate to isopentenyl pyrophosphate. FIG. 14 schematically illustrates the mevalonate pathway. The "upper half" of the mevalonate pathway refers to the enzymes responsible for converting acetyl-CoA to mevalonate through MEV pathway intermediates.
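As a minimal sketch, steps (a) through (f) can be written as an ordered table of substrates and products, which makes the "upper half" (acetyl-CoA to mevalonate) easy to pick out. The names `MEV_STEPS` and `trace_pathway` are hypothetical and introduced only for illustration; the compounds and step labels are those listed in the definition above.

```python
# Each entry: (step label, substrates, product), in the order given above.
MEV_STEPS = [
    ("a", ("acetyl-CoA", "acetyl-CoA"), "acetoacetyl-CoA"),
    ("b", ("acetoacetyl-CoA", "acetyl-CoA"), "HMG-CoA"),
    ("c", ("HMG-CoA",), "mevalonate"),
    ("d", ("mevalonate",), "mevalonate 5-phosphate"),
    ("e", ("mevalonate 5-phosphate",), "mevalonate 5-pyrophosphate"),
    ("f", ("mevalonate 5-pyrophosphate",), "isopentenyl pyrophosphate"),
]


def trace_pathway(steps, start, end):
    """Return the labels of the steps that lead from `start` to `end`.

    The steps are walked in their listed order; a step is taken when the
    current compound is one of its substrates.
    """
    current = start
    labels = []
    for label, substrates, product in steps:
        if current in substrates:
            labels.append(label)
            current = product
            if current == end:
                return labels
    raise ValueError(f"no route from {start!r} to {end!r}")


# The "upper half" of the pathway: acetyl-CoA to mevalonate.
upper_half = trace_pathway(MEV_STEPS, "acetyl-CoA", "mevalonate")
```

Tracing from acetyl-CoA to isopentenyl pyrophosphate instead returns all six step labels.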

The term "1-deoxy-D-xylulose 5-diphosphate pathway" or "DXP pathway" as used herein refers to a pathway that converts glyceraldehyde 3-phosphate and pyruvate to IPP and DMAPP through DXP pathway intermediates, wherein the DXP pathway comprises enzymes that catalyze the reactions schematically illustrated in FIG. 15.

As used herein, the term "prenyltransferase" is used interchangeably with the terms "prenyl diphosphate synthase" and "polyprenyl synthase" (e.g., "GPP synthase", "FPP synthase", "OPP synthase", etc.) and refers to an enzyme that catalyzes the consecutive 1'-4 condensation of isopentenyl diphosphate with allylic primer substrates, resulting in the formation of prenyl diphosphates of various chain lengths.

The terms "polynucleotide" and "nucleic acid" are used interchangeably herein to refer to a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides. Thus, the term includes, but is not limited to, single-, double-, or multi-stranded DNA or RNA, genomic DNA, cDNA, DNA-RNA hybrids, and polymers comprising purine and pyrimidine bases or other natural, chemically or biochemically modified, non-natural, or derivatized nucleotide bases.

The terms "peptide", "polypeptide", and "protein" are used interchangeably herein to refer to a polymeric form of amino acids of any length, which can include coded and non-coded amino acids, chemically or biochemically modified or derivatized amino acids, and polypeptides having modified peptide backbones.

The term "naturally occurring" as used herein applies to a nucleic acid, cell, or organism, and refers to a nucleic acid, cell, or organism that is found in nature. For example, a polypeptide or polynucleotide sequence present in an organism (including viruses) that can be isolated from a natural source and not intentionally modified by a laboratory worker is naturally occurring.

The term "isolated" as used herein refers to a polynucleotide, polypeptide or cell in an environment different from the environment in which the polynucleotide, polypeptide or cell naturally occurs. The isolated genetically modified host cell may be present in a mixed population of genetically modified host cells.

The term "exogenous nucleic acid" as used herein refers to a nucleic acid that is not normally or naturally found in and/or produced by a given bacterium, organism, or cell in nature. The term "endogenous nucleic acid" as used herein refers to a nucleic acid that is normally found in and/or produced by a given bacterium, organism, or cell in nature. An "endogenous nucleic acid" is also referred to as a "native nucleic acid" or a nucleic acid that is "native" to a given bacterium, organism, or cell. For example, nucleic acids encoding HMGS, mevalonate kinase (MK), and phosphomevalonate kinase (PMK) are exogenous to E. coli. These mevalonate pathway nucleic acids can be cloned from Saccharomyces cerevisiae. In S. cerevisiae, the chromosomal gene sequences encoding HMGS, MK, and PMK are "endogenous" nucleic acids.

The term "heterologous nucleic acid" as used herein refers to a nucleic acid that satisfies at least one of the following conditions: (a) the nucleic acid is foreign ("exogenous") to (i.e., not naturally found in) a given host microorganism or host cell; (b) the nucleic acid comprises a nucleotide sequence that is naturally found in (e.g., is "endogenous" to) a given host microorganism or host cell, but either is produced in the cell in an unnatural amount (e.g., greater than expected or greater than naturally found), or differs in sequence from the endogenous nucleotide sequence such that the same encoded protein (having the same or substantially the same amino acid sequence) as found endogenously is produced in the cell in an unnatural amount (e.g., greater than expected or greater than naturally found); (c) the nucleic acid comprises two or more nucleotide sequences or segments that are not found in the same relationship to each other in nature, e.g., the nucleic acid is recombinant.

The term "recombinant" as used herein refers to a particular nucleic acid (DNA or RNA) that is the product of various combinations of cloning, restriction and/or ligation steps that result in a construct having structural coding or non-coding sequences that are distinguishable from endogenous nucleic acids found in a natural system. In general, a DNA sequence encoding a structural coding sequence can be assembled from a cDNA fragment and a short oligonucleotide linker, or from a series of synthetic oligonucleotides, to provide a synthetic nucleic acid capable of being expressed from a recombinant transcriptional unit contained in a cellular or cell-free transcription and translation system. Such sequences may be provided in the form of an open reading frame that is not interrupted by internal untranslated sequences or introns (typically present in eukaryotic genes). Genomic DNA comprising related sequences may also be used to form recombinant genes or transcriptional units. Untranslated DNA sequences may be present at the 5 'or 3' end of an open reading frame, which do not interfere with the manipulation or expression of the coding region, and which may in fact be used to modulate the production of the desired product by a variety of mechanisms (see "DNA regulatory sequences" below).

Thus, for example, the term "recombinant" polynucleotide or nucleic acid refers to one that is not naturally occurring, e.g., one made by the artificial combination of two otherwise separated segments of sequence through human intervention. This artificial combination is often accomplished by chemical synthesis or by the artificial manipulation of isolated nucleic acid segments, e.g., by genetic engineering techniques. This is usually done to replace a codon with a redundant codon encoding the same or a conservative amino acid, typically while introducing or removing a sequence recognition site. Alternatively, it is performed to join together nucleic acid segments of desired functions to generate a desired combination of functions.

"construct" refers to a recombinant nucleic acid, usually recombinant DNA, which is produced for the purpose of expressing a particular nucleotide sequence, or with which other recombinant nucleotide sequences are constructed.

As used herein, the terms "operon" and "single transcription unit" are used interchangeably to refer to two or more contiguous coding regions (nucleotide sequences encoding gene products such as RNA or proteins) that are coordinately regulated by one or more control elements (e.g., a promoter). The term "gene product" as used herein refers to RNA encoded by DNA (and vice versa) or protein encoded by RNA or DNA, wherein a gene typically comprises one or more nucleotide sequences encoding a protein and may also include introns and other non-coding nucleotide sequences.

The terms "DNA regulatory sequence", "control element" and "regulatory element" are used interchangeably herein to refer to transcriptional or translational control sequences, such as promoters, enhancers, polyadenylation signals, terminators, protein degradation signals, and the like, that provide and/or regulate expression of a coding sequence and/or production of an encoded polypeptide in a host cell.

The terms "transformation" and "genetic modification" are used interchangeably herein to refer to a permanent or transient genetic change induced in a cell following the introduction of new nucleic acid (i.e., DNA exogenous to the cell). Genetic change ("modification") can be accomplished either by incorporating the new DNA into the genome of the host cell, or by transient or stable maintenance of the new DNA as an episomal element. Where the cell is a eukaryotic cell, a permanent genetic change is generally achieved by introducing the DNA into the genome of the cell. In prokaryotic cells, permanent changes can be introduced into the chromosome or via extrachromosomal elements such as plasmids and expression vectors, which may contain one or more selectable markers to aid in their maintenance in the recombinant host cell. Suitable methods of genetic modification include viral infection, transfection, conjugation, protoplast fusion, electroporation, particle gun technology, calcium phosphate precipitation, direct microinjection, and the like. The choice of method generally depends on the type of cell being transformed and the circumstances under which the transformation takes place (i.e., in vitro, ex vivo, or in vivo). A general discussion of these methods can be found in Ausubel et al., Short Protocols in Molecular Biology, 3rd ed., Wiley & Sons, 1995.

"Operably linked" refers to a juxtaposition wherein the components so described are in a relationship permitting them to function in their intended manner. For example, a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression of the coding sequence. The terms "heterologous promoter" and "heterologous control region" as used herein refer to promoters and other control regions that are not normally associated with a particular nucleic acid in nature. For example, a "transcriptional control region heterologous to a coding region" is a transcriptional control region that is not normally associated with the coding region in nature.

As used herein, "host cell" refers to an in vivo or in vitro eukaryotic cell, prokaryotic cell, or cell from a multicellular organism (e.g., a cell line) cultured as a unicellular entity, where the eukaryotic or prokaryotic cell can be, or has been, used as a recipient for a nucleic acid (e.g., an expression vector comprising a nucleotide sequence encoding one or more biosynthetic pathway gene products, such as mevalonate pathway gene products), and includes the progeny of the original cell that has been genetically modified with the nucleic acid. It is understood that the progeny of a single cell may not necessarily be completely identical in morphology or in genomic or total DNA complement to the original parent, owing to natural, accidental, or deliberate mutation. A "recombinant host cell" (also referred to as a "genetically modified host cell") is a host cell into which a heterologous nucleic acid, e.g., an expression vector, has been introduced. For example, a prokaryotic host cell becomes a genetically modified prokaryotic host cell (e.g., a bacterium) by virtue of introduction into a suitable prokaryotic host cell of a heterologous nucleic acid, e.g., an exogenous nucleic acid that is foreign to (not normally found in nature in) the prokaryotic host cell, or a recombinant nucleic acid that is not normally found in the prokaryotic host cell; and a eukaryotic host cell becomes a genetically modified eukaryotic host cell by virtue of introduction into a suitable eukaryotic host cell of a heterologous nucleic acid, e.g., an exogenous nucleic acid that is foreign to the eukaryotic host cell, or a recombinant nucleic acid that is not normally found in the eukaryotic host cell.

A nucleic acid is "hybridizable" to another nucleic acid (e.g., cDNA, genomic DNA, or RNA) when a single-stranded form of the nucleic acid can anneal to the other nucleic acid under appropriate conditions of temperature and solution ionic strength. Hybridization and washing conditions are well known; see, e.g., Sambrook, J., Fritsch, E. F. and Maniatis, T., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, particularly Chapter 11 and Table 11.1 therein; and Sambrook, J. and Russell, W., Molecular Cloning: A Laboratory Manual, 3rd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor (2001). The conditions of temperature and ionic strength determine the "stringency" of the hybridization. Stringency conditions can be adjusted to screen for anything from moderately similar fragments, such as homologous sequences from distantly related organisms, to highly similar fragments, such as genes that duplicate functional enzymes from closely related organisms. Hybridization conditions and post-hybridization washes together determine the stringency obtained. One illustrative set of post-hybridization washes is a series starting with 6× SSC (1× SSC is 0.15 M NaCl and 15 mM citrate buffer), 0.5% SDS at room temperature for 15 minutes, then a wash with 2× SSC, 0.5% SDS at 45 °C for 30 minutes, and then two washes with 0.2× SSC, 0.5% SDS at 50 °C for 30 minutes each. More stringent conditions are obtained at higher temperatures: the washes are identical to those above except that the temperature of the final two 30-minute washes in 0.2× SSC, 0.5% SDS is raised to 60 °C. Another set of highly stringent conditions uses 0.1× SSC, 0.1% SDS at 65 °C for the final two washes. Another example of stringent hybridization conditions is hybridization at 50 °C or higher in 0.1× SSC (15 mM sodium chloride/1.5 mM sodium citrate). Another example of stringent hybridization conditions is overnight incubation at 42 °C in a solution of 50% formamide, 5× SSC (1× SSC is 150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5× Denhardt's solution, 10% dextran sulfate, and 20 µg/ml denatured, sheared salmon sperm DNA, followed by washing the filters in 0.1× SSC at about 65 °C. Stringent hybridization conditions and post-hybridization wash conditions are conditions that are at least as stringent as the representative conditions above.
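The SSC strengths quoted in these wash conditions are multiples of a 1× stock, which the text gives as 0.15 M NaCl and 15 mM citrate. The following sketch simply scales that definition to the wash strengths mentioned above; the names `SSC_1X_MM` and `ssc_concentrations` are hypothetical helpers, not part of any protocol cited here.

```python
# Composition of 1x SSC in millimolar, as stated in the text
# (0.15 M NaCl, 15 mM citrate).
SSC_1X_MM = {"NaCl": 150.0, "citrate": 15.0}


def ssc_concentrations(fold: float) -> dict:
    """Return the NaCl and citrate concentrations (mM) of `fold`x SSC."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {solute: mm * fold for solute, mm in SSC_1X_MM.items()}


# SSC strengths that appear in the wash conditions above. Lower salt,
# like higher temperature, corresponds to higher stringency.
for fold in (6.0, 2.0, 0.2, 0.1):
    salts = ssc_concentrations(fold)
    print(f"{fold}x SSC: {salts['NaCl']:.1f} mM NaCl, {salts['citrate']:.2f} mM citrate")
```

Scaling to 0.1× reproduces the 15 mM sodium chloride and 1.5 mM sodium citrate quoted in the text for that wash.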

Hybridization requires that the two nucleic acids contain complementary sequences, although, depending on the stringency of the hybridization, mismatches between bases are possible. The appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementarity, variables well known in the art. The greater the degree of similarity or homology between two nucleotide sequences, the higher the melting temperature (Tm) of hybrids of nucleic acids having those sequences. The relative stability of nucleic acid hybrids (corresponding to higher Tm) decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA. For hybrids greater than 100 nucleotides in length, equations for calculating Tm have been derived (see Sambrook et al., supra, 9.50-9.51). For hybridization of shorter nucleic acids, i.e., oligonucleotides, the position of mismatches becomes more important, and the length of the oligonucleotide determines its specificity (see Sambrook et al., supra, 11.7-11.8). Typically, the length of a hybridizable nucleic acid is at least about 10 nucleotides. Illustrative minimum lengths for a hybridizable nucleic acid are: at least about 15 nucleotides; at least about 20 nucleotides; and at least about 30 nucleotides. Furthermore, one skilled in the art will recognize that the temperature and wash solution salt concentration may be adjusted as necessary according to factors such as probe length.
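The text points to Sambrook et al. for the Tm equations without reproducing them. As a hedged illustration only, the sketch below uses a widely quoted empirical approximation for DNA:DNA hybrids longer than about 100 nucleotides, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%G+C) - 0.63(%formamide) - 600/L. The coefficients are an assumption taken from that general form rather than from this document, and the function name is hypothetical.

```python
import math


def melting_temperature_c(na_molar: float, gc_percent: float,
                          length_nt: int, formamide_percent: float = 0.0) -> float:
    """Approximate Tm (deg C) of a long DNA:DNA hybrid.

    Assumed empirical form (not stated in this document):
        Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%G+C) - 0.63*(%formamide) - 600/L
    where [Na+] is molar and L is the hybrid length in nucleotides.
    """
    if na_molar <= 0 or length_nt <= 0:
        raise ValueError("na_molar and length_nt must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / length_nt)
```

Under this approximation, lowering the salt concentration or adding formamide lowers the Tm, which is consistent with the text's use of dilute SSC and 50% formamide to raise stringency at a fixed temperature.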

The term "conservative amino acid substitution" refers to the interchangeability in proteins of amino acid residues having similar side chains. For example, a group of amino acids having aliphatic side chains consists of glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains consists of serine and threonine; a group of amino acids having amide-containing side chains consists of asparagine and glutamine; a group of amino acids having aromatic side chains consists of phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains consists of lysine, arginine, and histidine; and a group of amino acids having sulfur-containing side chains consists of cysteine and methionine. Exemplary conservative amino acid substitution groups are: valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine, alanine-valine, and asparagine-glutamine.
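The side-chain groups in this definition translate directly into a lookup. The sketch below treats a substitution as conservative when both residues fall in the same listed group; the one-letter codes, the dictionary name, and the function name are conveniences introduced here, and amino acids not listed in the definition (for example the acidic residues) are deliberately left out.

```python
# Side-chain groups exactly as listed in the definition, in one-letter code.
CONSERVATIVE_GROUPS = {
    "aliphatic": set("GAVLI"),
    "aliphatic-hydroxyl": set("ST"),
    "amide-containing": set("NQ"),
    "aromatic": set("FYW"),
    "basic": set("KRH"),
    "sulfur-containing": set("CM"),
}


def is_conservative_substitution(original: str, replacement: str) -> bool:
    """True if both residues belong to the same side-chain group above."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group
               for group in CONSERVATIVE_GROUPS.values())
```

All of the exemplary pairs named in the text (valine-leucine, phenylalanine-tyrosine, lysine-arginine, alanine-valine, asparagine-glutamine) pass this check, whereas a lysine-to-aspartate change does not.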

"Synthetic nucleic acids" can be assembled from oligonucleotide building blocks that are chemically synthesized using procedures known to those skilled in the art. These building blocks are ligated and annealed to form gene segments, which are then enzymatically assembled to construct the entire gene. "Chemically synthesized", as related to a sequence of DNA, means that the component nucleotides were assembled in vitro. Manual chemical synthesis of DNA may be accomplished using well-established procedures, or automated chemical synthesis can be performed using one of a number of commercially available machines. The nucleotide sequence of the nucleic acid can be modified for optimal expression by optimizing it to reflect the codon bias of the host cell. The skilled artisan will appreciate that the likelihood of successful expression increases if codon usage is biased towards the codons favored by the host. Preferred codons can be determined from a survey of genes derived from the host cell, where sequence information is available.

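The codon-bias idea described for synthetic nucleic acids can be sketched as two steps: survey host coding sequences to find the most frequent codon for each amino acid, then recode a gene with those codons without changing the encoded protein. This is an illustration under simplifying assumptions (standard genetic code, most-frequent-codon choice, in-frame input); the function names and the toy host sequence in the usage note are hypothetical and do not come from the source.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def preferred_codons(host_cds_list) -> dict:
    """Most frequent codon per amino acid across the host coding sequences."""
    counts = Counter()
    for cds in host_cds_list:
        cds = cds.upper()
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1
    best = {}
    for codon, n in counts.items():
        aa = CODON_TABLE[codon]
        if aa not in best or n > counts[best[aa]]:
            best[aa] = codon
    return best


def recode(cds: str, preferred: dict) -> str:
    """Swap each codon for the host-preferred synonym, if one is known."""
    cds = cds.upper()
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        aa = CODON_TABLE[codon]  # KeyError on partial or non-ACGT codons
        out.append(preferred.get(aa, codon))
    return "".join(out)
```

For example, with a toy host survey `["ATGCTGCTGAAACTGTAA"]`, calling `recode("ATGTTAAAGTGA", preferred_codons(...))` swaps in the host's codons while `translate` gives the same protein before and after. Practical codon optimization usually weighs additional factors, which this sketch does not attempt to model.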
A polynucleotide or polypeptide has a certain percent "sequence identity" to another polynucleotide or polypeptide, meaning that, when the two sequences are aligned and compared, that percentage of bases or amino acids is the same and in the same relative position. Sequence similarity can be determined in a number of different ways. To determine sequence identity, sequences can be aligned using methods and computer programs, including BLAST, available on the NCBI internet site. See, e.g., Altschul et al. (1990), J. Mol. Biol. 215:403-10. Another alignment algorithm is FASTA, available in the Genetics Computing Group (GCG) package from Madison, Wis., USA, a wholly owned subsidiary of Oxford Molecular Group, Inc. Other alignment techniques are described in Methods in Enzymology, vol. 266: Computer Methods for Macromolecular Sequence Analysis (1996), ed. Doolittle, Academic Press, Inc., a division of Harcourt Brace & Co., San Diego, Calif., USA. Of particular interest are alignment programs that permit gaps in the sequence. Smith-Waterman is one type of algorithm that permits gaps in sequence alignments. See Meth. Mol. Biol. 70:173-187 (1997). The GAP program, which uses the Needleman and Wunsch alignment method, can also be used to align sequences. See J. Mol. Biol. 48:443-453 (1970).
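Once two sequences have been aligned by any of the programs named above, percent identity reduces to counting matching positions. The sketch below assumes the alignment has already been produced and that gaps are written as "-"; counting gap columns in the denominator is one common convention among several and is an assumption here, as is the function name.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; columns with a gap
    in only one sequence count toward the denominator as non-matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared
```

Because tools differ in whether gapped columns are included, identity values from this sketch may not match the figure reported by a given alignment program for the same pair of sequences.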

Before the present invention is further described, it is to be understood that this invention is not limited to particular embodiments described, as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting, since the scope of the present invention will be limited only by the appended claims.

Where a range of values is provided, it is understood that the invention includes each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or intervening value in that stated range. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges, and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, the invention also includes ranges excluding either or both of those limits.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present invention, the preferred methods and materials are now described. All publications mentioned herein are incorporated herein by reference to disclose and describe the methods and/or materials in connection with which the publications are cited.

It must be noted that, as used herein and in the appended claims, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "an isoprenoid-modifying enzyme" includes a plurality of such enzymes, reference to "a cytochrome P450 reductase" includes reference to one or more cytochrome P450 reductases and equivalents thereof known to those skilled in the art, and so forth. It is further noted that the claims may be drafted to exclude any optional element. Accordingly, this statement is intended to serve as antecedent basis for the use of such exclusive terminology as "solely," "only," and the like in connection with the recitation of claim elements, or for the use of a "negative" limitation.

The publications described herein are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that the present invention is not entitled to antedate such publication by virtue of prior invention. Moreover, the dates of publication provided may be different from the actual publication dates which may need to be independently confirmed.

Detailed Description

The invention provides isolated nucleic acids comprising nucleotide sequences encoding isoprenoid-modifying enzymes, and recombinant vectors comprising the same. The invention also provides genetically modified host cells comprising a nucleic acid or recombinant vector of the invention. The invention also provides transgenic plants comprising a nucleic acid of the invention. The invention also provides methods of producing an isoprenoid compound, the methods generally involving culturing a subject genetically modified host cell under conditions that permit synthesis of an isoprenoid compound-modifying enzyme encoded by a subject nucleic acid.

Nucleic acids, vectors and host cells

The present invention provides isolated nucleic acids comprising a nucleotide sequence encoding an enzyme that modifies an isoprenoid compound, referred to herein as an "isoprenoid-modifying enzyme". A nucleic acid of the invention comprising a nucleotide sequence encoding an isoprenoid-modifying enzyme is referred to as an "isoprenoid-modifying enzyme nucleic acid". In a specific embodiment, an isolated isoprenoid-modifying enzyme nucleic acid of the present invention comprises a nucleotide sequence encoding a cytochrome P450 monooxygenase. In particular embodiments, an isolated isoprenoid-modifying enzyme nucleic acid of the present invention comprises a nucleotide sequence encoding an isoprenoid oxidase. In some embodiments, an isolated isoprenoid-modifying enzyme nucleic acid of the present invention comprises a nucleotide sequence encoding a terpene hydroxylase. In some embodiments, an isolated isoprenoid-modifying enzyme nucleic acid of the present invention comprises a nucleotide sequence encoding a terpene oxidase. In some embodiments, an isolated isoprenoid-modifying enzyme nucleic acid of the present invention comprises a nucleotide sequence encoding a sesquiterpene oxidase. In some embodiments, an isolated isoprenoid-modifying enzyme nucleic acid of the present invention comprises a nucleotide sequence encoding a sesquiterpene hydroxylase.

NADPH-cytochrome P450 oxidoreductase (CPR, EC 1.6.2.4) is the redox partner of many P450-monooxygenases. The invention also provides an isolated nucleic acid comprising a nucleotide sequence encoding a Cytochrome P450 Reductase (CPR). A nucleic acid of the invention comprising a nucleotide sequence encoding a CPR is referred to as a "CPR nucleic acid". CPR encoded by CPR nucleic acids of the invention transfers electrons from NADPH to cytochrome P450. Typically, the CPR encoded by the CPR nucleic acid of the invention transfers electrons from NADPH to an isoprenoid-modifying enzyme, such as a sesquiterpene oxidase, encoded by the isoprenoid-modifying enzyme-encoding nucleic acid of the invention.

Nucleic acids encoding isoprenoid-modifying enzymes

In some embodiments, an isolated nucleic acid of the invention comprises a nucleotide sequence encoding a polypeptide having isoprenoid hydroxylase and/or isoprenoid oxidase activity. In some embodiments, an isolated nucleic acid of the invention comprises a nucleotide sequence encoding a cytochrome P450 monooxygenase. In some embodiments, an isolated nucleic acid of the invention comprises a nucleotide sequence encoding an isoprenoid hydroxylase. In some embodiments, an isolated nucleic acid of the invention comprises a nucleotide sequence encoding an isoprenoid oxidase. In some embodiments, an isolated nucleic acid of the invention comprises a nucleotide sequence encoding a polypeptide that catalyzes sequential hydroxylation and oxidation reactions, e.g., the polypeptide hydroxylates a terpene compound to produce a terpene alcohol, oxidizes the terpene alcohol to produce a terpene aldehyde, and oxidizes the terpene aldehyde to produce a terpene carboxylic acid. In some embodiments, an isolated nucleic acid of the invention comprises a nucleotide sequence encoding a polypeptide that catalyzes the hydroxylation and/or oxidation of an isopropenyl group of a terpene, e.g., the hydroxylation of an isopropenyl group of a monoterpene, diterpene, triterpene, sesquiterpene, or polyterpene. In some embodiments, an isolated nucleic acid of the invention comprises a nucleotide sequence encoding a monoterpene oxidase. In some embodiments, an isolated nucleic acid of the invention comprises a nucleotide sequence encoding a monoterpene hydroxylase. In some embodiments, an isolated nucleic acid of the invention comprises a nucleotide sequence encoding a polyterpene hydroxylase. In some embodiments, an isolated nucleic acid of the invention comprises a nucleotide sequence encoding a polyterpene oxidase. In some embodiments, an isolated nucleic acid of the invention comprises a nucleotide sequence encoding a diterpene hydroxylase. In some embodiments, an isolated nucleic acid of the invention comprises a nucleotide sequence encoding a diterpene oxidase. In some embodiments, an isolated nucleic acid of the invention comprises a nucleotide sequence encoding a triterpene hydroxylase. In some embodiments, an isolated nucleic acid of the invention comprises a nucleotide sequence encoding a triterpene oxidase. In some embodiments, an isolated nucleic acid of the invention comprises a nucleotide sequence encoding a sesquiterpene hydroxylase. In some embodiments, an isolated nucleic acid of the invention comprises a nucleotide sequence encoding a sesquiterpene oxidase. In some embodiments, an isolated nucleic acid of the invention comprises a nucleotide sequence encoding a sesquiterpene C12-hydroxylase. In some embodiments, an isolated nucleic acid of the invention comprises a nucleotide sequence encoding a polypeptide that catalyzes the C12 oxidation of a sesquiterpene. In some embodiments, an isolated nucleic acid of the invention comprises a nucleotide sequence encoding an amorphadiene 12-oxidase.
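The sequential conversion described above (terpene to terpene alcohol, to terpene aldehyde, to terpene carboxylic acid) can be followed as a change in molecular formula at each step. The following is a minimal sketch only, not part of the disclosed methods: it assumes a generic sesquiterpene hydrocarbon of formula C15H24 as the starting substrate, and the names `STEPS`, `oxidation_series`, and `nominal_mass` are hypothetical helpers introduced for illustration. Nominal (integer) masses are used, as they might be when reading a low-resolution mass spectrum.

```python
# Illustrative sketch: formula bookkeeping for a three-step oxidation series.
# Assumption: the starting substrate is a C15H24 sesquiterpene hydrocarbon.

# Net change in elemental composition for each step.
STEPS = [
    ("hydroxylation (terpene -> alcohol)", {"O": +1}),        # R-CH3  -> R-CH2OH
    ("oxidation (alcohol -> aldehyde)", {"H": -2}),           # R-CH2OH -> R-CHO
    ("oxidation (aldehyde -> carboxylic acid)", {"O": +1}),   # R-CHO  -> R-COOH
]

# Nominal masses of the most abundant isotopes.
NOMINAL_MASS = {"C": 12, "H": 1, "O": 16}


def oxidation_series(start):
    """Return [(step name, formula dict), ...] beginning with the substrate."""
    current = dict(start)
    series = [("substrate", dict(current))]
    for name, delta in STEPS:
        for element, change in delta.items():
            current[element] = current.get(element, 0) + change
        series.append((name, dict(current)))
    return series


def nominal_mass(formula):
    """Sum of nominal atomic masses for a formula given as {element: count}."""
    return sum(NOMINAL_MASS[element] * count for element, count in formula.items())


if __name__ == "__main__":
    for name, formula in oxidation_series({"C": 15, "H": 24}):
        print(f"{name}: {formula} nominal mass {nominal_mass(formula)}")
```

Under the stated assumption, each intermediate differs from its precursor by a fixed mass increment, which is one reason mass spectrometric analysis is convenient for distinguishing the alcohol, aldehyde, and acid products.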

The product of a terpene cyclase (also known as "terpene synthase") reaction is the so-called "terpene skeleton". In some embodiments, an isolated nucleic acid of the invention comprises a nucleotide sequence encoding an isoprenoid-modifying enzyme that catalyzes the hydroxylation and/or oxidation of a terpene scaffold or a downstream product thereof. Typically, the substrate of an isoprenoid-modifying enzyme encoded by a nucleic acid of the invention comprises a terpene backbone or a modified terpene backbone. In many embodiments, the substrate of an isoprenoid-modifying enzyme encoded by a nucleic acid of the present invention includes an isopropenyl group.

Monoterpene substrates of isoprenoid-modifying enzymes encoded by nucleic acids of the invention include, but are not limited to, any monoterpene substrate that yields an oxidation product that is a monoterpene compound or that is an intermediate in a biosynthetic pathway producing a monoterpene compound. Exemplary monoterpene substrates include, but are not limited to, monoterpene substrates belonging to the following families: acyclic monoterpenes, dimethyloctanes, menthanes, irregular monoterpenoids, eucalyptols, camphanes, isobornanes, monocyclic monoterpenes, pinanes, fenchynes, limonanes, caranes, ionones, iridanes, and cannabinoids. Exemplary monoterpene substrates, intermediates and products include, but are not limited to: limonene, citronellol, geraniol, menthol, perillyl alcohol, linalool, and ninhydrin.

Diterpene substrates of isoprenoid-modifying enzymes encoded by nucleic acids of the invention include, but are not limited to, any diterpene substrate that yields an oxidation product that is a diterpene compound or that is an intermediate in a biosynthetic pathway producing a diterpene compound. Exemplary diterpene substrates include, but are not limited to, diterpene substrates belonging to the following families: acyclic diterpenes, bicyclic diterpenes, monocyclic diterpenes, labdanes, clondanes (clinodanes), taxanes, tricyclic diterpenes, tetracyclic diterpenes, kaurenes, bayesian (beyeres), acridines (aiserenes), aphidicolins (aphidigins), quinoxyrins, gibberellins, macrocyclic diterpenes, and elizabethanes (Elizabethatrianes). Exemplary diterpene substrates, intermediates, and products include, but are not limited to: ricin, eleutherobin, paclitaxel, prostratin, and pseudopterosin.

Triterpene substrates of isoprenoid-modifying enzymes encoded by nucleic acids of the invention include, but are not limited to, any triterpene substrate that yields an oxidation product that is a triterpene compound or that is an intermediate in a biosynthetic pathway producing a triterpene compound. Exemplary triterpene substrates, intermediates and products include, but are not limited to: arbruside E, bruceantin, testosterone, progesterone, cortisone and digitoxin.

Sesquiterpene substrates of isoprenoid-modifying enzymes encoded by nucleic acids of the invention include, but are not limited to: the oxidation product produced is a sesquiterpene compound or any sesquiterpene substrate that is an intermediate of the biosynthetic pathway for producing a sesquiterpene compound. Exemplary sesquiterpene substrates include, but are not limited to: sesquiterpene substrates belonging to the following families: farnesane, monocyclic sesquiterpene, bicyclic farnesane, bicycloalkane (Bisbolanes), santaloane, carbopol (Cupranes), phyllanthane (Herbertanes), pelargane (gymnositranes), trichothecene, chamabrane (Chamigranes), daucane, calaane, ansitastine (Antisantins), cadinane, sesquiterpene ketone (Oplopananes), piperane (Copaanes), picrotoxane (Picrotoxanes), cedrane, longifolan (Longiciclines), syringane, Modhepanenes (Modhephanes), silvestane (Siphierfofoanes), humulane, holophyllane (Intergrifolines), lipiane (lipianes), protoporphane (Protoilanes), cryptomelans (Hipandalanes), polybotrys (Illicularines), and mixtures (Illicinones)utanes), lactucanes (lactananes), stewartianes (Sterpuranes), fulvalene (fomanosanes), malanane (Marasmanes), germacrane, elemene, eudesmane, beckiane (Bakkanes), meroxane (Chilosyphanes), guaiane, pseudoguaiane, tricyclic sesquiterpenes, patchouliane, trihydridaliane, and mixtures thereofAlkanes (trioxanes), vanillin (aromadranes), homoalkanes (gordonanes), nalidones (Nardosinanes), braziranes (Brasilanes), mostanes (Pinguisanes), sesquipinanes (sequipinanes), sesquicamphanes (sequicamphanes), thujane, bicycloheptanes, allisanes (Alliacanes), stevensane (sterkuranes), lacrimane, afficines (Afficanes), holophyllanes, protoillonanes (protoilludines), aristolochianes, and neolananes (Neolemnanes). 
Exemplary sesquiterpene substrates include, but are not limited to: amorphadiene, isolongifolene (allosylidene), (-) - α -trans-bergamotene (bergamotene), (-) - β -elemene, (+) -germacrene A, germacrene B, (+) - γ -gutylene, (+) -tubalene, decahydrodimethylmethylvinylnaphthol (neointemedenol), (+) - β -cnidium, and (+) -valencene.

Whether a nucleic acid of the invention encodes a terpene oxidase or a terpene hydroxylase can readily be determined using standard assays for these enzymatic activities with an appropriate substrate. The products of enzymatic modification are typically analyzed by gas chromatography-mass spectrometry. Whether a nucleic acid of the invention encodes a sesquiterpene oxidase or a sesquiterpene hydroxylase can likewise readily be determined using standard assays for these enzymatic activities. See, for example, U.S. Patent Publication No. 20050019882, incorporated herein by reference.

In some embodiments, the nucleic acid of the invention comprises the nucleotide sequence depicted in Figure 1 and set forth in SEQ ID NO: 1. In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence having at least about 45%, at least about 50%, at least about 55%, at least about 57%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99% nucleotide sequence identity to the nucleotide sequence set forth in SEQ ID NO: 1. In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence having 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, about 10-15, about 15-20, about 20-25, or about 25-50 nucleotide substitutions compared to the nucleotide sequence set forth in SEQ ID NO: 1.

In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence having at least about 45%, at least about 50%, at least about 55%, at least about 57%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99% nucleotide sequence identity to the nucleotide sequence set forth in SEQ ID NO: 1, wherein the nucleic acid encodes a polypeptide having terpene hydroxylase and/or terpene oxidase activity (e.g., sesquiterpene oxidase activity, sesquiterpene hydroxylase activity, etc.).
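To illustrate how a percent-identity threshold of the kind recited above might be checked, the following minimal sketch computes identity over two sequences that have already been aligned (gaps written as "-"). It is an illustration under stated assumptions, not the method used in the specification: the definition used here (matches divided by gap-free aligned columns) is only one of several conventions, the function names are hypothetical, and a real comparison would first produce the alignment with an established alignment program.

```python
# Illustrative sketch: percent identity over a pre-computed pairwise alignment.
# Assumption: identity = matching columns / columns where neither sequence has a gap.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity for two equal-length, gap-padded sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns under this convention
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free aligned positions to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold_percent: float) -> bool:
    """True if the two aligned sequences share at least threshold_percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


if __name__ == "__main__":
    # Toy sequences for illustration only; they are not taken from any SEQ ID NO.
    reference = "ATGCATGCAT"
    candidate = "ATGCATGCAA"
    print(percent_identity(reference, candidate))
    print(meets_identity_threshold(reference, candidate, 85.0))
```

The same function applies unchanged to aligned amino acid sequences, since it compares characters column by column.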

In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence having at least about 50%, at least about 55%, at least about 57%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99% nucleotide sequence identity to a stretch of at least about 500, at least about 600, at least about 700, at least about 800, at least about 900, at least about 1000, at least about 1100, at least about 1200, at least about 1300, at least about 1400, or at least about 1450 contiguous nucleotides of the nucleotide sequence set forth in SEQ ID NO: 1.

In some embodiments, the nucleic acid of the invention comprises at least about 500, at least about 600, at least about 700, at least about 800, at least about 900, at least about 1000, at least about 1100, at least about 1200, at least about 1300, at least about 1400, or at least about 1450 contiguous nucleotides of the nucleotide sequence set forth in SEQ ID NO: 1. In some embodiments, the nucleic acid of the invention comprises at least about 500, at least about 600, at least about 700, at least about 800, at least about 900, at least about 1000, at least about 1100, at least about 1200, at least about 1300, at least about 1400, or at least about 1450 contiguous nucleotides of the nucleotide sequence set forth in SEQ ID NO: 1, and encodes a polypeptide having terpene hydroxylase and/or terpene oxidase activity, e.g., sesquiterpene hydroxylase and/or oxidase activity.
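Whether a candidate sequence contains a given number of contiguous nucleotides of a reference sequence can be checked by finding the longest run the two sequences share exactly. The sketch below is one simple way to do this (a standard longest-common-substring dynamic program); it is an illustration only, the function names are hypothetical, and the toy sequences are not taken from any SEQ ID NO.

```python
# Illustrative sketch: longest exactly shared contiguous run between two sequences.

def longest_shared_run(candidate: str, reference: str) -> int:
    """Length of the longest contiguous stretch present in both sequences."""
    candidate = candidate.upper()
    reference = reference.upper()
    best = 0
    # previous[j] = length of the shared run ending at the previous candidate
    # position and at reference position j (1-based).
    previous = [0] * (len(reference) + 1)
    for c in candidate:
        current = [0] * (len(reference) + 1)
        for j, r in enumerate(reference, start=1):
            if c == r:
                current[j] = previous[j - 1] + 1
                if current[j] > best:
                    best = current[j]
        previous = current
    return best


def contains_contiguous(candidate: str, reference: str, minimum_length: int) -> bool:
    """True if candidate shares at least minimum_length contiguous residues with reference."""
    return longest_shared_run(candidate, reference) >= minimum_length


if __name__ == "__main__":
    print(longest_shared_run("TTTACGTACGGG", "CCACGTACGAA"))
```

Run time grows with the product of the two sequence lengths, which is acceptable for sequences of a few thousand nucleotides; longer inputs would call for an indexed approach.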

In some embodiments, a nucleic acid of the invention comprises a sequence that hybridizes under stringent hybridization conditions to a nucleic acid comprising SEQ ID NO: 1 or the complement thereof.

In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence encoding a polypeptide comprising the amino acid sequence depicted in Figure 2 and set forth in SEQ ID NO: 2. In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence encoding a polypeptide comprising an amino acid sequence having at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99% amino acid sequence identity to the amino acid sequence set forth in SEQ ID NO: 2. In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence encoding a polypeptide comprising an amino acid sequence having at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99% amino acid sequence identity to a stretch of at least about 50, at least about 75, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, or at least about 490 contiguous amino acids of the amino acid sequence set forth in SEQ ID NO: 2. In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence encoding a polypeptide comprising an amino acid sequence having 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, about 10-15, about 15-20, or about 20-25 conservative amino acid substitutions compared to the amino acid sequence set forth in SEQ ID NO: 2. In some embodiments, the encoded polypeptide has terpene hydroxylase and/or terpene oxidase activity. In some embodiments, the encoded polypeptide has sesquiterpene oxidase activity. In some embodiments, the encoded polypeptide catalyzes the oxidation of C12 of a sesquiterpene substrate. In other embodiments, the encoded polypeptide has sesquiterpene hydroxylase activity.
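Counting conservative amino acid substitutions between a reference polypeptide and a variant of equal length can be sketched as follows. This is an illustration under stated assumptions: this section does not define which substitutions count as conservative, so the side-chain groupings in `CONSERVATIVE_GROUPS` are one commonly used convention adopted here purely for the example, and the function name and toy sequences are hypothetical.

```python
# Illustrative sketch: classify substitutions as conservative or non-conservative.
# Assumption: residues in the same side-chain group below are interchangeable
# conservatively. Other groupings are in use; this one is for illustration only.

CONSERVATIVE_GROUPS = [
    set("GAVLI"),  # aliphatic
    set("ST"),     # aliphatic-hydroxyl
    set("NQ"),     # amide-containing
    set("FYW"),    # aromatic
    set("KRH"),    # basic
    set("CM"),     # sulfur-containing
]


def classify_substitutions(reference: str, variant: str):
    """Return (conservative, non_conservative) substitution counts.

    Both sequences must be the same length; insertions and deletions are
    outside the scope of this sketch.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    conservative = 0
    non_conservative = 0
    for ref_residue, var_residue in zip(reference.upper(), variant.upper()):
        if ref_residue == var_residue:
            continue
        if any(ref_residue in group and var_residue in group for group in CONSERVATIVE_GROUPS):
            conservative += 1
        else:
            non_conservative += 1
    return conservative, non_conservative


if __name__ == "__main__":
    # Toy peptides: K->R and L->I fall within a group; S->D does not.
    print(classify_substitutions("MKVLST", "MRVIDT"))
```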

In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence encoding a polypeptide comprising an amino acid sequence having at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% amino acid sequence identity to a stretch of at least about 50, at least about 75, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, or at least about 490 contiguous amino acids of the amino acid sequence set forth in SEQ ID NO: 2. In some embodiments, the encoded polypeptide has terpene hydroxylase and/or terpene oxidase activity. In some embodiments, the encoded polypeptide has sesquiterpene oxidase activity. In some embodiments, the encoded polypeptide catalyzes the oxidation of C12 of a sesquiterpene substrate. In other embodiments, the encoded polypeptide has sesquiterpene hydroxylase activity.

In some embodiments, the nucleic acid of the invention comprises the nucleotide sequence depicted in Figure 9 and set forth in SEQ ID NO: 5. In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence having at least about 45%, at least about 50%, at least about 55%, at least about 57%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99% nucleotide sequence identity to the nucleotide sequence set forth in SEQ ID NO: 5. In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence having 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, about 10-15, about 15-20, about 20-25, or about 25-50 nucleotide substitutions compared to the nucleotide sequence set forth in SEQ ID NO: 5.

In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence having at least about 45%, at least about 50%, at least about 55%, at least about 57%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99% nucleotide sequence identity to the nucleotide sequence set forth in SEQ ID NO: 5, wherein the nucleic acid encodes a polypeptide having terpene hydroxylase and/or terpene oxidase activity (e.g., sesquiterpene oxidase activity, sesquiterpene hydroxylase activity, etc.).

In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence having at least about 45%, at least about 50%, at least about 55%, at least about 57%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99% nucleotide sequence identity to a stretch of at least about 500, at least about 600, at least about 700, at least about 800, at least about 900, at least about 1000, at least about 1100, at least about 1200, at least about 1300, at least about 1400, or at least about 1450 contiguous nucleotides of the nucleotide sequence set forth in SEQ ID NO: 5.

In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence encoding a polypeptide comprising an amino acid sequence having at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% amino acid sequence identity to a stretch of at least about 50, at least about 75, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, or at least about 480 contiguous amino acids of the amino acid sequence set forth in SEQ ID NO: 6. In many embodiments, the encoded polypeptide has terpene hydroxylase and/or terpene oxidase activity. In many embodiments, the encoded polypeptide has sesquiterpene oxidase or sesquiterpene hydroxylase activity. In many embodiments, the encoded polypeptide catalyzes the hydroxylation of a sesquiterpene substrate.

In some embodiments, the nucleic acid of the invention comprises at least about 500, at least about 600, at least about 700, at least about 800, at least about 900, at least about 1000, at least about 1100, at least about 1200, at least about 1300, at least about 1400, or at least about 1450 contiguous nucleotides of the nucleotide sequence set forth in SEQ ID NO: 5. In some embodiments, the nucleic acid of the invention comprises at least about 500, at least about 600, at least about 700, at least about 800, at least about 900, at least about 1000, at least about 1100, at least about 1200, at least about 1300, at least about 1400, or at least about 1450 contiguous nucleotides of the nucleotide sequence set forth in SEQ ID NO: 5, and encodes a polypeptide having terpene hydroxylase and/or oxidase activity, e.g., sesquiterpene oxidase activity, sesquiterpene hydroxylase activity, etc.

In some embodiments, a nucleic acid of the invention comprises a sequence that hybridizes under stringent hybridization conditions to a nucleic acid comprising SEQ ID NO: 5 or the complement thereof.

In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence encoding a polypeptide comprising the amino acid sequence depicted in Figure 9 and set forth in SEQ ID NO: 6. In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence encoding a polypeptide comprising an amino acid sequence having at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99% amino acid sequence identity to the amino acid sequence set forth in SEQ ID NO: 6. In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence encoding a polypeptide comprising an amino acid sequence having at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99% amino acid sequence identity to a stretch of at least about 50, at least about 75, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, or at least about 480 contiguous amino acids of the amino acid sequence set forth in SEQ ID NO: 6. In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence encoding a polypeptide comprising an amino acid sequence having 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, about 10-15, about 15-20, or about 20-25 conservative amino acid substitutions compared to the amino acid sequence set forth in SEQ ID NO: 6. In some embodiments, the encoded polypeptide has terpene hydroxylase and/or terpene oxidase activity. In some embodiments, the encoded polypeptide has sesquiterpene oxidase activity. In some embodiments, the encoded polypeptide catalyzes the hydroxylation of a sesquiterpene substrate. In other embodiments, the encoded polypeptide has sesquiterpene hydroxylase activity.

In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence encoding a polypeptide comprising an amino acid sequence having at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% amino acid sequence identity to a stretch of at least about 50, at least about 75, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, or at least about 480 contiguous amino acids of the amino acid sequence set forth in SEQ ID NO: 6. In some embodiments, the encoded polypeptide has terpene hydroxylase and/or terpene oxidase activity. In some embodiments, the encoded polypeptide has sesquiterpene oxidase activity. In some embodiments, the encoded polypeptide catalyzes the hydroxylation of a sesquiterpene substrate. In other embodiments, the encoded polypeptide has sesquiterpene hydroxylase activity.

In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence encoding a variant of a polypeptide comprising the amino acid sequence set forth in SEQ ID NO: 2 or SEQ ID NO: 6. For example, in some embodiments, a nucleic acid of the invention comprises a nucleotide sequence encoding an enzyme that, compared to an enzyme comprising the amino acid sequence set forth in SEQ ID NO: 2 or SEQ ID NO: 6, exhibits one or more of the following properties: 1) increased enzymatic activity; 2) increased stability in vitro and/or in vivo; 3) increased product yield; 4) altered protein turnover rate; 5) altered substrate specificity (e.g., such that the variant enzyme modifies a selected substrate); 6) increased enzyme efficiency (e.g., increased efficiency of converting a substrate to a product); and 7) increased solubility (e.g., in the cytoplasm or cytosol).

Nucleic acid encoding cytochrome P450 reductase

The present invention provides isolated nucleic acids comprising a nucleotide sequence encoding a Cytochrome P450 Reductase (CPR). In some embodiments, a CPR nucleic acid of the invention comprises a nucleotide sequence encoding a CPR that is capable of transferring electrons from NADPH to a cytochrome P450 oxidase encoded by an isoprenoid-modifying enzyme nucleic acid of the invention.

In some embodiments, the nucleic acid of the invention comprises the nucleotide sequence depicted in Figure 3 and set forth in SEQ ID NO:3. In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence having at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99% nucleotide sequence identity to the nucleotide sequence set forth in SEQ ID NO:3.

In some embodiments, a nucleic acid of the invention comprises a sequence that hybridizes under stringent hybridization conditions to a nucleic acid comprising SEQ ID NO:3 or the complement thereof.

In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence encoding a polypeptide comprising the amino acid sequence depicted in Figure 4 and set forth in SEQ ID NO:4. In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence encoding a polypeptide comprising an amino acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99% amino acid sequence identity to the amino acid sequence set forth in SEQ ID NO:4. In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence encoding a polypeptide comprising an amino acid sequence having 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, about 10-15, about 15-20, or about 20-25 conservative amino acid substitutions compared to the amino acid sequence set forth in SEQ ID NO:4.

In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence encoding a polypeptide comprising an amino acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% amino acid sequence identity to a stretch of at least about 50, at least about 75, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, at least about 500, at least about 550, at least about 600, at least about 650, or at least about 700 contiguous amino acids of the amino acid sequence set forth in SEQ ID NO:4. In some embodiments, the encoded polypeptide transfers electrons from NADPH to a polypeptide encoded by an isoprenoid-modifying enzyme nucleic acid of the present invention (e.g., an isoprenoid-modifying enzyme).

In some embodiments, the nucleic acid of the invention comprises at least about 500, at least about 600, at least about 700, at least about 800, at least about 900, at least about 1000, at least about 1100, at least about 1200, at least about 1300, at least about 1400, at least about 1500, at least about 1600, at least about 1700, at least about 1800, at least about 1900, at least about 2000, or at least about 2100 contiguous nucleotides of the nucleotide sequence set forth in SEQ ID NO:3. In some embodiments, the nucleic acid of the invention comprises at least about 500, at least about 600, at least about 700, at least about 800, at least about 900, at least about 1000, at least about 1100, at least about 1200, at least about 1300, at least about 1400, at least about 1500, at least about 1600, at least about 1700, at least about 1800, at least about 1900, at least about 2000, or at least about 2100 contiguous nucleotides of the nucleotide sequence set forth in SEQ ID NO:3, and encodes a polypeptide that is capable of transferring electrons from NADPH to a polypeptide (e.g., a cytochrome P450 oxidase) encoded by an isoprenoid-modifying enzyme nucleic acid of the present invention.

In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence encoding a variant of a polypeptide comprising the amino acid sequence set forth in SEQ ID NO:4. For example, in some embodiments, a nucleic acid of the invention comprises a nucleotide sequence encoding an enzyme that, compared to an enzyme comprising the amino acid sequence set forth in SEQ ID NO:4, exhibits one or more of the following properties: 1) increased enzymatic activity; 2) increased stability in vitro and/or in vivo; 3) increased product yield; 4) altered protein turnover rate; 5) altered substrate specificity (e.g., such that the variant enzyme modifies a selected substrate); 6) increased enzyme efficiency (e.g., increased efficiency of converting a substrate to a product); and 7) increased solubility (e.g., in the cytoplasm or cytosol).

In some embodiments, a nucleic acid of the invention comprises a nucleotide sequence encoding a fusion protein comprising an amino acid sequence of an isoprenoid-modifying enzyme having terpene hydroxylase and/or terpene oxidase activity (as described above) fused to a heterologous polypeptide ("fusion partner"), e.g., a polypeptide other than an isoprenoid-modifying enzyme as described above. In some embodiments, the nucleic acid of the invention comprises a nucleotide sequence encoding a fusion protein comprising the amino acid sequence of the CPR described above and a heterologous polypeptide, e.g., a polypeptide other than CPR. Suitable fusion partners include, but are not limited to: a polypeptide that increases the solubility of an isoprenoid-modifying enzyme or CPR; polypeptides that provide a detectable signal (e.g., fluorescent proteins; enzymes that produce a detectable product such as beta-galactosidase, luciferase, horseradish peroxidase, etc.); polypeptides that localize isoprenoid-modifying enzymes or CPR in specific cellular compartments (e.g., cytosol, cytoplasm, etc.); and so on.

In some embodiments, a nucleic acid of the invention comprises a nucleotide sequence encoding an isoprenoid-modifying enzyme (e.g., a polypeptide having terpene hydroxylase and/or terpene oxidase activity) and a CPR. In some embodiments, a nucleic acid of the invention comprises a nucleotide sequence encoding a fusion protein comprising an amino acid sequence of an isoprenoid-modifying enzyme having terpene hydroxylase and/or terpene oxidase activity (as described above) fused to a CPR polypeptide. In some embodiments, the encoded fusion protein has the general formula NH₂-A-X-B-COOH, wherein A is an isoprenoid-modifying enzyme having terpene hydroxylase and/or terpene oxidase activity, X is an optional linker, and B is a CPR polypeptide. In some embodiments, the encoded fusion protein has the general formula NH₂-A-X-B-COOH, wherein A is a CPR polypeptide, X is an optional linker, and B is an isoprenoid-modifying polypeptide having terpene hydroxylase and/or terpene oxidase activity.

Linker peptides can have various amino acid sequences. Proteins may be joined by a spacer peptide, generally of a flexible nature, although other chemical linkages are not excluded. The linker may be a cleavable linker. Suitable linker sequences are generally peptides of about 5 to 50 amino acids, or about 6 to 25 amino acids, in length. Peptide linkers with a degree of flexibility are generally used. The linker peptide can have almost any amino acid sequence, bearing in mind that preferred linkers have a sequence that results in a generally flexible peptide. Small amino acids such as glycine and alanine can be used to produce flexible peptides. Such sequences can readily be designed by those skilled in the art. A variety of different linkers are available and are suitable for use in the present invention.

Suitable linker peptides often include amino acid sequences rich in alanine and proline residues, which are known to impart flexibility to a protein structure. Exemplary linkers contain a combination of glycine, alanine, proline, and methionine residues, such as AAAGGM (SEQ ID NO: 8); AAAGGMPPAAAGGM (SEQ ID NO: 9); AAAGGM (SEQ ID NO: 10); and PPAAGGM (SEQ ID NO: 11). Other exemplary linker peptides include IEGR (SEQ ID NO: 12) and GGKGGK (SEQ ID NO: 13). However, any flexible linker generally about 5 to 50 amino acids in length can be employed.
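The general formula NH₂-A-X-B-COOH can be illustrated with a minimal Python sketch that joins two partner sequences through an optional linker, in either orientation. The linker strings are the exemplary linkers listed above; the enzyme and CPR sequences, the function names, and the glycine/alanine fraction used as a rough flexibility indicator are hypothetical and serve only as illustration.

```python
# Exemplary linkers from the text; keys are the SEQ ID NOs given there.
LINKERS = {
    "SEQ ID NO: 8": "AAAGGM",
    "SEQ ID NO: 9": "AAAGGMPPAAAGGM",
    "SEQ ID NO: 11": "PPAAGGM",
    "SEQ ID NO: 13": "GGKGGK",
}

VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # 20 standard amino acids


def build_fusion(a: str, b: str, linker: str = "") -> str:
    """Return the N-to-C sequence A-X-B, where X is an optional linker."""
    if not a or not b:
        raise ValueError("both fusion partners must be non-empty")
    for name, seq in (("A", a), ("X", linker), ("B", b)):
        bad = set(seq) - VALID_RESIDUES
        if bad:
            raise ValueError(f"{name} contains non-standard residues: {sorted(bad)}")
    return a + linker + b


def small_residue_fraction(linker: str) -> float:
    """Fraction of Gly/Ala residues: a crude, assumed proxy for flexibility."""
    if not linker:
        return 0.0
    return sum(linker.count(r) for r in "GA") / len(linker)


if __name__ == "__main__":
    enzyme = "MKSILKAMAL"  # hypothetical placeholder, not a real enzyme sequence
    cpr = "MQSSSVKVSP"     # hypothetical placeholder, not a real CPR sequence
    x = LINKERS["SEQ ID NO: 8"]
    print(build_fusion(enzyme, cpr, x))  # A = enzyme, B = CPR
    print(build_fusion(cpr, enzyme, x))  # A = CPR, B = enzyme
```

Passing an empty linker reproduces the case in which X is omitted and the two polypeptides are fused directly.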

Constructs

The invention also provides recombinant vectors ("constructs") comprising a nucleic acid of the invention. In some embodiments, the recombinant vectors of the invention provide for amplification of a nucleic acid of the invention. In some embodiments, the recombinant vectors of the invention provide for production of the encoded isoprenoid-modifying enzyme or the encoded CPR in a eukaryotic cell, a prokaryotic cell, or a cell-free transcription/translation system. Suitable expression vectors include, but are not limited to: baculovirus vectors, bacteriophage vectors, plasmids, phagemids, cosmids, fosmids, bacterial artificial chromosomes, viral vectors (e.g., viral vectors based on vaccinia virus, poliovirus, adenovirus, adeno-associated virus, SV40, herpes simplex virus, etc.), P1-based artificial chromosomes, yeast plasmids, yeast artificial chromosomes, and any other vector specific to a particular host of interest (e.g., Escherichia coli, yeast, and plant cells).

In some embodiments, a recombinant vector of the invention comprises an isoprenoid-modifying enzyme-encoding nucleic acid of the invention and a CPR-encoding nucleic acid of the invention. In some embodiments, the recombinant vectors of the invention are expression vectors capable of producing the encoded isoprenoid-modifying enzyme and the encoded CPR in a eukaryotic, prokaryotic, or cell-free transcription/translation system.

Certain vector types are capable of amplifying the expression cassettes of the invention. Other vector types are required for efficient introduction and expression of the nucleic acids of the invention into cells. Any vector capable of accepting a nucleic acid of the invention is contemplated for use as a recombinant vector of the invention. The vector may be a circular or linear DNA which is integrated into the host genome or remains episomal. Vectors may require additional manipulation or specific conditions for efficient incorporation into a host cell (e.g., many expression plasmids), or may be part of a self-integrating cell-specific system (e.g., a recombinant virus). In some embodiments, the vector is functional in a prokaryotic cell, and such a vector is used to propagate a recombinant vector and/or to express a nucleic acid of the invention. In some embodiments, the vector is functional in a eukaryotic cell, and in many cases such a vector is an expression vector.

Many suitable expression vectors are known to those skilled in the art, and many are commercially available. The following vectors are provided by way of example. For bacterial host cells: pBluescript (Stratagene, San Diego, Calif.), pQE vectors (Qiagen), pBluescript plasmids, pNH vectors, lambda-ZAP vectors (Stratagene); pTrc (Amann et al., Gene, 69: 301-315 (1988)); pTrc99a, pKK223-3, pDR540, and pRIT2T (Pharmacia). For eukaryotic host cells: pXT1, pSG5 (Stratagene), pSVK3, pBPV, pMSG, and pSVLSV40 (Pharmacia). However, any other plasmid or other vector may be employed so long as it is compatible with the host cell.

In many embodiments, the recombinant vectors of the invention contain one or more selectable marker genes to provide a phenotypic trait for selection of transformed host cells. Suitable selectable markers include, but are not limited to: dihydrofolate reductase or neomycin resistance for eukaryotic cell culture; and tetracycline or ampicillin resistance for prokaryotic host cells such as E. coli.

In many embodiments, the nucleic acids of the invention comprise a nucleotide sequence encoding an isoprenoid-modifying enzyme, wherein the nucleotide sequence encoding the isoprenoid-modifying enzyme is operably linked to one or more transcriptional and/or translational control elements. In many embodiments, a nucleic acid of the invention comprises a nucleotide sequence encoding a CPR, wherein the CPR-encoding nucleotide sequence is operably linked to one or more transcriptional and/or translational control elements.

In some embodiments, as described above, a recombinant vector of the invention comprises an isoprenoid-modifying enzyme-encoding nucleic acid of the invention and a CPR-encoding nucleic acid of the invention. In some embodiments, the nucleotide sequence encoding an isoprenoid-modifying enzyme and the nucleotide sequence encoding a CPR are operably linked to different transcriptional control elements. In other embodiments, the nucleotide sequence encoding an isoprenoid-modifying enzyme and the nucleotide sequence encoding a CPR are operably linked to the same transcriptional control element. In some embodiments, the nucleotide sequence encoding an isoprenoid-modifying enzyme and the nucleotide sequence encoding a CPR are both operably linked to the same inducible promoter. In some embodiments, the nucleotide sequence encoding an isoprenoid-modifying enzyme and the nucleotide sequence encoding a CPR are both operably linked to the same constitutive promoter.

Promoters suitable for use in prokaryotic host cells include, but are not limited to: the bacteriophage T7 RNA polymerase promoter; a trp promoter; a lac operon promoter; hybrid promoters, such as a lac/tac hybrid promoter, a tac/trc hybrid promoter, a trp/lac promoter, or a T7/lac promoter; a trc promoter; a tac promoter; the araBAD promoter; in vivo regulated promoters, such as the ssaG promoter or related promoters (see, e.g., U.S. Patent Publication No. 20040131637), the pagC promoter (Pulkkinen and Miller, J. Bacteriol., 1991: 173 (1): 86-93; Alpuche-Aranda et al., PNAS, 1992; 89 (21): 10079-83), and the nirB promoter (Harborne et al. (1992) Mol. Micro. 6: 2805-); sigma 70 promoters, such as the consensus sigma 70 promoter (see, e.g., GenBank accession Nos. AX798980, AX798961, and AX798183); stationary phase promoters, such as the dps promoter, the spv promoter, etc.; promoters derived from the pathogenicity island SPI-2 (see, e.g., WO 96/17951); the actA promoter (see, e.g., Shetron-Rama et al. (2002) Infect. Immun. 70: 1087-1096); the rpsM promoter (see, e.g., Valdivia and Falkow (1996) Mol. Microbiol. 22: 367-); the tet promoter (see, e.g., Hillen, W. and Wissmann, A. (1989), in Saenger, W. and Heinemann, U. (eds.), Topics in Molecular and Structural Biology, Protein-Nucleic Acid Interaction, Macmillan, London, England, Vol. 10, p. 143-); the SP6 promoter (see, e.g., Melton et al. (1984) Nucl. Acids Res. 12: 7035-7056); and so on.

Non-limiting examples of suitable eukaryotic promoters include CMV immediate early, HSV thymidine kinase, early and late SV40, LTR from retrovirus, and mouse metallothionein-I. In some embodiments, for example, when expressed in yeast cells, suitable promoters are constitutive promoters such as the ADH1 promoter, the PGK1 promoter, the ENO promoter, the PYK1 promoter, and the like; or regulated promoters such as GAL1 promoter, GAL10 promoter, ADH2 promoter, PHO5 promoter, CUP1 promoter, GAL7 promoter, MET25 promoter, MET3 promoter, etc. One of ordinary skill in the art will be able to select appropriate vectors and promoters. The expression vector may also contain a ribosome binding site for translation initiation and a transcription terminator. The expression vector may also contain appropriate sequences for amplifying expression.

In many embodiments, the nucleotide sequence encoding the isoprenoid-modifying enzyme is operably linked to an inducible promoter. In many embodiments, the nucleotide sequence encoding the CPR is operably linked to an inducible promoter. Inducible promoters are well known in the art. Suitable inducible promoters include, but are not limited to: the pL of bacteriophage lambda; Plac; Ptrp; Ptac (trp-lac hybrid promoter); isopropyl-beta-D-thiogalactopyranoside (IPTG)-inducible promoters, such as the lacZ promoter; a tetracycline-inducible promoter; arabinose-inducible promoters, e.g., P_BAD (see, e.g., Guzman et al. (1995) J. Bacteriol. 177: 4121-4130); xylose-inducible promoters, such as Pxyl (see, e.g., Kim et al. (1996) Gene 181: 71-76); the GAL1 promoter; a tryptophan promoter; a lac promoter; alcohol-inducible promoters, such as methanol-inducible and ethanol-inducible promoters; raffinose-inducible promoters; heat-inducible promoters, e.g., the heat-inducible lambda P_L promoter or a promoter controlled by a heat-sensitive repressor (e.g., CI857-repressed lambda-based expression vectors; see, e.g., Hoffmann et al. (1999) FEMS Microbiol. Lett. 177 (2): 327-34); and so on.

In yeast, a number of vectors containing constitutive or inducible promoters may be used. For a review see Current Protocols in Molecular Biology, Vol. 2, 1988, Ausubel et al., eds., Greene Publish. Assoc. & Wiley Interscience, Ch. 13; Grant et al., 1987, Expression and Secretion Vectors for Yeast, in Methods in Enzymology, Wu and Grossman, eds., 1987, Acad. Press, New York, Vol. 153, p. 516-; Glover, 1986, DNA Cloning, Vol. II, IRL Press, Washington, D.C., Ch. 3; Bitter, 1987, Heterologous Gene Expression in Yeast, in Methods in Enzymology, Berger and Kimmel, eds., Acad. Press, New York, Vol. 152, pp. 673-684; and The Molecular Biology of the Yeast Saccharomyces, 1982, Strathern et al., eds., Cold Spring Harbor Press, Vols. I and II. A constitutive yeast promoter such as ADH or LEU2, or an inducible promoter such as GAL, may be used (Cloning in Yeast, Ch. 3, R. Rothstein, in DNA Cloning, Vol. 11, A Practical Approach, D. M. Glover, ed., 1986, IRL Press, Washington, D.C.). Alternatively, vectors that promote integration of foreign DNA sequences into the yeast chromosome may be used.

In some embodiments, a nucleic acid of the invention or a vector of the invention comprises a promoter or other regulatory element for expression in a plant cell. Non-limiting examples of suitable constitutive promoters that function in plant cells are the cauliflower mosaic virus 35S promoter, the tandem 35S promoter (Kay et al., Science 236: 1299 (1987)), the cauliflower mosaic virus 19S promoter, the nopaline synthase gene promoter (Singer et al., Plant Mol. Biol. 14: 433 (1990); An, Plant Physiol. 81: 86 (1986)), the octopine synthase gene promoter, and the ubiquitin promoter. Suitable inducible promoters that function in plant cells include, but are not limited to: the phenylalanine ammonia-lyase gene promoter; the chalcone synthase gene promoter; the pathogenesis-related protein gene promoter; copper-inducible regulatory elements (Mett et al., Proc. Natl. Acad. Sci. USA 90: 4567-4571 (1993); Furst et al., Cell 55: 705-717 (1988)); tetracycline- and chlortetracycline-inducible regulatory elements (Gatz et al., Plant J. 2: 397-; Röder et al., Mol. Gen. Genet. 243: 32-38 (1994); Gatz, Meth. Cell Biol. 50: 411-424 (1995)); ecdysone-inducible regulatory elements (Christopherson et al., Proc. Natl. Acad. Sci. USA 89: 6314-6318 (1992); Kreutzweiser et al., Ecotoxicol. Environ. Safety 28: 14-24 (1994)); heat shock-inducible regulatory elements (Takahashi et al., Plant Physiol. 99: 383-390 (1992); Yabe et al., Plant Cell Physiol. 35: 1207-1219 (1994); Ueda et al., Mol. Gen. Genet. 250: 533-539 (1996)); lac operon elements, which are used in combination with a constitutively expressed lac repressor to confer, for example, IPTG-inducible expression (Wilde et al., EMBO J. 11: 1251-1259 (1992)); the nitrate-inducible promoter of the spinach nitrite reductase gene (Back et al., Plant Mol. Biol. 17: 9 (1991)); light-inducible promoters, such as those associated with the small subunit of RuBP carboxylase or the LHCP gene families (Feinbaum et al., Mol. Gen. Genet. 226: 449 (1991); Lam and Chua, Science 248: 471 (1990)); the light-responsive regulatory elements described in U.S. Patent Publication No. 20040038400; salicylic acid-inducible regulatory elements (Uknes et al., Plant Cell 5: 159-169 (1993); Bi et al., Plant J. 8: 235-245 (1995)); plant hormone-inducible regulatory elements (Yamaguchi-Shinozaki et al., Plant Mol. Biol. 15 (1990)); and glucocorticoid-inducible regulatory elements (Proc. Natl. Acad. Sci. USA 88: 10421 (1991)).

Plant tissue-selective regulatory elements may also be included in the nucleic acids of the invention or the vectors of the invention. Tissue-selective regulatory elements suitable for ectopic expression of a nucleic acid in one tissue or a limited number of tissues include, but are not limited to: xylem-selective regulatory elements, tracheid-selective regulatory elements, fiber-selective regulatory elements, trichome-selective regulatory elements (see, e.g., Wang et al (2002) J.Exp.Botany 53: 1891-1897), glandular hair-selective regulatory elements, and the like.

Vectors suitable for use in plant cells are known in the art, and any such vector may be used to introduce the nucleic acids of the invention into a plant host cell. Suitable vectors include, for example, the Ti plasmid of Agrobacterium tumefaciens or the Ri₁ plasmid of A. rhizogenes. Upon infection by Agrobacterium, the Ti or Ri₁ plasmid is transferred to the plant cell and stably integrated into the plant genome (Schell, Science, 237: 1176-83 (1987)). Also suitable for use is a plant artificial chromosome, as described in U.S. Patent No. 6,900,012.

Compositions

The invention also provides compositions comprising a nucleic acid of the invention. The invention also provides compositions comprising the recombinant vectors of the invention. In many embodiments, a composition comprising a nucleic acid of the invention or an expression vector of the invention comprises one or more of: salts, e.g., NaCl, MgCl₂, KCl, MgSO₄, etc.; buffers, e.g., Tris buffer, N-(2-hydroxyethyl)piperazine-N'-(2-ethanesulfonic acid) (HEPES), 2-(N-morpholino)ethanesulfonic acid (MES), 2-(N-morpholino)ethanesulfonic acid sodium salt (MES), 3-(N-morpholino)propanesulfonic acid (MOPS), N-tris[hydroxymethyl]methyl-3-aminopropanesulfonic acid (TAPS), and the like; a solubilizing agent; detergents, e.g., non-ionic detergents such as Tween-20 and the like; a nuclease inhibitor; and so on. In some embodiments, the nucleic acid of the invention or the recombinant vector of the invention is lyophilized.

Host cell

The present invention provides genetically modified host cells, such as host cells genetically modified with a nucleic acid of the invention or a recombinant vector of the invention. In many embodiments, the genetically modified host cell of the invention is an in vitro host cell. In other embodiments, the genetically modified host cell of the invention is an in vivo host cell. In other embodiments, the genetically modified host cell of the invention is part of a multicellular organism.

In many embodiments, the host cell is a unicellular organism, or is grown in culture as single cells. In some embodiments, the host cell is a eukaryotic cell. Suitable eukaryotic host cells include, but are not limited to: yeast cells, insect cells, plant cells, fungal cells, and algal cells. Specific examples include, but are not limited to: Pichia pastoris, Pichia finlandica, Pichia trehalophila, Pichia koclamae, Pichia membranaefaciens, Pichia opuntiae, Pichia thermotolerans, Pichia salictaria, Pichia guercuum, Pichia pijperi, Pichia stipitis, Pichia methanolica, Pichia sp., Saccharomyces cerevisiae, Saccharomyces sp., Hansenula polymorpha, Kluyveromyces sp., Kluyveromyces lactis, Candida albicans, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Trichoderma reesei, Chrysosporium lucknowense, Fusarium sp., Fusarium gramineum, Fusarium venenatum, Neurospora crassa, Chlamydomonas reinhardtii, and the like. In some embodiments, the host cell is a eukaryotic cell other than a plant cell.

In other embodiments, the host cell is a plant cell. Plant cells include cells of monocots and dicots.

In other embodiments, the host cell is a prokaryotic cell. Suitable prokaryotic cells include, but are not limited to: any of a variety of laboratory strains of Escherichia coli, Lactobacillus sp., Salmonella sp., Shigella sp., and the like. See, e.g., Carrier et al. (1992) J. Immunol. 148: 1176-1181; U.S. Patent No. 6,447,784; and Sizemore et al. (1995) Science 270: 299-302. Examples of Salmonella strains useful in the present invention include, but are not limited to: Salmonella typhi and S. typhimurium. Suitable Shigella strains include, but are not limited to: Shigella flexneri, Shigella sonnei, and Shigella dysenteriae. Typically, the laboratory strain is a non-pathogenic strain. Non-limiting examples of other suitable bacteria include: Bacillus subtilis, Pseudomonas putida, Pseudomonas aeruginosa, Pseudomonas mevalonii, Rhodobacter sphaeroides, Rhodobacter capsulatus, Rhodospirillum rubrum, Rhodococcus sp., and the like. In some embodiments, the host cell is Escherichia coli.

To produce a genetically modified host cell of the invention, a nucleic acid of the invention comprising a nucleotide sequence encoding an isoprenoid-modifying enzyme is introduced stably or transiently into a parent host cell using established techniques, including, but not limited to: electroporation, calcium phosphate precipitation, DEAE-dextran mediated transfection, liposome-mediated transfection, and the like. For stable transformation, the nucleic acid generally further includes a selectable marker, e.g., any of several well-known selectable markers such as neomycin resistance, ampicillin resistance, tetracycline resistance, chloramphenicol resistance, kanamycin resistance, and the like.

In some embodiments, the genetically modified host cell of the invention is a plant cell. The genetically modified plant cells of the invention are useful for producing selected isoprenoid compounds in in vitro plant cell culture. Guidance regarding plant tissue culture can be found in, for example: Plant Cell and Tissue Culture, 1994, Vasil and Thorpe, eds., Kluwer Academic Publishers; and Plant Cell Culture Protocols (Methods in Molecular Biology 111), 1999, Hall, ed., Humana Press.

Genetically modified host cells

In some embodiments, the genetically modified host cell of the invention comprises an expression vector of the invention comprising a nucleotide sequence encoding an isoprenoid-modifying enzyme. In some embodiments, a subject genetically modified host cell comprises a subject expression vector comprising a nucleotide sequence encoding a polypeptide having terpene hydroxylase and/or terpene oxidase activity.

In some embodiments, a subject genetically modified host cell comprises a first subject expression vector comprising a subject nucleic acid comprising a nucleotide sequence encoding a polypeptide having terpene hydroxylase and/or terpene oxidase activity, and a second subject expression vector comprising a subject nucleic acid comprising a nucleotide sequence encoding a CPR. In other embodiments, the genetically modified host cell of the invention comprises an expression vector of the invention, wherein the expression vector comprises a nucleic acid of the invention comprising a nucleotide sequence encoding an isoprenoid-modifying enzyme and a nucleic acid of the invention comprising a nucleotide sequence encoding a CPR. In other embodiments, the genetically modified host cell of the invention comprises an expression vector of the invention comprising a nucleic acid of the invention comprising a nucleotide sequence encoding a fusion polypeptide (e.g., a polypeptide comprising an isoprenoid-modifying enzyme and a CPR).

Suitable CPR-encoding nucleic acids include nucleic acids encoding CPR found in plants. Suitable CPR-encoding nucleic acids include nucleic acids encoding CPR found in fungi. Examples of suitable CPR-encoding nucleic acids include: GenBank accession No. AJ303373 (Triticum aestivum CPR); GenBank accession No. AY959320 (Taxus chinensis CPR); GenBank accession No. AY532374 (Ammi majus CPR); GenBank accession No. AG211221 (Oryza sativa CPR); and GenBank accession No. AF024635 (Petroselinum crispum CPR).

In some embodiments, the genetically modified host cell of the invention is a host cell that does not normally synthesize isopentenyl pyrophosphate (IPP) or mevalonate via a mevalonate pathway. The mevalonate pathway comprises: (a) condensing two molecules of acetyl-CoA to acetoacetyl-CoA; (b) condensing acetoacetyl-CoA with acetyl-CoA to form HMG-CoA; (c) converting HMG-CoA to mevalonate; (d) phosphorylating mevalonate to mevalonate 5-phosphate; (e) converting mevalonate 5-phosphate to mevalonate 5-pyrophosphate; and (f) converting mevalonate 5-pyrophosphate to isopentenyl pyrophosphate. The mevalonate pathway enzymes required for production of IPP may vary depending on the culture conditions.

As noted above, in some embodiments, the genetically modified host cell of the invention is a host cell that does not normally synthesize isopentenyl pyrophosphate (IPP) or mevalonate via a mevalonate pathway. In some of these embodiments, the host cell is genetically modified with an expression vector of the invention comprising a nucleic acid of the invention encoding an isoprenoid-modifying enzyme, and is also genetically modified with one or more heterologous nucleic acids comprising nucleotide sequences encoding acetoacetyl-CoA thiolase, hydroxymethylglutaryl-CoA synthase (HMGS), hydroxymethylglutaryl-CoA reductase (HMGR), mevalonate kinase (MK), phosphomevalonate kinase (PMK), and mevalonate pyrophosphate decarboxylase (MPD) (and optionally also IPP isomerase). In many such embodiments, the host cell is genetically modified with an expression vector comprising a nucleotide sequence encoding a CPR. In some embodiments, the host cell is genetically modified with an expression vector of the invention comprising a nucleic acid of the invention encoding an isoprenoid-modifying enzyme, and is also genetically modified with one or more heterologous nucleic acids comprising nucleotide sequences encoding MK, PMK, and MPD (and optionally also IPP isomerase). In many such embodiments, the host cell is genetically modified with an expression vector comprising a nucleotide sequence encoding a CPR.
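The six pathway steps and the two enzyme sets described above can be summarized in a short Python sketch. The pairing of each step with its enzyme follows the order in which the steps and enzymes are listed in the text; the function name and the idea of choosing the enzyme set by the first metabolite available to the cell (for example, acetyl-CoA versus supplied mevalonate) are illustrative assumptions, not a statement of the claimed method.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    substrate: str
    product: str
    enzyme: str


# Steps (a)-(f) of the mevalonate pathway, paired with the enzyme for each step.
MEV_PATHWAY: List[Step] = [
    Step("acetyl-CoA", "acetoacetyl-CoA", "acetoacetyl-CoA thiolase"),
    Step("acetoacetyl-CoA", "HMG-CoA", "HMGS"),
    Step("HMG-CoA", "mevalonate", "HMGR"),
    Step("mevalonate", "mevalonate 5-phosphate", "MK"),
    Step("mevalonate 5-phosphate", "mevalonate 5-pyrophosphate", "PMK"),
    Step("mevalonate 5-pyrophosphate", "IPP", "MPD"),
]


def enzymes_needed(start: str, end: str = "IPP") -> List[str]:
    """List the enzymes for the pathway segment running from `start` to `end`."""
    substrates = [s.substrate for s in MEV_PATHWAY]
    products = [s.product for s in MEV_PATHWAY]
    if start not in substrates or end not in products:
        raise ValueError("unknown metabolite")
    first, last = substrates.index(start), products.index(end)
    if first > last:
        raise ValueError("start lies downstream of end")
    return [s.enzyme for s in MEV_PATHWAY[first:last + 1]]


if __name__ == "__main__":
    print(enzymes_needed("acetyl-CoA"))  # full pathway: six enzymes
    print(enzymes_needed("mevalonate"))  # lower segment: MK, PMK, MPD
```

IPP isomerase is left out of the table because the text treats it as optional and it acts after IPP has been formed.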

In some embodiments, the genetically modified host cell of the invention is a host cell that does not normally synthesize IPP or mevalonate via a mevalonate pathway; the host cell is genetically modified with an expression vector of the invention comprising a nucleic acid of the invention encoding an isoprenoid-modifying enzyme, and is also genetically modified with one or more heterologous nucleic acids comprising nucleotide sequences encoding acetoacetyl-CoA thiolase, HMGS, HMGR, MK, PMK, MPD, IPP isomerase, and a prenyltransferase. In many such embodiments, the host cell is genetically modified with an expression vector comprising a nucleotide sequence encoding a CPR. In some embodiments, the genetically modified host cell of the invention is a host cell that does not normally synthesize IPP or mevalonate via a mevalonate pathway; the host cell is genetically modified with an expression vector of the invention comprising a nucleic acid of the invention encoding an isoprenoid-modifying enzyme, and is also genetically modified with one or more heterologous nucleic acids comprising nucleotide sequences encoding MK, PMK, MPD, IPP isomerase, and a prenyltransferase. In many such embodiments, the host cell is genetically modified with an expression vector comprising a nucleotide sequence encoding a CPR.

In some embodiments, the genetically modified host cell of the invention is a host cell that normally synthesizes IPP or mevalonate via the mevalonate pathway, e.g., the host cell is a host cell comprising an endogenous mevalonate pathway. In some embodiments, the host cell is a yeast cell. In some embodiments, the host cell is saccharomyces cerevisiae.

In some embodiments, the genetically modified host cell of the invention is further genetically modified with one or more nucleic acids comprising a nucleotide sequence encoding a dehydrogenase that further modifies an isoprenoid compound. The encoded dehydrogenase may be one that is naturally found in a prokaryotic or eukaryotic cell, or may be a variant of such a dehydrogenase. In some embodiments, the invention provides isolated nucleic acids comprising nucleotide sequences encoding such dehydrogenases.

Mevalonate pathway nucleic acids

Nucleotide sequences encoding MEV pathway gene products are known in the art, and any known nucleotide sequence encoding a MEV pathway gene product can be used to generate the genetically modified host cells of the invention. For example, nucleotide sequences encoding acetoacetyl-CoA thiolase, HMGS, HMGR, MK, PMK, MPD, and IDI are known in the art. The following are non-limiting examples of known nucleotide sequences encoding MEV pathway gene products, with the GenBank accession number and source organism for each MEV pathway enzyme given in parentheses: acetoacetyl-CoA thiolase: (NC_000913 region: 2324131..2325315; E. coli), (D49362; Paracoccus denitrificans), and (L20428; Saccharomyces cerevisiae); HMGS: (NC_001145, complement 19061..20536; Saccharomyces cerevisiae), (X96617; Saccharomyces cerevisiae), (X83882; Arabidopsis thaliana), (AB037907; Kitasatospora griseola), and (BT007302; Homo sapiens); HMGR: (NM_206548; Drosophila melanogaster), (NM_204485; Gallus gallus), (AB015627; Streptomyces sp. KO-3988), (AF542543; Nicotiana attenuata), (AB037907; Kitasatospora griseola), (AX128213, providing the sequence encoding a truncated HMGR; Saccharomyces cerevisiae), and (NC_001145, complement 115734..118898; Saccharomyces cerevisiae); MK: (L77688; Arabidopsis thaliana) and (X55875; Saccharomyces cerevisiae); PMK: (AF429385; Hevea brasiliensis), (NM_006556; Homo sapiens), and (NC_001145, complement 712315..713670; Saccharomyces cerevisiae); MPD: (X97557; Saccharomyces cerevisiae), (AF290095; Enterococcus faecium), and (U49260; Homo sapiens); and IDI: (NC_000913, 3031087..3031635; E. coli) and (AF082326; Haematococcus pluvialis).
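As a practical illustration, a sequence identified by accession number and coordinate range can be requested from NCBI through the E-utilities efetch endpoint. The following Python sketch only builds the request URL and does not contact the network. GenBank accessions are written with an underscore (e.g., NC_001145); the function name is hypothetical, and the mapping of "complement" entries to the minus-strand parameter is an assumption about how such entries would be retrieved.

```python
from typing import Optional
from urllib.parse import urlencode

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession: str,
               start: Optional[int] = None,
               stop: Optional[int] = None,
               complement: bool = False) -> str:
    """Build an efetch URL returning FASTA text for an accession or sub-range."""
    params = {"db": "nuccore", "id": accession,
              "rettype": "fasta", "retmode": "text"}
    if start is not None and stop is not None:
        params["seq_start"] = start
        params["seq_stop"] = stop
        # strand=2 requests the minus strand, strand=1 the plus strand
        params["strand"] = 2 if complement else 1
    return f"{EFETCH}?{urlencode(params)}"


if __name__ == "__main__":
    # HMGS example from the list above: NC_001145, complement 19061..20536
    print(efetch_url("NC_001145", 19061, 20536, complement=True))
    # MK example from the list above: whole record X55875
    print(efetch_url("X55875"))
```

The resulting URL can be opened in a browser or passed to any HTTP client; retrieval, rate limits, and record versions are outside the scope of this sketch.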

In some embodiments, the HMGR coding region encodes a truncated form of HMGR that lacks the transmembrane domain of a wild-type HMGR ("tHMGR"). The transmembrane domain of HMGR contains the regulatory portion of the enzyme, but has no catalytic activity.

The coding sequence of any known MEV pathway enzyme may be altered in various ways known in the art to generate targeted changes in the amino acid sequence of the encoded enzyme. The amino acid sequence of a variant MEV pathway enzyme is typically substantially similar to the amino acid sequence of a known MEV pathway enzyme, i.e., it differs by at least one amino acid, and may differ by at least two, at least 5, at least 10, or at least 20 amino acids, but generally by no more than about 50 amino acids. The sequence changes may be substitutions, insertions, or deletions. For example, as described below, the nucleotide sequence may be altered according to the codon bias of a particular host cell. In addition, one or more nucleotide sequence differences can be introduced that result in conservative amino acid changes in the encoded protein.
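The numerical limits on variant enzymes (at least one difference, generally no more than about 50) can be checked with a simple position-by-position comparison. The Python sketch below counts substitutions only and assumes the two sequences have already been aligned to equal length; handling insertions and deletions would require a proper alignment step that is not shown. The function names and example sequences are hypothetical.

```python
def count_substitutions(reference: str, variant: str) -> int:
    """Count positions at which two pre-aligned, equal-length sequences differ."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(1 for r, v in zip(reference, variant) if r != v)


def within_variant_limits(reference: str, variant: str, max_diff: int = 50) -> bool:
    """True if the variant differs at 1 to `max_diff` positions from the reference."""
    n = count_substitutions(reference, variant)
    return 1 <= n <= max_diff


if __name__ == "__main__":
    ref = "MKVLAAGIVG"  # hypothetical reference fragment
    var = "MKILAAGLVG"  # hypothetical variant with two conservative changes
    print(count_substitutions(ref, var), within_variant_limits(ref, var))
```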

Prenyltransferases

In some embodiments, a genetically modified host cell of the invention is genetically modified to comprise a nucleic acid comprising a nucleotide sequence encoding an isoprenoid-modifying enzyme and, in some embodiments, is further genetically modified to comprise one or more nucleic acids comprising nucleotide sequences encoding one or more mevalonate pathway enzymes (as described above) and a nucleic acid comprising a nucleotide sequence encoding a prenyltransferase.

Prenyltransferases constitute a broad class of enzymes that catalyze the sequential condensation of IPP, leading to the formation of prenyl diphosphates of various chain lengths. Suitable prenyltransferases include enzymes that catalyze the condensation of IPP with an allylic primer substrate to form an isoprenoid compound having from about 2 prenyl units to about 6000 prenyl units or more, e.g., 2 prenyl units (geranyl pyrophosphate synthase), 3 prenyl units (farnesyl pyrophosphate synthase), 4 prenyl units (geranylgeranyl pyrophosphate synthase), 5 prenyl units, 6 prenyl units (hexadecyl pyrophosphate synthase), 7 prenyl units, 8 prenyl units (phytoene synthase, octaprenyl pyrophosphate synthase), 9 prenyl units (nonaprenyl pyrophosphate synthase), 10 prenyl units (decaprenyl pyrophosphate synthase), from about 10 to about 15 prenyl units, about 15 to 20, about 20 to 25, about 25 to 30, about 30 to 40, about 40 to 50, about 50 to 100, about 100 to 250, about 250 to 500, about 500 to 1000, about 1000 to 2000, about 2000 to 3000, about 3000 to 4000, about 4000 to 5000, or about 5000 to 6000 prenyl units or more.
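Because each prenyl unit contributes five carbon atoms, chain length and carbon count are related by simple arithmetic. The following Python sketch tabulates the short-chain products named above and computes the carbon count and the number of IPP condensations, assuming a C5 allylic primer (dimethylallyl pyrophosphate) as the starting substrate. The function names are hypothetical.

```python
# Short-chain products named in the text, keyed by number of prenyl (C5) units.
PRODUCTS = {
    2: "geranyl pyrophosphate (GPP)",
    3: "farnesyl pyrophosphate (FPP)",
    4: "geranylgeranyl pyrophosphate (GGPP)",
}


def carbon_count(units: int) -> int:
    """Carbon atoms in a chain built from `units` C5 prenyl units."""
    if units < 1:
        raise ValueError("units must be a positive integer")
    return 5 * units


def ipp_condensations(units: int) -> int:
    """IPP additions needed to reach `units`, assuming a C5 allylic primer."""
    if units < 1:
        raise ValueError("units must be a positive integer")
    return units - 1


def describe(units: int) -> str:
    name = PRODUCTS.get(units, f"{units}-unit prenyl diphosphate")
    return f"{name}: C{carbon_count(units)}, {ipp_condensations(units)} IPP condensation(s)"


if __name__ == "__main__":
    for n in (2, 3, 4, 10):
        print(describe(n))
```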

Suitable prenyltransferases include, but are not limited to: E-prenyl diphosphate synthases, including but not limited to geranyl diphosphate (GPP) synthase, farnesyl diphosphate (FPP) synthase, geranylgeranyl diphosphate (GGPP) synthase, hexaprenyl diphosphate (HexPP) synthase, heptaprenyl diphosphate (HepPP) synthase, octaprenyl diphosphate (OPP) synthase, solanesyl diphosphate (SPP) synthase, decaprenyl diphosphate (DPP) synthase, chicle synthase, and gutta-percha synthase; and Z-prenyl diphosphate synthases, including but not limited to nonaprenyl diphosphate (NPP) synthase, undecaprenyl diphosphate (UPP) synthase, dehydrodolichyl diphosphate synthase, eicosaprenyl diphosphate synthase, natural rubber synthase, and other Z-prenyl diphosphate synthases.

The nucleotide sequences of numerous prenyltransferases from a variety of species are known, and can be used or modified for use in generating a genetically modified host cell of the invention. See, e.g., human farnesyl pyrophosphate synthetase mRNA (GenBank accession No. J05262; Homo sapiens); farnesyl diphosphate synthetase (FPP) gene (GenBank accession No. J05091; Saccharomyces cerevisiae); isopentenyl diphosphate:dimethylallyl diphosphate isomerase gene (J05090; Saccharomyces cerevisiae); Wang and Ohnuma (2000) Biochim. Biophys. Acta 1529: 33-48; U.S. Patent No. 6,645,747; Arabidopsis thaliana farnesyl pyrophosphate synthetase 2 (FPS2)/FPP synthetase 2/farnesyl diphosphate synthase 2 (At4g17190) mRNA (GenBank accession No. NM_202836); Ginkgo biloba geranylgeranyl diphosphate synthase (ggpps) mRNA (GenBank accession No. AY371321); Arabidopsis thaliana geranylgeranyl pyrophosphate synthase (GGPS1)/GGPP synthase/farnesyltranstransferase (At4g36810) mRNA (GenBank accession No. NM_119845); the Synechococcus elongatus gene for farnesyl, geranylgeranyl, geranylfarnesyl, hexaprenyl, and heptaprenyl diphosphate synthase (SelF-HepPS) (GenBank accession No. AB016095); and the like.

Terpene synthases

In some embodiments, a subject genetically modified host cell is genetically modified to comprise a nucleic acid comprising a nucleotide sequence encoding a terpene synthase. In some embodiments, the terpene synthase is a terpene synthase that modifies FPP to produce sesquiterpenes. In other embodiments, the terpene synthase is a terpene synthase that modifies GPP to produce a monoterpene. In other embodiments, the terpene synthase is a terpene synthase that modifies GGPP to produce a diterpene.

Nucleotide sequences encoding terpene synthases are known in the art, and a host cell can be genetically modified with any known terpene synthase-encoding nucleotide sequence. For example, the following terpene synthase-encoding nucleotide sequences are known and can be used (each is followed by its GenBank accession number and the organism in which it was identified): (-)-germacrene D synthase mRNA (AY438099; Populus balsamifera subsp. trichocarpa × Populus deltoides); E,E-α-farnesene synthase mRNA (AY640154; Cucumis sativus); 1,8-cineole synthase mRNA (AY691947; Arabidopsis thaliana); terpene synthase 5 (TPS5) mRNA (AY518314; Zea mays); terpene synthase 4 (TPS4) mRNA (AY518312; Zea mays); myrcene/ocimene synthase (TPS10) (At2g24210) mRNA (NM_127982; Arabidopsis thaliana); geraniol synthase (GES) mRNA (AY362553; Ocimum basilicum); pinene synthase mRNA (AY237645; Picea sitchensis); myrcene synthase 1e20 mRNA (AY195609; Antirrhinum majus); (E)-β-ocimene synthase (0e23) mRNA (AY195607; Antirrhinum majus); E-β-ocimene synthase mRNA (AY151086; Antirrhinum majus); terpene synthase mRNA (AF497492; Arabidopsis thaliana); (-)-camphene synthase (AG6.5) mRNA (U87910; Abies grandis); (-)-4S-limonene synthase gene (e.g., genomic sequence) (AF326518; Abies grandis); δ-selinene synthase gene (AF326513; Abies grandis); amorpha-4,11-diene synthase mRNA (AJ251751; Artemisia annua); E-α-bisabolene synthase mRNA (AF006195; Abies grandis); γ-humulene synthase mRNA (U92267; Abies grandis); δ-selinene synthase mRNA (U92266; Abies grandis); pinene synthase (AG3.18) mRNA (U87909; Abies grandis); myrcene synthase (AG2.2) mRNA (U87908; Abies grandis); and the like.

Codon usage

In some embodiments, the nucleotide sequence used to produce a genetically modified host cell of the invention is modified so that it reflects the codon preference of the particular host cell. For example, in some embodiments, the nucleotide sequence is modified according to yeast codon preference. See, e.g., Bennetzen and Hall (1982) J. Biol. Chem. 257(6):3026-3031. As another non-limiting example, in other embodiments, the nucleotide sequence is modified according to E. coli codon preference. See, e.g., Gouy and Gautier (1982) Nucleic Acids Res. 10(22):7055-; Eyre-Walker (1996) Mol. Biol. Evol. 13(6):864-872. See also Nakamura et al. (2000) Nucleic Acids Res. 28(1):292.
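
The codon modification described above can be illustrated with a minimal Python sketch. It assumes that recoding means replacing each codon with a host-preferred synonymous codon while leaving the encoded amino acid sequence unchanged. The function names and the small preference table are hypothetical; the table entries are placeholders for illustration only and are not taken from the references cited above.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence (DNA, in frame) into one-letter amino acids."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(GENETIC_CODE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds, preferred):
    """Replace each codon with the preferred codon for its amino acid.

    `preferred` maps a one-letter amino acid to a codon. Codons whose amino
    acid has no entry in the table are left unchanged.
    """
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        out.append(preferred.get(GENETIC_CODE[codon], codon))
    return "".join(out)


# Hypothetical preference table (illustrative placeholders, not measured data).
EXAMPLE_PREFERRED = {"A": "GCG", "K": "AAA", "L": "CTG"}

original = "ATGGCCAAGTTATAA"
recoded = recode(original, EXAMPLE_PREFERRED)
# The protein sequence must be identical before and after recoding.
assert translate(original) == translate(recoded)
```

In practice the preference table would be derived from codon usage data for the chosen host; the sketch only shows that the substitution is synonymous.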

Other genetic modifications

In some embodiments, a genetically modified host cell of the invention is a host cell that is genetically modified to comprise one or more nucleic acids comprising a nucleotide sequence encoding an isoprenoid-modifying enzyme, and that is further genetically modified to enhance production of a terpene biosynthetic pathway intermediate and/or to functionally impair an endogenous terpene biosynthetic pathway gene. The term "functionally impaired," as used herein in reference to an endogenous terpene biosynthetic pathway gene, refers to a genetic modification of that gene that results in production of the encoded gene product at below-normal levels and/or in a non-functional gene product.

Genetic modifications that increase the production of endogenous terpene biosynthetic pathway intermediates include, but are not limited to, a genetic modification that reduces the level and/or activity of phosphotransacetylase in the host cell. The intracellular concentration of a terpene biosynthetic pathway intermediate is increased by increasing the intracellular concentration of acetyl-CoA. E. coli secretes a large fraction of its intracellular acetyl-CoA into the culture medium in the form of acetate. Deleting pta, the gene encoding phosphotransacetylase, which is the first enzyme responsible for converting acetyl-CoA to acetate, reduces acetate secretion. Genetic modifications that reduce the level and/or activity of phosphotransacetylase in a prokaryotic host cell are particularly useful when the host cell is genetically modified with a nucleic acid comprising a nucleotide sequence encoding one or more MEV pathway gene products.

In some embodiments, the genetic modification that reduces the level of phosphotransacetylase in a prokaryotic host cell is a genetic mutation that functionally impairs the endogenous pta gene, which encodes phosphotransacetylase. The pta gene can be functionally impaired in a number of ways, including: insertion of a mobile genetic element (e.g., a transposon, etc.); deletion of all or part of the gene, such that no gene product is formed, or the product is truncated and non-functional in converting acetyl-CoA to acetate; mutation of the gene, such that no gene product is formed, or the product is truncated or otherwise non-functional in converting acetyl-CoA to acetate; deletion or mutation of one or more control elements that control expression of the pta gene, such that no gene product is made; and the like.

In some embodiments, the endogenous pta gene of the genetically modified host cell is deleted. Any method of deleting a gene can be used. One non-limiting example of a method for deleting the pta gene is the use of the λ Red recombination system. See Datsenko and Wanner (2000) Proc. Natl. Acad. Sci. USA 97(12):6640-5. In some embodiments, the pta gene is deleted in a host cell (e.g., E. coli) genetically modified with a nucleic acid comprising nucleotide sequences encoding MK, PMK, MPD, and IDI. In some embodiments, the pta gene is deleted in a host cell (e.g., E. coli) genetically modified with a nucleic acid comprising nucleotide sequences encoding MK, PMK, MPD, and IPP. In some embodiments, the pta gene is deleted in a host cell (e.g., E. coli) genetically modified with a nucleic acid comprising nucleotide sequences encoding MK, PMK, MPD, IPP, and a prenyltransferase.

In some embodiments, a subject genetically modified host cell is one that is genetically modified to comprise one or more nucleic acids comprising a nucleotide sequence encoding a MEV biosynthetic pathway gene product, and that is further genetically modified such that an endogenous DXP biosynthetic pathway gene is functionally disabled. In other embodiments, a genetically modified host cell of the invention is one that is genetically modified to comprise one or more nucleic acids comprising a nucleotide sequence encoding a DXP biosynthetic pathway gene product, and that is further genetically modified such that an endogenous MEV biosynthetic pathway gene is functionally disabled.

In some embodiments, when the genetically modified host cell of the invention is a prokaryotic host cell genetically modified with a nucleic acid comprising a nucleotide sequence encoding one or more MEV pathway gene products, the host cell is further genetically modified such that one or more endogenous DXP pathway genes are functionally impaired. DXP pathway genes that can be functionally impaired include one or more genes encoding any of the following DXP gene products: 1-deoxy-D-xylulose-5-phosphate synthase, 1-deoxy-D-xylulose-5-phosphate reductoisomerase, 4-diphosphocytidyl-2-C-methyl-D-erythritol synthase, 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase, 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase, and 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase.

Endogenous DXP pathway genes can be functionally impaired in a variety of ways, including: insertion of a mobile genetic element (e.g., a transposon, etc.); deletion of all or part of the gene, such that no gene product is formed, or the product is truncated and enzymatically inactive; mutation of the gene, such that no gene product is formed, or the product is truncated or otherwise enzymatically non-functional; deletion or mutation of one or more control elements that control expression of the gene, such that no gene product is made; and the like.

In other embodiments, when the genetically modified host cell of the invention is a prokaryotic host cell genetically modified with a nucleic acid comprising a nucleotide sequence encoding one or more DXP pathway gene products, the host cell is further genetically modified such that one or more endogenous MEV pathway genes are functionally impaired. Endogenous MEV pathway genes that can be functionally impaired include one or more genes encoding any of the following MEV gene products: HMGS, HMGR, MK, PMK, MPD, and IDI. Endogenous MEV pathway genes can be functionally impaired in a variety of ways, including: insertion of a mobile genetic element (e.g., a transposon, etc.); deletion of all or part of the gene, such that no gene product is formed, or the product is truncated and enzymatically inactive; mutation of the gene, such that no gene product is formed, or the product is truncated or otherwise enzymatically non-functional; deletion or mutation of one or more control elements that control expression of the gene, such that no gene product is made; and the like.

Compositions comprising the genetically modified host cells of the invention

The invention also provides compositions comprising the genetically modified host cells of the invention. The compositions of the invention comprise a genetically modified host cell of the invention; and in some embodiments further comprises one or more additional components selected, in part, based on the particular use for which the host cell is genetically modified. Suitable ingredients include, but are not limited to: salt; a buffering agent; a stabilizer; a protease inhibitor; a nuclease inhibitor; cell membrane and/or cell wall preserving compounds such as glycerol, dimethylsulfoxide, etc.; nutrient media suitable for the cells; and so on. In some embodiments, the cells are lyophilized.

Transgenic plants

In some embodiments, a nucleic acid of the invention or an expression vector of the invention (e.g., an isoprenoid-modifying enzyme nucleic acid of the invention or an expression vector of the invention comprising an isoprenoid-modifying enzyme nucleic acid) is used as a transgene to produce a transgenic plant that produces the encoded isoprenoid-modifying enzyme. Thus, the invention also provides a transgenic plant comprising a transgene comprising a nucleic acid of the invention comprising a nucleotide sequence encoding an enzyme having terpene hydroxylase and/or terpene oxidase activity (as described above). In some embodiments, the genome of the transgenic plant comprises a nucleic acid of the invention. In some embodiments, the transgenic plant is homozygous for the genetic modification. In some embodiments, the transgenic plant is heterozygous for the genetic modification.

In some embodiments, a transgenic plant of the invention produces the transgene-encoded polypeptide having terpene hydroxylase and/or oxidase activity in an amount that is at least about 50%, at least about 2-fold, at least about 5-fold, at least about 10-fold, at least about 25-fold, at least about 50-fold, or at least about 100-fold, or more, higher than the amount of the polypeptide produced by a control plant of the same species, e.g., a non-transgenic plant (a plant that does not comprise the transgene encoding the polypeptide).

In some embodiments, a transgenic plant of the invention is a transgenic version of a non-transgenic control plant that normally produces an isoprenoid compound that is generated by, or is a downstream product of, the transgene-encoded polypeptide having terpene hydroxylase and/or oxidase activity. The transgenic plant produces the isoprenoid compound in an amount that is at least about 50%, at least about 2-fold, at least about 5-fold, at least about 10-fold, at least about 25-fold, at least about 50-fold, or at least about 100-fold, or more, higher than the amount of the isoprenoid compound produced by a control plant of the same species, e.g., a non-transgenic plant (a plant that does not comprise the transgene encoding the polypeptide).
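
The yield comparison above can be expressed as a simple ratio. The following Python sketch is illustrative only: it assumes that "at least about 50%" means a yield at least 50% higher than the control (a ratio of 1.5), and the function names and the example measurements are hypothetical.

```python
# Thresholds recited in the text, expressed as transgenic/control ratios.
# Assumption: "at least about 50%" is read as a 1.5-fold ratio.
THRESHOLDS = [1.5, 2, 5, 10, 25, 50, 100]


def fold_increase(transgenic_yield, control_yield):
    """Return the transgenic yield divided by the control yield (same units)."""
    if control_yield <= 0:
        raise ValueError("control yield must be positive to compute a ratio")
    return transgenic_yield / control_yield


def thresholds_met(transgenic_yield, control_yield):
    """List every recited threshold that the measured ratio reaches."""
    ratio = fold_increase(transgenic_yield, control_yield)
    return [t for t in THRESHOLDS if ratio >= t]


# Hypothetical measurements in arbitrary but matching units.
met = thresholds_met(12.0, 1.0)
```

A control plant that produces none of the compound gives an undefined ratio, which is why the sketch rejects a non-positive control value rather than reporting an infinite fold increase.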

Methods for introducing exogenous nucleic acids into plant cells are well known in the art. Such plant cells are considered "transformed," as defined above. Suitable methods include viral infection (e.g., with a double-stranded DNA virus), transfection, conjugation, protoplast fusion, electroporation, particle gun technology, calcium phosphate precipitation, direct microinjection, silicon carbide whisker technology, Agrobacterium-mediated transformation, and the like. The choice of method generally depends on the type of cell being transformed and the circumstances under which the transformation takes place (i.e., in vitro, ex vivo, or in vivo).

Transformation methods based on the soil bacterium Agrobacterium tumefaciens are particularly useful for introducing an exogenous nucleic acid molecule into a vascular plant. The wild-type form of Agrobacterium contains a Ti (tumor-inducing) plasmid that directs production of tumorigenic crown gall growth on host plants. Transfer of the tumor-inducing T-DNA region of the Ti plasmid to a plant genome requires the Ti plasmid-encoded virulence genes as well as the T-DNA borders, which are a set of direct DNA repeats that delineate the region to be transferred. An Agrobacterium-based vector is a modified form of a Ti plasmid in which the tumor-inducing functions are replaced by the nucleic acid sequence of interest to be introduced into the plant host.

Agrobacterium-mediated transformation generally employs cointegrate vectors or, preferably, binary vector systems, in which the components of the Ti plasmid are divided between a helper vector, which resides permanently in the Agrobacterium host and carries the virulence genes, and a shuttle vector, which contains the gene of interest bounded by T-DNA sequences. A variety of binary vectors are well known in the art and are commercially available, for example, from Clontech (Palo Alto, Calif.). Methods of coculturing Agrobacterium with cultured plant cells or wounded tissue such as leaf tissue, root explants, hypocotyls, shoots or tubers are also well known in the art. See, e.g., Glick and Thompson (eds.), Methods in Plant Molecular Biology and Biotechnology, Boca Raton, Fla.: CRC Press (1993).

Agrobacterium-mediated transformation can be used to produce a variety of transgenic vascular plants (Wang et al., supra, 1995), including at least one species of Eucalyptus and forage legumes such as alfalfa (lucerne), birdsfoot trefoil, white clover, Stylosanthes, Lotononis bainesii and sainfoin.

Microprojectile-mediated transformation can also be used to produce a transgenic plant of the invention. This method, first described by Klein et al. (Nature 327:70-73 (1987)), relies on microprojectiles such as gold or tungsten that are coated with the desired nucleic acid molecule by precipitation with calcium chloride, spermidine or polyethylene glycol. The microprojectile particles are accelerated at high speed into angiosperm tissue using a device such as the BIOLISTIC PD-1000 (Biorad; Hercules, Calif.).

The nucleic acid of the invention is introduced into the plant in a manner that allows the nucleic acid to enter plant cells, for example, by an in vivo or ex vivo method. "In vivo" means that the nucleic acid is introduced into a living plant body, for example, by infiltration. "Ex vivo" means that cells or explants are modified outside the plant and then regenerated into a plant. A number of vectors suitable for stably transforming plant cells or for establishing transgenic plants have been described, including those in Weissbach and Weissbach (1989) Methods for Plant Molecular Biology, Academic Press; and Gelvin et al. (1990) Plant Molecular Biology Manual, Kluwer Academic Publishers. Specific examples include vectors derived from the Ti plasmid of Agrobacterium tumefaciens, as well as those described in Herrera-Estrella et al. (1983) Nature 303:209; Bevan (1984) Nucl. Acids Res. 12:8711-8721; and Klee (1985) Bio/Technology 3:637-642. Alternatively, non-Ti vectors can be used to transfer DNA into plants and cells by free DNA delivery techniques. Using these methods, transgenic plants such as wheat and rice (Christou (1991) Bio/Technology 9:957-) can be produced. Immature embryos can also be good target tissue for monocots in direct DNA delivery techniques that use a gene gun (Weeks et al. (1993) Plant Physiol. 102:1077-1084; Vasil (1993) Bio/Technology 10:667-674; Wan and Lemaux (1994) Plant Physiol. 104:37-48) and for Agrobacterium-mediated DNA transfer (Ishida et al. (1996) Nature Biotech. 14:745-750).

Exemplary methods for introducing DNA into the chloroplast are biolistic bombardment, polyethylene glycol transformation of protoplasts, and microinjection (Daniell et al., Nat. Biotech. 16:345-348, 1998; Staub et al., Nat. Biotech. 18:333-338, 2000; O'Neill et al., Plant J. 3:729-738, 1993; Knoblauch et al., Nat. Biotech. 17; U.S. Pat. No. 5,451,513; Svab et al., Proc. Natl. Acad. Sci. USA 90:913-917 (1993); and McBride et al., Proc. Natl. Acad. Sci. USA 91:7301-7305 (1994)). Any vector suitable for biolistic bombardment, polyethylene glycol transformation of protoplasts, or microinjection can be used as a targeting vector for chloroplast transformation. Any double-stranded DNA vector can be used as a transformation vector, particularly when the introduction method does not employ Agrobacterium.

Plants that may be genetically modified include cereals, pasture crops, fruits, vegetables, oilseed crops, palms, forestry plants and vines. Specific examples of plants that can be modified are as follows: corn, banana, peanut, purple pea, sunflower, tomato, canola, tobacco, wheat, barley, oat, potato, soybean, cotton, carnation, sorghum, lupin, and rice. Other examples include artemisia annua, or other plants known to produce isoprenoid compounds of interest.

The invention also provides transformed plant cells, tissues, plants and products comprising the transformed plant cells. The transformed cells of the invention, and tissues and products containing the cells, are characterized by the integration of a nucleic acid of the invention into the genome, and the ability of the plant cells to produce a polypeptide having terpene hydroxylase and/or terpene oxidase activity, e.g., sesquiterpene oxidase activity. The recombinant plant cells of the invention can be used as recombinant cell populations, or as tissues, seeds, whole plants, stems, fruits, leaves, roots, flowers, stems, tubers, grains, animal feeds, large pieces of plants, and the like.

The invention also provides propagation material for the transgenic plants of the invention, including seeds, progeny plants and vegetative propagation material.

Methods of producing isoprenoid compounds

The present invention provides methods of producing isoprenoid compounds. In some embodiments, the methods generally comprise culturing a genetically modified host cell in a suitable culture medium, wherein the host cell is genetically modified with a nucleic acid of the invention comprising a nucleotide sequence encoding an isoprenoid-modifying enzyme. In other embodiments, the methods generally comprise maintaining a transgenic plant of the invention under conditions conducive to the production of the encoded isoprenoid-modifying enzyme. Production of an isoprenoid-modifying enzyme results in production of an isoprenoid compound. For example, in some embodiments, the methods generally comprise culturing a genetically modified host cell in a suitable culture medium, wherein the host cell is genetically modified with a nucleic acid of the invention comprising a nucleotide sequence encoding a terpene oxidase. Production of terpene oxidase results in production of isoprenoid compounds. The method is generally performed in vitro, but it is also contemplated that the isoprenoid compound is produced in vivo. In some embodiments, the host cell is a eukaryotic cell, such as a yeast cell. In other embodiments, the host cell is a prokaryotic cell. In some embodiments, the host cell is a plant cell. In some embodiments, the method is performed in a transgenic plant of the invention.

Cells typically produce isoprenoids or isoprenoid precursors (e.g., IPP, polyprenyl diphosphate, etc.) in one of two ways. FIGS. 13-15 are used to illustrate the pathway by which cells produce isoprenoid compounds or precursors such as polyprenyl diphosphate.

FIG. 13 shows the isoprenoid pathways, in which prenyltransferases modify isopentenyl diphosphate (IPP) and/or its isomer dimethylallyl diphosphate (DMAPP) to produce geranyl diphosphate (GPP), farnesyl diphosphate (FPP) and geranylgeranyl diphosphate (GGPP). GPP and FPP are further modified by terpene synthases to produce monoterpenes and sesquiterpenes, respectively; GGPP is further modified by terpene synthases to produce diterpenes and carotenoids. IPP and DMAPP are produced by one of two pathways: the mevalonate (MEV) pathway and the 1-deoxy-D-xylulose-5-phosphate (DXP) pathway.
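
The relationships described for FIG. 13 can be summarized as a small lookup table. The Python sketch below is illustrative: the dictionary layout and function names are hypothetical, and the carbon counts (C10, C15, C20) are the standard chain lengths of GPP, FPP and GGPP, each a multiple of the C5 isoprene unit.

```python
# Prenyl diphosphates built from IPP/DMAPP and the terpene classes that
# terpene synthases produce from each of them, as described for FIG. 13.
PRENYL_DIPHOSPHATES = {
    "GPP": {"carbons": 10, "products": ["monoterpenes"]},
    "FPP": {"carbons": 15, "products": ["sesquiterpenes"]},
    "GGPP": {"carbons": 20, "products": ["diterpenes", "carotenoids"]},
}

# The two pathways that supply IPP and DMAPP.
PRECURSOR_PATHWAYS = ("MEV", "DXP")


def isoprene_units(name):
    """Number of C5 units in the named prenyl diphosphate."""
    return PRENYL_DIPHOSPHATES[name]["carbons"] // 5


def precursor_for(product_class):
    """Return the prenyl diphosphate from which a terpene class is made."""
    for name, info in PRENYL_DIPHOSPHATES.items():
        if product_class in info["products"]:
            return name
    raise KeyError(f"no precursor recorded for {product_class!r}")
```

For example, `precursor_for("sesquiterpenes")` returns `"FPP"`, which is why an FPP synthase is the relevant prenyltransferase when the target substrate is a sesquiterpene such as amorpha-4,11-diene.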

FIG. 14 schematically shows the MEV pathway, in which acetyl-CoA is converted to IPP through a series of reactions.

FIG. 15 schematically shows the DXP pathway, in which pyruvate and D-glyceraldehyde-3-phosphate are converted to IPP and DMAPP through a series of reactions. Eukaryotic cells other than plant cells use the MEV isoprenoid pathway exclusively to convert acetyl-coenzyme A (acetyl-CoA) to IPP, which is subsequently isomerized to DMAPP. Plants use both the MEV pathway and the mevalonate-independent, or DXP, pathway for isoprenoid synthesis. Prokaryotes, with some exceptions, use the DXP pathway to produce IPP and DMAPP separately through a branch point.

In some embodiments, a host cell is genetically modified with a nucleic acid of the invention comprising a nucleotide sequence encoding a sesquiterpene oxidase, and the host cell is cultured in a medium that comprises the sesquiterpene. The sesquiterpene enters the cell, where it is modified by the sesquiterpene oxidase. In many embodiments, the sesquiterpene is selected from amorphadiene, isolongifolene, (-)-α-trans-bergamotene, (-)-β-elemene, (+)-germacrene A, germacrene B, (+)-γ-gurjunene, (+)-ledene, neointermedeol, (+)-β-selinene, and (+)-valencene. In some embodiments, the sesquiterpene oxidase is an amorphadiene oxidase, and the host cell is cultured in a medium that comprises amorpha-4,11-diene.

In other embodiments, the host cell is also genetically modified with a nucleic acid comprising a nucleotide sequence encoding a terpene synthase. Thus, for example, a host cell is genetically modified with one or more nucleic acids comprising nucleotide sequences encoding terpene synthases and isoprenoid-modifying enzymes (e.g., sesquiterpene oxidases). Such host cells can be cultured in a suitable medium to produce terpene synthases and isoprenoid-modifying enzymes (e.g., sesquiterpene oxidases). For example, terpene synthases modify farnesyl pyrophosphate to produce sesquiterpene substrates for the sesquiterpene oxidases.

Depending on the medium in which the host cell is cultured, and on whether the host cell synthesizes IPP via the DXP pathway or the mevalonate pathway, the host cell in some embodiments comprises further genetic modifications. For example, in some embodiments, the host cell is one that does not have an endogenous mevalonate pathway, e.g., the host cell is one that does not normally synthesize IPP or mevalonate via the mevalonate pathway. For example, in some embodiments, the host cell is one that does not normally synthesize IPP via the mevalonate pathway, and the host cell is genetically modified with one or more nucleic acids comprising nucleotide sequences encoding two or more enzymes of the mevalonate pathway, an IPP isomerase, a prenyltransferase, a terpene synthase, and an isoprenoid-modifying enzyme (e.g., an isoprenoid-modifying enzyme encoded by a nucleic acid of the invention). Culturing such a host cell results in production of the mevalonate pathway enzymes, the IPP isomerase, the prenyltransferase, the terpene synthase, and the isoprenoid-modifying enzyme (e.g., a sesquiterpene oxidase), which in turn results in production of an isoprenoid compound. In many embodiments, the prenyltransferase is an FPP synthase, whose product is converted to the sesquiterpene substrate of a sesquiterpene oxidase encoded by a nucleic acid of the invention; production of the sesquiterpene oxidase results in oxidation of the sesquiterpene substrate in the host cell. Any nucleic acid encoding a mevalonate pathway enzyme, an IPP isomerase, a prenyltransferase, or a terpene synthase is suitable for use. For example, suitable nucleic acids are described in Martin et al. (2003) supra.

In some of the above embodiments, when the host cell is genetically modified with one or more nucleic acids comprising nucleotide sequences encoding two or more mevalonate pathway enzymes, including MK, PMK, and MPD, the host cell is cultured in a medium comprising mevalonate. In other embodiments, the two or more mevalonate pathway enzymes comprise acetoacetyl CoA thiolase, HMGS, HMGR, MK, PMK, and MPD.

In some embodiments, the host cell is a host cell that does not normally synthesize IPP via the mevalonate pathway, the host cell being genetically modified as described above, the host cell further comprising a functionally impaired DXP pathway.

In some embodiments, the host cell is genetically modified with a nucleic acid comprising a nucleotide sequence encoding a Cytochrome P450 Reductase (CPR). Various CPR nucleotide sequences are known and any known CPR-encoding nucleic acid can be used, provided that the encoded CPR has the activity of transferring electrons from NADPH. In some embodiments, the CPR-encoding nucleic acid encodes a CPR that is capable of transferring electrons from NADPH to an isoprenoid-modifying enzyme, such as a sesquiterpene oxidase, encoded by the isoprenoid-modifying enzyme-encoding nucleic acid of the present invention. In some embodiments, the CPR-encoding nucleic acid is a CPR nucleic acid of the invention.

The methods of the invention can be used to produce a variety of isoprenoid compounds, including, but not limited to: artemisinic acid (e.g., when the sesquiterpene substrate is amorpha-4,11-diene); isolongifolene alcohol (e.g., when the substrate is isolongifolene); (E)-trans-bergamota-2,12-dien-14-ol (e.g., when the substrate is (-)-α-trans-bergamotene); (-)-elema-1,3,11(13)-trien-12-ol (e.g., when the substrate is (-)-β-elemene); germacra-1(10),4,11(13)-trien-12-ol (e.g., when the substrate is (+)-germacrene A); germacrene B alcohol (e.g., when the substrate is germacrene B); 5,11(13)-guaiadiene-12-ol (e.g., when the substrate is (+)-γ-gurjunene); ledene alcohol (e.g., when the substrate is (+)-ledene); 4β-H-eudesm-11(13)-ene-4,12-diol (e.g., when the substrate is neointermedeol); (+)-β-costol (e.g., when the substrate is (+)-β-selinene); and the like; and derivatives of any of the foregoing.

In some embodiments, a genetically modified host cell of the invention is cultured in a suitable medium (e.g., Luria-Bertani broth, optionally supplemented with one or more additional agents, such as an inducer when the nucleotide sequence encoding the isoprenoid-modifying enzyme is under the control of an inducible promoter), and the medium is overlaid with an organic solvent, e.g., dodecane, forming an organic layer. The isoprenoid compound produced by the genetically modified host cell partitions into the organic layer, from which it can be purified. In some embodiments, when the nucleotide sequence encoding the isoprenoid-modifying enzyme is operably linked to an inducible promoter, an inducer is added to the medium; after a suitable time, the isoprenoid compound is isolated from the organic layer overlaid on the medium.

In some embodiments, the isoprenoid compound is separated from other products that may be present in the organic layer. It is not difficult to separate the isoprenoid compound from other products that may be present in the organic layer using, for example, standard chromatographic methods.

In some embodiments, the isoprenoid compounds synthesized by the methods of the present invention are further chemically modified in a cell-free reaction. For example, in some embodiments, artemisinic acid is isolated from the culture medium and/or cell lysate and further modified chemically in a cell-free reaction to produce artemisinin.

In some embodiments, the isoprenoid compound is pure, e.g., at least about 40%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 98%, or more than 98% pure, where "pure," in the context of an isoprenoid compound, refers to an isoprenoid compound that is free of other isoprenoid compounds, macromolecules, contaminants, and the like.

Examples

The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use the present invention, and are not intended to limit the scope of what the inventors regard as their invention, nor are they intended to represent that the experiments below are all or the only experiments performed. Efforts have been made to ensure accuracy with respect to numbers used (e.g., amounts, temperature, etc.), but some experimental error and deviation should be accounted for. Unless otherwise indicated, parts are parts by weight, molecular weight is weight average molecular weight, temperature is in degrees Celsius, and pressure is at or near atmospheric. Standard abbreviations may be used, for example: bp, base pair(s); kb, kilobase(s); pl, picoliter(s); s or sec, second(s); min, minute(s); h or hr, hour(s); aa, amino acid(s); nt, nucleotide(s); i.m., intramuscular(ly); i.p., intraperitoneal(ly); s.c., subcutaneous(ly); and the like.

Example 1: cloning and sequencing of isoprenoid-modifying enzymes

Most enzymes known to hydroxylate terpenes are cytochrome P450s. All available amino acid sequences of terpene hydroxylases were aligned with the amino acid sequences of cytochrome P450s from sunflower and lettuce. These two plants belong to the family Asteraceae, to which Artemisia annua also belongs. Isoprenoid-modifying enzymes such as the CYP71D family clustered together, suggesting a common ancestor. Degenerate polymerase chain reaction (PCR) primers were designed to amplify genes of the Asteraceae CYP71D family.

Cloning of CYP71AV1 (also known as CYP71D-A4 or AMO) and CPR cDNA. A cDNA library was prepared with the Super SMART PCR cDNA Synthesis Kit (BD Bioscience) from 50 ng of total RNA purified from Artemisia annua trichome-enriched cells. Degenerate P450 primers were designed from conserved amino acid motifs of the lettuce and sunflower CYP71 subfamilies: primer 1 from [Y/Q]G[E/D][H/Y]WR (forward) and primer 2 from FIPERF (reverse). Table I provides sequence information for the primers.
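
To make the degenerate motif notation concrete, the following Python sketch enumerates the peptide variants covered by a motif such as [Y/Q]G[E/D][H/Y]WR. It is an illustration only: the function name is hypothetical, and the sketch works at the amino acid level rather than reproducing the nucleotide primer design used in this example.

```python
import re
from itertools import product


def expand_motif(motif):
    """List every peptide matched by a bracketed amino acid motif.

    A bracketed group such as [Y/Q] means "Y or Q at this position";
    any other letter is a fixed residue.
    """
    tokens = re.findall(r"\[[A-Z/]+\]|[A-Z]", motif)
    # Each position becomes a string of the residues allowed there.
    positions = [tok.strip("[]").replace("/", "") for tok in tokens]
    return ["".join(combo) for combo in product(*positions)]


forward_variants = expand_motif("[Y/Q]G[E/D][H/Y]WR")
reverse_variants = expand_motif("FIPERF")
```

The forward motif has three two-way positions, so it covers 2 × 2 × 2 = 8 peptides, while the reverse motif is a single fixed peptide. A degenerate primer must additionally account for codon degeneracy at each residue, which this sketch does not model.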

PCR using these primers and Artemisia annua cDNA yielded a 1-kb DNA fragment. The PCR program used was 7 cycles with an annealing temperature of 48 °C followed by 27 cycles with an annealing temperature of 55 °C. The deduced amino acid sequence of the amplified gene fragment was 85% and 88% identical to the sunflower (QH_CA_Contig1442) and lettuce (QG_CA_Contig7108) contigs, respectively. The Asteraceae EST database can be found at cgpdb.ucdavis.edu. The Artemisia annua CPR fragment was isolated using a forward primer (primer 3) and a reverse primer (primer 4) designed from the conserved QYEHFNKI (SEQ ID NO: 32) and CGDAKGMA (SEQ ID NO: 33) motifs, respectively. The PCR program used was 30 cycles with an annealing temperature of 50 °C. The 5'- and 3'-end sequences of CYP71AV1 ("CYP71D-A4") and CPR were determined using the RLM-RACE kit (Ambion), and the full-length cDNAs were then recovered from Artemisia annua leaf cDNA. The open reading frames of CYP71AV1 and CPR were PCR amplified and ligated into the SpeI and BamHI/SalI sites of pESC-URA (Stratagene) in frame with the FLAG and cMyc tags, respectively. Primers 5 and 6 were used for PCR amplification of CYP71AV1; primers 7 and 8 were used for PCR amplification of CPR. The PCR program used was 35 cycles with an annealing temperature of 55 °C. All clones were sequenced to verify their sequences.

Plant extract analysis. Artemisia annua leaves (100-200 mg fresh weight) were vigorously shaken for 2 hours in 1 mL of hexane spiked with 5.8 μM octadecane as an internal standard. The hexane extract was concentrated to 200 μL, and 1-μL samples were subjected to GC-MS analysis using a DB-XLB column (0.25 mm i.d. × 0.25 μm × 30 m, J & W Scientific) to determine the artemisinin content of 14 plant samples as described previously (Woerdenbag et al. (1991) Phytochem. Anal. 2, 215-). The GC oven program used was a ramp from 100 °C to 250 °C at 5 °C/min. To determine artemisinic acid content, plant hexane extracts were derivatized with TMS-diazomethane and analyzed by GC-FID equipped with a DB5 column (n = 8). The GC oven program used was 80 °C (hold for 2 minutes), a ramp to 140 °C at 20 °C/min, and then product separation with a ramp to 220 °C at 5 °C/min. Artemisinin standards were purchased from Sigma-Aldrich (St. Louis, MO).
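The two oven programs and the concentration step above reduce to simple arithmetic. The following Python is a rough sketch, assuming that each program consists only of the holds and ramps stated in the text (no additional final hold or cool-down); the function name run_time_min and the step encoding are hypothetical.

```python
def run_time_min(start_c, steps):
    """Total run time in minutes for a GC oven program.

    steps is a list of ("hold", minutes) or ("ramp", target_c, rate_c_per_min)
    tuples applied in order, starting from start_c.
    """
    temp = start_c
    total = 0.0
    for step in steps:
        if step[0] == "hold":
            total += step[1]
        else:
            _, target_c, rate = step
            total += abs(target_c - temp) / rate
            temp = target_c
    return total

# GC-MS program: 100 C to 250 C at 5 C/min.
gcms_min = run_time_min(100, [("ramp", 250, 5)])

# GC-FID program: 80 C held 2 min, then 20 C/min to 140 C, then 5 C/min to 220 C.
gcfid_min = run_time_min(80, [("hold", 2), ("ramp", 140, 20), ("ramp", 220, 5)])

# Concentrating 1 mL (1000 uL) of hexane extract to 200 uL.
concentration_factor = 1000 / 200

print(gcms_min, gcfid_min, concentration_factor)
```

On these assumptions the GC-MS ramp takes 30 minutes, the GC-FID program takes 21 minutes, and the extract is concentrated five-fold before injection.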

Synthesis of artemisinic alcohol. Artemisinic acid (100.0 mg, 0.43 mmol) was dissolved in THF (10.0 mL), and LiAlH₄ (17.0 mg, 0.45 mmol) was added. The heterogeneous mixture was held at reflux (70 °C) for 15 hours. After cooling, the reaction was quenched with water (3.0 mL) and 15% aqueous NaOH (3.0 mL), stirred for 10 minutes, and filtered through celite. The organic phase was separated, dried over MgSO₄, and concentrated on a rotary evaporator. The product was purified by column chromatography (2:1 hexanes/EtOAc) to give 61.0 mg (65% yield) of the alcohol as a colorless oil. A small amount of artemisinic acid contaminant was removed by further column chromatography on neutral alumina (Brockmann activity I). The characterization data were consistent with literature values.
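The molar amount and percent yield quoted for this reduction can be checked with a short calculation. The Python below is a sketch under stated assumptions: the molecular formulas C15H22O2 (artemisinic acid) and C15H24O (the alcohol) are not given in the text and are supplied here, and the reduction is taken to be 1:1 in the substrate.

```python
# Standard atomic weights in g/mol, rounded to three decimals.
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "O": 15.999}

def molar_mass(formula):
    """Molar mass in g/mol for a formula given as {element: count}."""
    return sum(ATOMIC_WEIGHT[element] * count for element, count in formula.items())

# Assumed molecular formulas (not stated in the text).
ARTEMISINIC_ACID = {"C": 15, "H": 22, "O": 2}
ARTEMISINIC_ALCOHOL = {"C": 15, "H": 24, "O": 1}

acid_mg = 100.0     # starting material, from the text
alcohol_mg = 61.0   # isolated product, from the text

acid_mmol = acid_mg / molar_mass(ARTEMISINIC_ACID)
theoretical_mg = acid_mmol * molar_mass(ARTEMISINIC_ALCOHOL)
percent_yield = 100 * alcohol_mg / theoretical_mg

print(f"{acid_mmol:.2f} mmol acid, {theoretical_mg:.1f} mg theoretical, "
      f"{percent_yield:.0f}% yield")
```

With these formulas the calculation gives about 0.43 mmol of starting acid and a yield of about 65%, in agreement with the figures reported in the procedure.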

Synthesis of artemisinic aldehyde. Artemisinic alcohol was oxidized to artemisinic aldehyde according to a literature procedure (Sharpless et al., Tetrahedron Letters 17, 2503-). To a flame-dried 10-mL flask containing RuCl₂(PPh₃)₃ (17.0 mg, 0.018 mmol) and N-methylmorpholine N-oxide (60.0 mg, 0.51 mmol) under an argon atmosphere was added acetone (4.0 mL). To this solution was added, by syringe, artemisinic alcohol (55.0 mg, 0.25 mmol) dissolved in acetone (1.0 mL). The mixture was stirred at 23 °C for 2 hours and concentrated in vacuo. The crude product was purified by column chromatography (4:1 hexane/EtOAc) to give 32.0 mg (59% yield) of artemisinic aldehyde as a colorless oil. The characterization data were consistent with literature reports.

Production and characterization of EPY strains

Chemicals. Dodecane and caryophyllene were purchased from Sigma-Aldrich (St. Louis, MO). 5-Fluoroorotic acid (5-FOA) was purchased from Zymo Research (Orange, CA). Complete supplement mixtures for formulating synthetic defined (SD) media were purchased from Qbiogene (Irvine, CA). All other media components were purchased from either Sigma-Aldrich or Becton, Dickinson (Franklin Lakes, NJ).

Strains and media. Escherichia coli strains DH10B and DH5α were used for bacterial transformation and plasmid amplification in the construction of the expression plasmids used in this study. The strains were cultured at 37 °C in Luria-Bertani (LB) medium containing 100 mg L⁻¹ ampicillin, except that DH5α carrying pδ-UB-based plasmids was grown in medium containing 50 mg L⁻¹ ampicillin.

Saccharomyces cerevisiae strain BY4742 (Brachmann et al., Yeast 14, 115-132 (1998)), a derivative of S288C, was used as the parent strain for all yeast strains. This strain was cultured in rich YPD medium (Burke et al., Methods in Yeast Genetics: A Cold Spring Harbor Laboratory Manual (Cold Spring Harbor Laboratory Press, Plainview, NY, 2000)). Engineered yeast strains were cultured in SD medium (Burke et al., supra) with leucine, uracil, histidine, and/or methionine omitted where appropriate. To induce genes expressed from the GAL1 promoter, S. cerevisiae strains were cultured with 2% galactose as the sole carbon source.

Plasmid construction. To create plasmid pRS425ADS, which expresses ADS from the GAL1 promoter, ADS was PCR amplified from pADS (Martin et al., Nat. Biotechnol. 21, 796-802 (2003)) using primer pair 9 and 10 (Table 1). With these primers, the nucleotide sequence 5'-AAAACA-3' was cloned immediately upstream of the start codon of ADS. This consensus sequence was used for efficient translation of ADS and the other galactose-inducible genes used in this study. The amplified product was cleaved with SpeI and HindIII and cloned into SpeI- and HindIII-digested pRS425GAL1 (Mumberg et al., Nucleic Acids Research 22, 5767-5768 (1994)).
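The layout of such a forward primer can be sketched in a few lines of Python. This is an illustration only and does not reproduce primer 9 from Table 1: the two-base 5' clamp, the function names, and the placeholder ORF bases (written as N) are all hypothetical. The only sequence elements taken as given are the 5'-AAAACA-3' context from the text and the standard SpeI recognition site ACTAGT.

```python
SPEI_SITE = "ACTAGT"          # SpeI recognition sequence
UPSTREAM_CONTEXT = "AAAACA"   # sequence placed just upstream of the start codon

def build_forward_primer(orf_start, clamp="GG"):
    """Assemble clamp + SpeI site + AAAACA + the first bases of the ORF.

    orf_start must begin with the ATG start codon. The clamp is a
    hypothetical spacer added 5' of the restriction site.
    """
    if not orf_start.startswith("ATG"):
        raise ValueError("ORF fragment must begin with ATG")
    return clamp + SPEI_SITE + UPSTREAM_CONTEXT + orf_start

def context_precedes_start(seq):
    """True if AAAACA sits immediately 5' of the first ATG in seq."""
    idx = seq.find("ATG")
    n = len(UPSTREAM_CONTEXT)
    return idx >= n and seq[idx - n:idx] == UPSTREAM_CONTEXT

# Placeholder ORF bases (N = any base); the real ADS sequence is not shown here.
primer = build_forward_primer("ATG" + "N" * 18)
print(primer)
print(context_precedes_start(primer))
```

The check function makes the design constraint explicit: a primer that carries the restriction site but not the AAAACA context directly before the ATG would fail it.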

For integration of an expression cassette for tHMGR, plasmid pδ-HMGR was constructed. First, SacII restriction sites were introduced into pRS426GAL1 (Mumberg et al., supra) at the 5' end of the GAL1 promoter and the 3' end of the CYC1 terminator. To accomplish this, the promoter-multiple cloning site-terminator cassette of pRS426GAL1 was PCR amplified using primer pair 11 and 12. The amplified product was cloned directly into PvuII-digested pRS426GAL1 to construct vector pRS426-SacII. The catalytic domain of HMG1 was PCR amplified from plasmid pRH127-3 (Donald et al., Appl. Environ. Microbiol. 63, 3341-3344 (1997)) using primer pair 13 and 14. The amplified product was cleaved with BamHI and SalI and cloned into BamHI- and XhoI-digested pRS426-SacII to generate pRS-HMGR. pRS-HMGR was cleaved with SacII, and the expression cassette fragment was gel extracted and cloned into SacII-digested pδ-UB (Lee et al., Biotechnol. Prog. 13, 368-).

The UPC2-1 allele of UPC2 was PCR amplified from plasmid pBD33 using primer pair 15 and 16. The amplified product was cleaved with BamHI and SalI and cloned into BamHI- and XhoI-digested pRS426-SacII to generate plasmid pRS-UPC2. For integration of UPC2-1, pδ-UPC2 was generated in the same manner by digesting pRS-UPC2 with SacII and transferring the appropriate fragment to pδ-UB.

To replace the ERG9 promoter with the MET3 promoter, plasmid pRS-ERG9 was constructed. Plasmid pRH973 (Gardner et al., J. Biol. Chem. 274, 31671-31678 (1999)) contains a truncated 5' segment of ERG9 placed behind the MET3 promoter. pRH973 was cleaved with ApaI and ClaI, and the fragment was cloned into ApaI- and ClaI-digested pRS403 (which carries the HIS3 selectable marker) (Sikorski et al., Genetics 122, 19-27 (1989)).

For expression of ERG20, plasmid pδ-ERG20 was constructed. Plasmid pRS-SacII was first digested with SalI and XhoI, which generate compatible cohesive ends. The plasmid was then self-ligated, eliminating the SalI and XhoI sites and yielding plasmid pRS-SacII-DX. ERG20 was PCR amplified from BY4742 genomic DNA using primer pair 17 and 18. The amplified product was cleaved with SpeI and SmaI and cloned into SpeI- and SmaI-digested pRS-SacII-DX to generate pRS-ERG20. pRS-ERG20 was then cleaved with SacII, and the expression cassette fragment was gel extracted and cloned into SacII-digested pδ-UB.

Yeast transformation and strain construction. Saccharomyces cerevisiae strain BY4742 (Brachmann et al., supra), a derivative of S288C, was used as the parent strain for all S. cerevisiae strains. All S. cerevisiae strains were transformed by the standard lithium acetate method (Gietz, R.D. and Woods, R.A., in Guide to Yeast Genetics and Molecular and Cell Biology, Part B, 87-96 (Academic Press Inc., San Diego, 2002)). For each transformation, 3-10 colonies were screened to select the transformant with the highest amorphadiene production. Strain BY4742 was transformed with plasmid pRS425ADS and selected on SD-LEU plates to construct strain EPY201. Plasmid pδ-HMGR was digested with XhoI, and the DNA was then transformed into strain EPY201. After initial selection on SD-LEU-URA plates, transformants were cultured and plated on SD-LEU plates containing 1 g L⁻¹ 5-FOA to select for loss of the URA3 marker. The resulting uracil auxotroph, EPY208, was then transformed with XhoI-digested pδ-UPC2 plasmid DNA. After initial selection on SD-LEU-URA plates, transformants were cultured and plated on SD-LEU plates containing 1 g L⁻¹ 5-FOA to construct EPY210. Plasmid pRS-ERG9 was cleaved with HindII to integrate the P_MET3-ERG9 fusion at the ERG9 locus of EPY208 and EPY210, constructing EPY213 and EPY225, respectively. These strains were selected on SD-LEU-HIS-MET plates. EPY213 was then transformed with XhoI-digested pδ-HMGR plasmid DNA. After initial selection on SD-LEU-URA-HIS-MET plates, transformants were cultured and plated on SD-LEU-HIS-MET plates containing 1 g L⁻¹ 5-FOA to construct EPY219. EPY219 was transformed with XhoI-digested pδ-ERG20 plasmid DNA. After initial selection on SD-LEU-URA-HIS-MET plates, transformants were cultured and plated on SD-LEU-HIS-MET plates containing 1 g L⁻¹ 5-FOA to construct EPY224.
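Because each EPY strain is built from an earlier one, the construction series forms a small tree that is easy to lose track of in prose. The Python below is a sketch that records the parent and the modification for each strain exactly as the paragraph above states them; the dictionary layout and function names are hypothetical, and the short modification labels are paraphrases rather than genotype notation from the source.

```python
# child strain -> (parent strain, modification introduced at that step),
# following the parent/child relationships as written in the paragraph above.
CONSTRUCTION = {
    "EPY201": ("BY4742", "pRS425ADS plasmid"),
    "EPY208": ("EPY201", "p-delta-HMGR integration"),
    "EPY210": ("EPY208", "p-delta-UPC2 integration"),
    "EPY213": ("EPY208", "P_MET3-ERG9 at the ERG9 locus"),
    "EPY225": ("EPY210", "P_MET3-ERG9 at the ERG9 locus"),
    "EPY219": ("EPY213", "second p-delta-HMGR integration"),
    "EPY224": ("EPY219", "p-delta-ERG20 integration"),
}

def lineage(strain):
    """Return strain names from the parental strain down to the given strain."""
    path = [strain]
    while path[-1] in CONSTRUCTION:
        path.append(CONSTRUCTION[path[-1]][0])
    return path[::-1]

def modifications(strain):
    """Return the modifications accumulated along the lineage, oldest first."""
    return [CONSTRUCTION[name][1] for name in lineage(strain)[1:]]

for name in ("EPY224", "EPY225"):
    print(name, "<-", " <- ".join(reversed(lineage(name)[:-1])))
    for change in modifications(name):
        print("   +", change)
```

Walking the table rather than rereading the paragraph makes it straightforward to list which integrations a given strain should carry before verifying them by PCR.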

Integration of pRS-ERG9 was verified by PCR analysis using two sets of primers. Each set contained one oligonucleotide that binds to the inserted DNA and one that binds to the genomic DNA flanking the insertion. All other integrations were verified for a full-length insertion using primers that bind to the 5' end of the GAL1 promoter and the 3' end of the fused gene.

Yeast cultivation. Optical density at 600 nm (OD₆₀₀) was measured with a Beckman DU-640 spectrophotometer. To measure amorphadiene production, culture tubes containing 5 mL of SD (2% galactose) medium (with the appropriate amino acid omissions described above) were inoculated with the strains of interest. These inocula were grown at 30 °C to an OD₆₀₀ of 1-2. These seed cultures were used to inoculate unbaffled flasks (250 mL) containing 50 mL of SD medium to an OD₆₀₀ of 0.05. Amorphadiene production was measured after 6 days of growth. 1 mM methionine was present in each culture to repress the P_MET3-ERG9 fusion at the ERG9 locus. All flasks also contained 5 mL of dodecane. The dodecane layer was sampled and diluted in ethyl acetate, and amorphadiene production was determined by GC-MS.
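The inoculation step above is a dilution calculation. As a small worked sketch in Python, assuming that OD₆₀₀ scales linearly with cell density over this range and that the 50 mL is the final culture volume (both assumptions, as is the function name), the seed volume needed for a starting OD₆₀₀ of 0.05 is:

```python
def inoculum_volume_ml(seed_od, target_od, final_volume_ml):
    """Seed-culture volume that gives target_od in final_volume_ml.

    Uses C1 * V1 = C2 * V2 with OD600 standing in for concentration.
    """
    return target_od * final_volume_ml / seed_od

# Seed cultures were grown to OD600 of 1-2; flasks hold 50 mL at OD600 0.05.
volume_at_od1 = inoculum_volume_ml(1.0, 0.05, 50)
volume_at_od2 = inoculum_volume_ml(2.0, 0.05, 50)

# Dodecane overlay relative to the aqueous culture volume.
overlay_percent = 100 * 5 / 50

print(volume_at_od1, volume_at_od2, overlay_percent)
```

On these assumptions a 5-mL seed tube at OD₆₀₀ 1-2 supplies the 1.25-2.5 mL needed per flask, and the dodecane overlay corresponds to 10% of the culture volume.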

Results

Artemisinin is produced in glandular trichomes, which are specialized plant cells. Glandular trichome cells were isolated from Artemisia annua, and RNA was extracted from these cells. Using degenerate primers, a partial cDNA of a novel gene, designated CYP71D-A4, was isolated. The full-length gene was recovered by rapid amplification of cDNA ends (RACE). The nucleotide sequence of the cDNA coding region is shown in FIG. 1 (SEQ ID NO: 1); the translated amino acid sequence is shown in FIG. 2 (SEQ ID NO: 2).

The full-length CYP71D-A4 cDNA was expressed in yeast cells. To assay for amorphadiene oxidase activity, CYP71D-A4 was placed under the transcriptional control of the Gal10 promoter in the backbone plasmid pESC-URA (Stratagene), in which the CPR gene of Artemisia annua (AACPR; FIG. 3; the amino acid sequence of the encoded protein is shown in FIG. 4) was expressed from the Gal1 promoter. The AACPR gene was obtained from Artemisia annua glandular trichome mRNA by degenerate-primer PCR and RACE.

To perform an in vivo assay for amorpha-4,11-diene oxidase activity, this plasmid (p71D-A4/CPR::pESC-URA) and a control plasmid lacking the CYP71D-A4 gene were transformed into S. cerevisiae cells engineered to produce amorpha-4,11-diene. Briefly, these cells are strain BY4742 carrying an integrated gene encoding a truncated HMGCoA reductase, which is soluble in yeast. These cells carry pRS425ADS, which contains a codon-optimized ADS gene under the control of the GAL1 promoter. Transformed cells were cultured in synthetic medium lacking leucine and uracil and induced with 2% galactose for 29 hours, and the medium was extracted with ether. The extract was concentrated, and 1 μL was analyzed by gas chromatography-mass spectrometry (GC-MS) on an instrument equipped with an EXL column, using a temperature program that ramped from 50 °C to 250 °C at 5 °C/min. Artemisinic alcohol and artemisinic aldehyde were synthesized from artemisinic acid and used as authentic standards. In this assay, two peaks were detected from cells expressing CPR and CYP71D-A4 but not from control cells expressing CPR alone. By comparison of retention times and mass spectra with those of the standards, these peaks were determined to correspond to artemisinic alcohol and artemisinic aldehyde. No artemisinic acid was detected; because of its low volatility, artemisinic acid is not expected to be detectable by GC without derivatization.

An in vivo feeding assay for amorpha-4,11-diene oxidase activity was also performed, in which the same two plasmids were individually transformed into wild-type S. cerevisiae strain YPH499. Yeast cells were cultured in 50 mL of uracil-dropout medium containing 2% dextrose and induced with 2% galactose for 24 hours. Induced yeast cells (5 mL) were collected by centrifugation and resuspended in fresh medium containing 150 μM amorpha-4,11-diene, artemisinic alcohol, or artemisinic aldehyde. The yeast cells were then cultured at 30 °C for 5 hours. The medium was extracted with ether and then derivatized with N-(tert-butyldimethylsilyl)-N-methyltrifluoroacetamide so that any artemisinic acid could be detected by GC-MS. The artemisinic alcohol and artemisinic acid standards were derivatized in the same way. A 1-μL aliquot of each derivatized control and sample was analyzed by GC-MS. The temperature program used was a ramp from 50 °C to 250 °C at 5 °C/min.

When cells were fed amorpha-4,11-diene, significant accumulation of artemisinic acid, together with small amounts of artemisinic alcohol and artemisinic aldehyde, was detected only in yeast cells expressing both CPR and CYP71D-A4 (FIG. 5A). When cells were fed artemisinic alcohol or artemisinic aldehyde, the relative accumulation of artemisinic acid in the medium was higher for yeast cells transformed with CPR/CYP71D-A4 than for the control strain transformed with CPR only (FIGS. 5B and 5C).

FIGS. 5A-C. Amorphadiene (FIG. 5A) and two other artemisinin intermediates, artemisinic alcohol (FIG. 5B) and artemisinic aldehyde (FIG. 5C), were added at a concentration of 150 μM to the medium of yeast cells transformed with CPR only (upper chromatogram) or with CPR and CYP71D-A4 (lower chromatogram) that had been induced with 2% galactose. Arrows indicate amorphadiene (1), artemisinic alcohol (2), artemisinic aldehyde (3), and artemisinic acid (4). Artemisinic alcohol (2) and artemisinic acid (4) were detected after derivatization with N-(tert-butyldimethylsilyl)-N-methyltrifluoroacetamide. Asterisks indicate the substrate added to the medium.

The presence of derivatized artemisinic acid in the samples was confirmed with an authentic artemisinic acid standard (FIGS. 6A and 6B). These data indicate that the cytochrome P450 enzyme encoded by the CYP71D-A4 clone catalyzes the first hydroxylation, followed by oxidative conversion of artemisinic alcohol to artemisinic aldehyde and of artemisinic aldehyde to artemisinic acid, most likely through the action of the recombinant CYP71D-A4 enzyme together with endogenous yeast oxidation activities.

FIGS. 6A and 6B. The mass spectrum and retention time of the novel compound produced after feeding amorphadiene to CPR/71D-A4-transformed yeast cells are shown in FIG. 6A, and the mass spectrum and retention time of the artemisinic acid standard are shown in FIG. 6B. Products and standards were detected by GC-MS after derivatization, which increases the base molecular weight by 114 mass units.

For de novo synthesis of artemisinic acid from a simple sugar such as galactose in engineered yeast, EPY224 was genetically modified with pESC-URA carrying CPR ("AACPR") and AMO ("CYP71D-A4") (pESC-URA::AACPR/AMO). The construct encoding the truncated yeast HMGCoA reductase was integrated twice into yeast strain BY4742. The transcription factor upc2-1 was overexpressed to increase the transcription level of several genes in the ergosterol biosynthesis pathway. The squalene synthase gene (ERG9) was down-regulated with the methionine-repressible promoter MET3. FPP synthase was overexpressed from the GAL1 promoter, and ADS was also overexpressed from the GAL1 promoter in the pRS425 backbone. Yeast strain EPY224 carrying pESC-URA::AACPR/AMO was cultured at 30 °C in synthetic medium containing 1.8% galactose and 0.2% glucose. The yeast cells were pelleted by centrifugation and washed with alkaline buffer (Tris buffer, pH 9). The buffer was acidified to pH 2 by addition of HCl, and the acidified buffer was extracted with ethyl acetate. TMS-diazomethane and methanol were added to the ethyl acetate fraction to derivatize artemisinic acid. Artemisinic acid was detected as its methyl ester by GC-MS.

FIGS. 7A-7C depict de novo production of artemisinic acid in yeast expressing AACPR and AMO. In contrast, no artemisinic acid was detected in the control yeast strain expressing only AACPR. The new peak at 13.62 minutes (FIG. 7A, peak 1) shows the same mass fragmentation pattern as authentic artemisinic acid from Artemisia annua (FIGS. 7B and 7C).

FIGS. 8A-8C depict in vitro AMO enzyme assays. Microsomes were isolated from S. cerevisiae YPH499 expressing AACPR or CPR/AMO. Chromatographic peaks of the substrates used are indicated by asterisks. For each enzyme assay, 10 μM amorphadiene (a), 25 μM artemisinic alcohol (b), or 25 μM artemisinic aldehyde (c) was used. The ether-extracted fractions were derivatized and analyzed by GC-MS in selected ion mode (m/z: 121, 189, 204, 218, 220, and 248). The enzyme products were: 1, artemisinic alcohol [retention time (Rt) = 13.20]; 2, artemisinic aldehyde (Rt = 11.79); 3, artemisinic acid (Rt = 13.58, detected as the methyl ester).
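Several of the monitored m/z values in this legend line up with the nominal molecular masses of the substrate and products. The Python below is a sketch of that check; the molecular formulas are assumptions supplied here (they are not stated in the text), and the assignment of each ion to a molecular ion is an inference from the arithmetic rather than something the legend states.

```python
# Nominal (integer) masses of the most abundant isotopes.
NOMINAL_MASS = {"C": 12, "H": 1, "O": 16}

# Assumed molecular formulas (not stated in the text).
COMPOUNDS = {
    "amorpha-4,11-diene": {"C": 15, "H": 24},
    "artemisinic alcohol": {"C": 15, "H": 24, "O": 1},
    "artemisinic aldehyde": {"C": 15, "H": 22, "O": 1},
    "artemisinic acid methyl ester": {"C": 16, "H": 24, "O": 2},
}

# Ions monitored in selected ion mode, from the figure legend.
SIM_IONS = {121, 189, 204, 218, 220, 248}

def nominal_mass(formula):
    """Nominal molecular mass for a formula given as {element: count}."""
    return sum(NOMINAL_MASS[element] * count for element, count in formula.items())

masses = {name: nominal_mass(formula) for name, formula in COMPOUNDS.items()}
unassigned = sorted(SIM_IONS - set(masses.values()))

for name, mass in masses.items():
    print(f"{name}: {mass} (monitored: {mass in SIM_IONS})")
print("ions not matching a molecular mass:", unassigned)
```

On these formulas, 204, 220, 218, and 248 correspond to amorphadiene, the alcohol, the aldehyde, and the methyl ester of the acid, respectively. The remaining ions, 121 and 189, are presumably fragment ions; 189 equals 204 minus 15, which would be consistent with loss of a methyl group from amorphadiene.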

FIG. 9 depicts the nucleotide sequence of a cDNA clone, designated 71D-B1 (also referred to as "AMH", for amorphadiene hydroxylase), that encodes a terpene hydroxylase.

FIG. 10 depicts the amino acid sequence of the protein encoded by 71D-B1 (AMH).

FIGS. 11A-C depict the hydroxylation activity of the recombinant enzyme encoded by the AMH clone (71D-B1). The peak at 16.82 minutes in panel A is artemisinic acid, produced when AMO is expressed in yeast overexpressing HMGCoA; the peak at 18.50 minutes in panel B is hydroxylated amorphadiene, produced when AMH and AACPR are overexpressed in yeast overexpressing HMGCoA. The mass fragmentation pattern of hydroxylated amorphadiene is shown in FIG. 11C. The parent ion peak (220) of hydroxylated amorphadiene is shown, as are other ion fragments typical of sesquiterpenes and terpenes (e.g., 93, 119, 132, 145, 159, and 177).

FIG. 12 depicts the nucleotide sequence of genomic DNA encoding terpene hydroxylase/oxidase.

While the invention has been described with reference to specific embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the true spirit and scope of the invention. In addition, many modifications may be made to adapt a particular situation, material, composition of matter, process step or steps, to the objective, concept and scope of the present invention. All such modifications are intended to fall within the scope of the appended claims.

Claims

1. An isolated polynucleotide encoding an enzyme that modifies an isoprenoid compound, wherein the polynucleotide encodes a polypeptide having the amino acid sequence set forth in SEQ ID NO: 2, and wherein the polypeptide exhibits amorpha-4,11-diene oxidase activity.

2. The polynucleotide of claim 1, wherein said polynucleotide consists of the nucleotide sequence set forth in SEQ ID NO: 1.

3. A recombinant vector comprising the polynucleotide of claim 1.

4. A host cell comprising the polynucleotide of claim 1.

5. A host cell comprising the recombinant vector of claim 3.

6. A method of producing an isoprenoid compound in a host cell, the method comprising:

culturing a genetically modified host cell in a suitable culture medium, wherein the host cell is genetically modified with a nucleic acid encoding an amorphadiene oxidase having the amino acid sequence set forth in SEQ ID NO: 2,

wherein production of the amorphadiene oxidase in the presence of an amorpha-4,11-diene substrate results in enzymatic modification of the amorpha-4,11-diene substrate and production of an isoprenoid compound that is one or more of artemisinic alcohol, artemisinic aldehyde and artemisinic acid.

7. The method of claim 6, wherein the host cell is a eukaryotic host cell.

8. The method of claim 7, wherein the host cell is a yeast cell.

9. The method of claim 8, wherein the yeast cell is Saccharomyces cerevisiae.

10. The method of claim 7, wherein the host cell is a plant cell.

11. The method of claim 6, wherein said host cell is further genetically modified with a nucleic acid encoding an amorphadiene synthase, wherein said culturing produces said amorphadiene synthase, wherein said amorphadiene synthase modifies farnesyl pyrophosphate to produce amorpha-4, 11-diene, which is a substrate for said amorphadiene oxidase.

12. The method of claim 6, wherein said host cell is a host cell that does not normally synthesize isopentenyl pyrophosphate (IPP) via a mevalonate pathway, wherein said host cell is genetically modified with one or more nucleic acids encoding two or more enzymes of a mevalonate pathway, an IPP isomerase, a farnesyl pyrophosphate synthase, and an amorphadiene synthase, and wherein said culturing produces said mevalonate pathway enzymes, wherein production of said two or more mevalonate pathway enzymes, said IPP isomerase, said farnesyl pyrophosphate synthase, said amorphadiene synthase, and said amorphadiene oxidase results in production of an isoprenoid compound.

13. The method of claim 12, wherein the two or more mevalonate pathway enzymes comprise mevalonate kinase, phosphomevalonate kinase, and mevalonate pyrophosphate decarboxylase, and wherein the host cell is cultured in the presence of mevalonate.

14. The method of claim 12, wherein the two or more mevalonate pathway enzymes comprise acetoacetyl-CoA thiolase, hydroxymethylglutaryl-CoA synthase, hydroxymethylglutaryl-CoA reductase, mevalonate kinase, phosphomevalonate kinase, and mevalonate pyrophosphate decarboxylase.

15. The method of claim 6, wherein the host cell is genetically modified with a nucleic acid encoding a cytochrome P450 reductase CPR whose amino acid sequence is set forth in SEQ ID No. 4.

16. The method of claim 6, wherein the CPR-encoding nucleic acid consists of the nucleotide sequence set forth in SEQ ID NO. 3.

17. The method of claim 6, wherein the nucleotide sequence encoding said amorphadiene oxidase is operably linked to an inducible promoter.

18. The method of claim 6, wherein the host cell is a prokaryotic host cell.

19. The method of claim 18, wherein the prokaryotic host cell is e.

20. The method of claim 18, wherein the prokaryotic host cell is a prokaryotic host cell that synthesizes IPP generally via the 1-deoxy-D-xylulose 5-phosphate (DXP) pathway.

21. The method of claim 20, wherein the DXP pathway is inactivated.

22. The method of claim 6, further comprising isolating the isoprenoid compound.

23. The method of claim 6, wherein the isoprenoid compound is artemisinic acid.

24. The method of claim 23, further comprising modifying artemisinic acid to produce artemisinin.

25. An isolated polynucleotide encoding a cytochrome P450 reductase, the amino acid sequence of the cytochrome P450 reductase being as set forth in SEQ ID NO 4.

26. The isolated polynucleotide of claim 25, consisting of the nucleotide sequence set forth in SEQ ID NO: 3.

27. A recombinant vector comprising the polynucleotide of claim 25.

28. A host cell comprising the polynucleotide of claim 25.

29. A host cell comprising the recombinant vector of claim 27.

30. A method of making a transgenic plant, comprising genetically modifying a plant with a nucleic acid comprising the polynucleotide of claim 1 encoding an isoprenoid-modifying enzyme, wherein the nucleic acid is expressed in a plant cell to produce the isoprenoid-modifying enzyme in the cell.

31. The method of claim 30, wherein the plant is a monocot.

32. The method of claim 30, wherein the plant is a dicot.

33. The method of claim 30, wherein the plant is tobacco.

34. The method of claim 30, wherein the nucleotide sequence encoding the isoprenoid-modifying enzyme is operably linked to a constitutive promoter.

35. The method of claim 30, wherein the nucleotide sequence encoding the isoprenoid-modifying enzyme is operably linked to an inducible promoter.

36. The method of claim 30, wherein the nucleotide sequence encoding the isoprenoid-modifying enzyme is operably linked to a tissue-specific promoter.

37. The method of claim 36, wherein the tissue-specific promoter is a trichome-specific promoter.

38. The method of claim 30, wherein the plant is artemisia annua.