HK1177464B - Polyunsaturated fatty acid synthase nucleic acid molecules and polypeptides, compositions, and methods of making and uses thereof

Info

Publication number: HK1177464B
Application number: HK13104546.6A
Authority: HK
Inventors: K. E. Apt; L. Richter; D. Simpson; R. Zirkle
Original assignee: DSM IP Assets B.V.
Priority date: 2009-03-19
Filing date: 2010-03-19
Publication date: 2021-04-09

Description

Polyunsaturated fatty acid synthase nucleic acid molecules and polypeptides, compositions, preparation methods, and uses thereof

Background of the Invention

Field of the Invention

The present invention relates to isolated nucleic acid molecules and polypeptides of polyunsaturated fatty acid (PUFA) synthases involved in the production of PUFAs, including PUFAs enriched in docosahexaenoic acid (DHA), eicosapentaenoic acid (EPA), or a combination thereof. The present invention also relates to vectors and host cells comprising the nucleic acid molecules, polypeptides encoded by the nucleic acid molecules, compositions comprising the nucleic acid molecules or polypeptides, and methods of making and uses thereof.

Background of the Invention

Thraustochytrids are microorganisms of the order Thraustochytriales, which includes the genera Thraustochytrium and Schizochytrium, and are recognized as an alternative source of PUFAs. See, e.g., U.S. Patent No. 5,130,242. Polyketide synthase (PKS)-like systems that can synthesize polyunsaturated fatty acids (PUFAs) from acetyl-CoA and malonyl-CoA have recently been found in marine bacteria and thraustochytrids. These PKS-like systems are also referred to herein as PUFA synthase systems. U.S. Patent No. 6,140,486 describes PUFA systems in the marine bacteria Shewanella and Vibrio marinus. U.S. Patent No. 6,566,583 describes a PUFA synthase system in the thraustochytrid genus Schizochytrium. U.S. Patent No. 7,247,461 describes PUFA synthase systems in thraustochytrids of the genera Schizochytrium and Thraustochytrium. U.S. Patent No. 7,211,418 describes a PUFA synthase system in a thraustochytrid of the genus Thraustochytrium and the production of eicosapentaenoic acid (C20:5, ω-3) (EPA) and other PUFAs using that system. U.S. Patent No. 7,217,856 describes PUFA synthases in Shewanella olleyana and Shewanella japonica. WO 2005/097982 describes a PUFA synthase in strain SAM2179. U.S. Patent Nos. 7,208,590 and 7,368,552 describe PUFA synthase genes and proteins from Thraustochytrium aureum.

The PKS systems classically described in the literature fall into one of three basic types, commonly referred to as Type I (modular or iterative), Type II, and Type III. Type I modular PKS systems are also referred to as "modular" PKS systems, and Type I iterative PKS systems are also referred to as "Type I" PKS systems. Type II systems are characterized by separable proteins, each of which carries out a distinct enzymatic reaction. The enzymes work in concert to produce the end product, and each enzyme of the system participates several times in its production. This type of system operates in a manner analogous to the fatty acid synthase (FAS) systems found in plants and bacteria. Type I iterative PKS systems resemble Type II systems in that the enzymes are used iteratively to produce the end product. They differ from Type II systems in that the enzymatic activities are not associated with separable proteins but instead occur as domains of larger proteins. This system is analogous to the Type I FAS systems found in animals and fungi.

In contrast to Type II systems, each enzyme domain in a Type I modular PKS system is used only once in the production of the end product. The domains are found in very large proteins, and the product of each reaction is passed on to another domain of the PKS protein.

The more recently discovered Type III systems belong to the chalcone synthase family of plant condensing enzymes. Type III PKSs are distinct from Type I and Type II PKS systems in that they typically use free CoA substrates in iterative condensation reactions to produce heterocyclic end products.

In the conventional or standard pathway for PUFA synthesis, medium-chain saturated fatty acids (products of a fatty acid synthase (FAS) system) are modified by a series of elongation and desaturation reactions. The substrates for an elongation reaction are a fatty acyl-CoA (the fatty acid chain to be elongated) and malonyl-CoA (the source of the two carbons added in each elongation reaction). The product of the elongase reaction is a fatty acyl-CoA with two additional carbons in the linear chain. Desaturases create cis double bonds in the existing fatty acid chain by abstracting two hydrogens in an oxygen-dependent reaction. The substrates for desaturases are either acyl-CoA (in some animals) or a fatty acid esterified to the glycerol backbone of a phospholipid (e.g., phosphatidylcholine).
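The bookkeeping of the conventional pathway can be sketched in a few lines of Python. This is a minimal illustration rather than a biochemical model: the `FattyAcid` class, the function names, and the `"E"`/`"D"` step codes are hypothetical, and double-bond positions and enzyme specificity are ignored. It assumes only what the paragraph above states, namely that each elongation adds two carbons and each desaturation adds one double bond.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FattyAcid:
    """A fatty acid reduced to its carbon count and number of double bonds."""
    carbons: int
    double_bonds: int

    def shorthand(self) -> str:
        # Render as "carbons:double_bonds", e.g. "20:5".
        return f"{self.carbons}:{self.double_bonds}"


def elongate(fa: FattyAcid) -> FattyAcid:
    """Elongase step: malonyl-CoA contributes two carbons to the chain."""
    return FattyAcid(fa.carbons + 2, fa.double_bonds)


def desaturate(fa: FattyAcid) -> FattyAcid:
    """Desaturase step: abstracting two hydrogens creates one double bond."""
    return FattyAcid(fa.carbons, fa.double_bonds + 1)


def run_pathway(start: FattyAcid, steps: str) -> FattyAcid:
    """Apply a sequence of steps, where 'E' is elongation and 'D' is desaturation."""
    fa = start
    for step in steps:
        if step == "E":
            fa = elongate(fa)
        elif step == "D":
            fa = desaturate(fa)
        else:
            raise ValueError(f"unknown step: {step!r}")
    return fa
```

For instance, `run_pathway(FattyAcid(18, 3), "DED").shorthand()` takes an 18:3 chain through a desaturation, an elongation, and a second desaturation and returns `"20:5"`; the step order here is chosen purely for illustration.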

Fatty acids are classified by the length and saturation characteristics of the carbon chain. They are termed short-chain, medium-chain, or long-chain fatty acids according to the number of carbons in the chain, saturated fatty acids when no double bonds are present between the carbon atoms, and unsaturated fatty acids when double bonds are present. Unsaturated long-chain fatty acids are monounsaturated when only one double bond is present and polyunsaturated when more than one double bond is present.

PUFAs are classified according to the position of the first double bond counted from the methyl end of the fatty acid: ω-3 (n-3) fatty acids have their first double bond at the third carbon, while ω-6 (n-6) fatty acids have their first double bond at the sixth carbon. For example, docosahexaenoic acid ("DHA") is an ω-3 PUFA with a chain length of 22 carbons and 6 double bonds, often designated "22:6 n-3." Other ω-3 PUFAs include eicosapentaenoic acid ("EPA"), designated "20:5 n-3," and ω-3 docosapentaenoic acid ("DPA n-3"), designated "22:5 n-3." DHA and EPA are termed "essential" fatty acids. ω-6 PUFAs include arachidonic acid ("ARA"), designated "20:4 n-6," and ω-6 docosapentaenoic acid ("DPA n-6"), designated "22:5 n-6."
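The shorthand designations used above ("22:6 n-3", "20:4 n-6", and so on) are regular enough to parse mechanically. The following is a minimal sketch; `Shorthand`, `parse_shorthand`, `saturation_class`, and `omega_family` are hypothetical names introduced here, and the accepted syntax (an optional leading "C" and an optional "n-x" suffix) is an assumption based only on the examples in this document.

```python
import re
from typing import NamedTuple, Optional


class Shorthand(NamedTuple):
    carbons: int
    double_bonds: int
    omega: Optional[int]  # first double bond, counted from the methyl end


# Accepts forms such as "22:6 n-3", "C18:3 n-3", or "16:0" (assumed syntax).
_PATTERN = re.compile(r"^\s*C?(\d+):(\d+)(?:\s*n-(\d+))?\s*$")


def parse_shorthand(text: str) -> Shorthand:
    """Split a designation like '22:6 n-3' into its numeric parts."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a fatty acid shorthand: {text!r}")
    carbons, bonds, omega = match.groups()
    return Shorthand(int(carbons), int(bonds), int(omega) if omega else None)


def saturation_class(fa: Shorthand) -> str:
    """Classify by number of double bonds: none, exactly one, or more than one."""
    if fa.double_bonds == 0:
        return "saturated"
    if fa.double_bonds == 1:
        return "monounsaturated"
    return "polyunsaturated"


def omega_family(fa: Shorthand) -> Optional[str]:
    """Return 'omega-3', 'omega-6', etc., or None when no n-x suffix was given."""
    return None if fa.omega is None else f"omega-{fa.omega}"
```

With these helpers, `parse_shorthand("22:6 n-3")` gives `Shorthand(carbons=22, double_bonds=6, omega=3)`, which `saturation_class` reports as `"polyunsaturated"` and `omega_family` reports as `"omega-3"`.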

Because of their presence in cell membranes, ω-3 fatty acids are biologically important molecules that affect cellular physiology; they regulate the production of biologically active compounds and gene expression, and they serve as biosynthetic substrates. Roche, H.M., Proc. Nutr. Soc. 58:397-401 (1999). DHA, for example, accounts for approximately 15%-20% of the lipids in the human cerebral cortex and 30%-60% of the lipids in the retina, is concentrated in the testes and sperm, and is an important component of breast milk. Bergé, J.P., and Barnathan, G., Adv. Biochem. Eng. Biotechnol. 96:49-125 (2005). DHA accounts for up to 97% of the ω-3 fatty acids in the brain and up to 93% of the ω-3 fatty acids in the retina. Moreover, DHA is essential for fetal and infant development as well as for the maintenance of cognitive function in adults. Id. Because ω-3 fatty acids are not synthesized de novo in the human body, these fatty acids must be obtained from dietary sources.

Flaxseed oil and fish oils are considered good dietary sources of ω-3 fatty acids. Flaxseed oil contains no EPA, DHA, DPA, or ARA, but does contain linolenic acid (C18:3 n-3), a building block that enables the body to make EPA. There is evidence, however, that the rate of this metabolic conversion is slow and variable, particularly in patients with impaired health. Fish oils vary considerably in the type and level of their fatty acid composition, depending on the particular species and its diet. For example, fish raised by aquaculture tend to have lower levels of ω-3 fatty acids than wild fish. Furthermore, fish oils carry a risk of containing environmental contaminants and can have stability problems and a fishy odor or taste.

Oils produced by thraustochytrids often have simpler polyunsaturated fatty acid profiles than corresponding fish or bacterial oils. Lewis, T.E., Mar. Biotechnol. 1:580-587 (1999). Strains of thraustochytrids have been reported to produce ω-3 fatty acids as a high percentage of the total fatty acids produced by the organism. U.S. Patent No. 5,130,242; Huang, J. et al., J. Am. Oil. Chem. Soc. 78:605-610 (2001); Huang, J. et al., Mar. Biotechnol. 5:450-457 (2003). However, isolated thraustochytrids vary in the identity and amounts of PUFAs produced, so that some previously described strains can have undesirable PUFA profiles.

PUFAs have been produced in oilseed crops by modifying the endogenously produced fatty acids. Genetic modification of plants with various individual genes for fatty acid elongases and desaturases has produced leaves or seeds that contain detectable levels of PUFAs such as EPA, but that also contain significant levels of mixed short-chain and less unsaturated PUFAs (Qi et al., Nature Biotech. 22:739 (2004); PCT Publication No. WO 04/071467; Abbadi et al., Plant Cell 16:1 (2004); Napier and Sayanova, Proc. Nutrition Society 64:387-393 (2005); Robert et al., Functional Plant Biology 32:473-479 (2005); and U.S. Application Publication No. 2004/0172682).

Accordingly, there remains a need for the isolation of nucleic acid molecules and polypeptides associated with desirable PUFA profiles, and for methods of producing desirable PUFA profiles through the use of such nucleic acid molecules and polypeptides.

Summary of the Invention

The present invention relates to an isolated nucleic acid molecule selected from the group consisting of: (a) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 1, wherein the polynucleotide sequence encodes a polypeptide comprising a polyunsaturated fatty acid (PUFA) synthase activity selected from the group consisting of β-ketoacyl-ACP synthase (KS) activity, malonyl-CoA:ACP acyltransferase (MAT) activity, acyl carrier protein (ACP) activity, ketoreductase (KR) activity, β-hydroxyacyl-ACP dehydratase (DH) activity, and combinations thereof;

(b) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 7, wherein the polynucleotide sequence encodes a polypeptide comprising KS activity;

(c) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 9, wherein the polynucleotide sequence encodes a polypeptide comprising MAT activity;

(d) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to any one of SEQ ID NOs: 13, 15, 17, 19, 21, or 23, wherein the polynucleotide sequence encodes a polypeptide comprising ACP activity;

(e) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 11, wherein the polynucleotide sequence encodes a polypeptide comprising ACP activity;

(f) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 25, wherein the polynucleotide sequence encodes a polypeptide comprising KR activity; and

(g) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 27, wherein the polynucleotide sequence encodes a polypeptide comprising DH activity. In some embodiments, the polynucleotide sequences are at least 90% or at least 95% identical to SEQ ID NOs: 1, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, and 27, respectively. In some embodiments, the nucleic acid molecules comprise the polynucleotide sequences of SEQ ID NOs: 1, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, and 27, respectively.
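The "at least 80%", "at least 90%", and "at least 95% identical" thresholds recited above can be illustrated with a small calculation. This excerpt does not state how identity is to be computed, so the following is only a sketch under a stated assumption: the two sequences have already been aligned to equal length (with `-` marking gaps), and identity is the number of matching non-gap positions divided by the alignment length. The function names `percent_identity` and `identity_tiers` are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumption: identity = matching non-gap columns / alignment length.
    Other conventions (e.g. dividing by the shorter sequence) give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def identity_tiers(identity: float, thresholds=(80.0, 90.0, 95.0)) -> list:
    """Return the thresholds (in percent) that the given identity meets."""
    return [t for t in thresholds if identity >= t]
```

As a usage example, `percent_identity("ATGCATGCAT", "ATGCATGCAA")` is `90.0` (nine matches over ten aligned positions), and `identity_tiers(90.0)` returns `[80.0, 90.0]`. In practice the alignment itself would come from a dedicated alignment tool, which this sketch deliberately leaves out.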

The present invention relates to an isolated nucleic acid molecule selected from the group consisting of: (a) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to SEQ ID NO: 2, and the polypeptide comprises a PUFA synthase activity selected from the group consisting of KS activity, MAT activity, ACP activity, KR activity, DH activity, and combinations thereof;

(b) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to SEQ ID NO: 8, and the polypeptide comprises KS activity;

(c) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to SEQ ID NO: 10, and the polypeptide comprises MAT activity;

(d) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to any one of SEQ ID NOs: 14, 16, 18, 20, or 24, and the polypeptide comprises ACP activity;

(e) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to SEQ ID NO: 12, and the polypeptide comprises ACP activity;

(f) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to SEQ ID NO: 26, and the polypeptide comprises KR activity;

and (g) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to SEQ ID NO: 28, and the polypeptide comprises DH activity. In some embodiments, the amino acid sequences are at least 90% or at least 95% identical to SEQ ID NOs: 2, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, and 28, respectively. In some embodiments, the polypeptides comprise the amino acid sequences of SEQ ID NOs: 2, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, and 28, respectively.
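For readers cross-referencing the sequence listing, items (b) through (g) of the preceding paragraph can be transcribed into a small lookup table. The sketch below is only a reading aid: the names `MULTI_ACTIVITY_SEQ_ID`, `DOMAIN_SEQ_IDS`, and `activity_for` are hypothetical, and the table copies the SEQ ID NOs exactly as recited in those items (so SEQ ID NO: 22, which appears only in the "in some embodiments" sentences, is not listed).

```python
from typing import Optional

# SEQ ID NO: 2 is recited in item (a) with any of the KS, MAT, ACP, KR,
# and DH activities or combinations thereof, so it is kept separate.
MULTI_ACTIVITY_SEQ_ID = 2

# Amino acid SEQ ID NOs transcribed from items (b) through (g) as written.
DOMAIN_SEQ_IDS = {
    "KS": [8],                        # item (b)
    "MAT": [10],                      # item (c)
    "ACP": [14, 16, 18, 20, 24, 12],  # items (d) and (e)
    "KR": [26],                       # item (f)
    "DH": [28],                       # item (g)
}


def activity_for(seq_id: int) -> Optional[str]:
    """Return the single activity recited for an amino acid SEQ ID NO, if any."""
    for activity, ids in DOMAIN_SEQ_IDS.items():
        if seq_id in ids:
            return activity
    return None
```

For example, `activity_for(8)` returns `"KS"` and `activity_for(12)` returns `"ACP"`, while `activity_for(22)` returns `None` because items (b) through (g) do not recite that sequence.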

The present invention relates to an isolated nucleic acid molecule selected from the group consisting of: (a) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 3, wherein the polynucleotide sequence encodes a polypeptide comprising a PUFA synthase activity selected from the group consisting of KS activity, chain length factor (CLF) activity, acyltransferase (AT) activity, enoyl-ACP reductase (ER) activity, and combinations thereof;

(b) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 29, wherein the polynucleotide sequence encodes a polypeptide comprising KS activity;

(c) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 31, wherein the polynucleotide sequence encodes a polypeptide comprising CLF activity;

(d) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 33, wherein the polynucleotide sequence encodes a polypeptide comprising AT activity; and

(e) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 35, wherein the polynucleotide sequence encodes a polypeptide comprising ER activity. In some embodiments, the polynucleotide sequences are at least 90% or at least 95% identical to SEQ ID NOs: 3, 29, 31, 33, and 35, respectively. In some embodiments, the nucleic acid molecules comprise the polynucleotide sequences of SEQ ID NOs: 3, 29, 31, 33, and 35, respectively.

The present invention relates to an isolated nucleic acid molecule selected from the group consisting of: (a) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to SEQ ID NO: 4, and the polypeptide comprises a PUFA synthase activity selected from the group consisting of KS activity, CLF activity, AT activity, ER activity, and combinations thereof;

(b) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to SEQ ID NO: 30, and the polypeptide comprises KS activity;

(c) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to SEQ ID NO: 32, and the polypeptide comprises CLF activity;

(d) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to SEQ ID NO: 34, and the polypeptide comprises AT activity;

and (e) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to SEQ ID NO: 36, and the polypeptide comprises ER activity. In some embodiments, the amino acid sequences are at least 90% or at least 95% identical to SEQ ID NOs: 4, 30, 32, 34, and 36, respectively. In some embodiments, the polypeptides comprise the amino acid sequences of SEQ ID NOs: 4, 30, 32, 34, and 36, respectively.

The present invention relates to an isolated nucleic acid molecule selected from the group consisting of: (a) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 5, wherein the polynucleotide sequence encodes a polypeptide comprising a PUFA synthase activity selected from the group consisting of DH activity, ER activity, and combinations thereof;

(b) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 37, wherein the polynucleotide sequence encodes a polypeptide comprising DH activity;

(c) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 39, wherein the polynucleotide sequence encodes a polypeptide comprising DH activity;

and (d) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 41, wherein the polynucleotide sequence encodes a polypeptide comprising ER activity. In some embodiments, the polynucleotide sequences are at least 90% or at least 95% identical to SEQ ID NOs: 5, 37, 39, and 41, respectively. In some embodiments, the nucleic acid molecules comprise the polynucleotide sequences of SEQ ID NOs: 5, 37, 39, and 41, respectively.

The present invention relates to an isolated nucleic acid molecule selected from the group consisting of: (a) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to SEQ ID NO: 6, and the polypeptide comprises a PUFA synthase activity selected from the group consisting of DH activity, ER activity, and combinations thereof;

(b) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to SEQ ID NO: 38, and the polypeptide comprises DH activity;

(c) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to SEQ ID NO: 40, and the polypeptide comprises DH activity;

and (d) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to SEQ ID NO: 42, and the polypeptide comprises ER activity. In some embodiments, the amino acid sequences are at least 90% or at least 95% identical to SEQ ID NOs: 6, 38, 40, and 42, respectively. In some embodiments, the polypeptides comprise the amino acid sequences of SEQ ID NOs: 6, 38, 40, and 42, respectively.

The present invention relates to an isolated nucleic acid molecule selected from the group consisting of: (a) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 68 or SEQ ID NO: 120, wherein the polynucleotide sequence encodes a polypeptide comprising a PUFA synthase activity selected from the group consisting of KS activity, MAT activity, ACP activity, KR activity, DH activity, and combinations thereof;

(b) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 74, wherein the polynucleotide sequence encodes a polypeptide comprising KS activity;

(c) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 76, wherein the polynucleotide sequence encodes a polypeptide comprising MAT activity;

(d) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to any one of SEQ ID NOs: 80, 82, 84, 86, 88, 90, 92, 94, 96, or 98, wherein the polynucleotide sequence encodes a polypeptide comprising ACP activity;

(e) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 78, wherein the polynucleotide sequence encodes a polypeptide comprising ACP activity;

(f) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 100, wherein the polynucleotide sequence encodes a polypeptide comprising KR activity; and

(g) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 118, wherein the polynucleotide sequence encodes a polypeptide comprising DH activity. In some embodiments, the polynucleotide sequences are at least 90% or at least 95% identical to SEQ ID NOs: 68, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 118, and 120, respectively. In some embodiments, the nucleic acid molecules comprise the polynucleotide sequences of SEQ ID NOs: 68, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 118, and 120, respectively.

The present invention relates to an isolated nucleic acid molecule selected from the group consisting of: (a) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to SEQ ID NO: 69, and the polypeptide comprises a PUFA synthase activity selected from the group consisting of KS activity, MAT activity, ACP activity, KR activity, DH activity, and combinations thereof;

(b) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to SEQ ID NO: 75, and the polypeptide comprises KS activity;

(c) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to SEQ ID NO: 77, and the polypeptide comprises MAT activity;

(d) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to any one of SEQ ID NOs: 81, 83, 87, 89, 91, 93, 95, 97, or 99, and the polypeptide comprises ACP activity;

(e) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to SEQ ID NO: 79, and the polypeptide comprises ACP activity;

(f) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to SEQ ID NO: 101, and the polypeptide comprises KR activity;

and (g) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to SEQ ID NO: 119, and the polypeptide comprises DH activity. In some embodiments, the amino acid sequences are at least 90% or at least 95% identical to SEQ ID NOs: 69, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, and 119, respectively. In some embodiments, the polypeptides comprise the amino acid sequences of SEQ ID NOs: 69, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, and 119, respectively.

本发明涉及分离的核酸分子，其选自下组：(a)包含与SEQ ID NO：70或SEQ ID NO：121至少80％相同的多核苷酸序列的核酸分子，其中所述多核苷酸序列编码多肽包含选自KS活性、链长因子(CLF)活性、酰基转移酶(AT)活性、烯脂酰还原酶(ER)活性及其组合的PUFA合酶活性；The present invention relates to an isolated nucleic acid molecule selected from the group consisting of: (a) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 70 or SEQ ID NO: 121, wherein the polynucleotide sequence encodes a polypeptide comprising a PUFA synthase activity selected from the group consisting of KS activity, chain length factor (CLF) activity, acyltransferase (AT) activity, enoyl reductase (ER) activity, and combinations thereof;

(b)包含与SEQ ID NO：102至少80％相同的多核苷酸序列的核酸分子，其中所述多核苷酸序列编码含有KS活性的多肽；(b) a nucleic acid molecule comprising a polynucleotide sequence that is at least 80% identical to SEQ ID NO: 102, wherein the polynucleotide sequence encodes a polypeptide comprising KS activity;

(c)包含与SEQ ID NO：104至少80％相同的多核苷酸序列的核酸分子，其中所述多核苷酸序列编码含有CLF活性的多肽；(c) a nucleic acid molecule comprising a polynucleotide sequence that is at least 80% identical to SEQ ID NO: 104, wherein the polynucleotide sequence encodes a polypeptide comprising CLF activity;

(d)包含与SEQ ID NO：106至少80％相同的多核苷酸序列的核酸分子，其中所述多核苷酸序列编码含有AT活性的多肽；(d) a nucleic acid molecule comprising a polynucleotide sequence that is at least 80% identical to SEQ ID NO: 106, wherein the polynucleotide sequence encodes a polypeptide comprising AT activity;

以及(e)包含与SEQ ID NO：108至少80％相同的多核苷酸序列的核酸分子，其中所述多核苷酸序列编码含有ER活性的多肽。在一些实施方式中，所述多核苷酸序列分别与SEQ ID NO：70、102、104、106、108和121至少90％或者95％相同。在一些实施方式中，所述核酸分子分别包含SEQ ID NO：70、102、104、106、108和121的多核苷酸序列。and (e) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 108, wherein the polynucleotide sequence encodes a polypeptide comprising ER activity. In some embodiments, the polynucleotide sequence is at least 90% or 95% identical to SEQ ID NOs: 70, 102, 104, 106, 108, and 121, respectively. In some embodiments, the nucleic acid molecule comprises the polynucleotide sequence of SEQ ID NOs: 70, 102, 104, 106, 108, and 121, respectively.
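The "at least 80%", "90%", and "95% identical" thresholds recited above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length with "-" as the gap character, and it counts identical columns over the alignment length, which is only one of several conventions for computing percent identity. The function names and the toy sequences are hypothetical; they are not taken from the sequence listing, and the sketch is not the identity calculation defined by the patent.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: both strings come from the same pairwise alignment,
    so they have equal length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns where both sequences have a gap (they carry no information).
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only if both residues are identical non-gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy sequences for illustration only.
    query = "ACGTACGTAC"
    reference = "ACGTACGTTC"
    print(percent_identity(query, reference))
    print(meets_threshold(query, reference, 95.0))
```

In practice the alignment itself would come from a dedicated alignment program; the sketch only shows how a threshold test is applied once an alignment is available.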

本发明涉及分离的核酸分子，其选自下组：(a)含有编码多肽的多核苷酸序列的核酸分子，其中所述多肽包含与SEQ ID NO：71至少80％相同的氨基酸序列，并且所述多肽含有选自KS活性、CLF活性、AT活性、ER活性及其组合的PUFA合酶活性；The present invention relates to an isolated nucleic acid molecule selected from the group consisting of: (a) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 71, and the polypeptide comprises a PUFA synthase activity selected from the group consisting of KS activity, CLF activity, AT activity, ER activity, and combinations thereof;

(b)含有编码多肽的多核苷酸序列的核酸分子，其中所述多肽包含与SEQ ID NO：103至少80％相同的氨基酸序列，并且所述多肽含有KS活性；(b) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 103, and the polypeptide comprises KS activity;

(c)含有编码多肽的多核苷酸序列的核酸分子，其中所述多肽包含与SEQ ID NO：105至少80％相同的氨基酸序列，并且所述多肽含有CLF活性；(c) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 105, and the polypeptide comprises CLF activity;

(d)含有编码多肽的多核苷酸序列的核酸分子，其中所述多肽包含与SEQ ID NO：107至少80％相同的氨基酸序列，并且所述多肽含有AT活性；(d) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 107, and the polypeptide comprises AT activity;

以及(e)含有编码多肽的多核苷酸序列的核酸分子，其中所述多肽包含与SEQ ID NO：109至少80％相同的氨基酸序列，并且所述多肽含有ER活性。在一些实施方式中，所述氨基酸序列分别与SEQ ID NO：71、103、105、107和109至少90％或者95％相同。在一些实施方式中，所述多肽分别包含SEQ ID NO：71、103、105、107和109的氨基酸序列。and (e) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to SEQ ID NO: 109, and the polypeptide comprises ER activity. In some embodiments, the amino acid sequence is at least 90% or 95% identical to SEQ ID NOs: 71, 103, 105, 107, and 109, respectively. In some embodiments, the polypeptide comprises the amino acid sequence of SEQ ID NOs: 71, 103, 105, 107, and 109, respectively.

本发明涉及分离的核酸分子，其选自下组：(a)含有与SEQ ID NO：72或122至少80％相同的多核苷酸序列的核酸分子，其中所述多核苷酸序列编码包含选自DH活性、ER活性及其组合的PUFA合酶活性的多肽；The present invention relates to an isolated nucleic acid molecule selected from the group consisting of: (a) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 72 or 122, wherein the polynucleotide sequence encodes a polypeptide comprising a PUFA synthase activity selected from the group consisting of DH activity, ER activity, and combinations thereof;

(b)包含与SEQ ID NO：110至少80％相同的多核苷酸序列的核酸分子，其中所述多核苷酸序列编码含有DH活性的多肽；(b) a nucleic acid molecule comprising a polynucleotide sequence that is at least 80% identical to SEQ ID NO: 110, wherein the polynucleotide sequence encodes a polypeptide comprising DH activity;

(c)包含与SEQ ID NO：112至少80％相同的多核苷酸序列的核酸分子，其中所述多核苷酸序列编码含有DH活性的多肽；(c) a nucleic acid molecule comprising a polynucleotide sequence that is at least 80% identical to SEQ ID NO: 112, wherein the polynucleotide sequence encodes a polypeptide comprising DH activity;

以及(d)包含与SEQ ID NO：114至少80％相同的多核苷酸序列的核酸分子，其中所述多核苷酸序列编码含有ER活性的多肽。在一些实施方式中，所述多核苷酸序列分别与SEQ ID NO：72、110、112、114和122至少90％或者95％相同。在一些实施方式中，所述核酸分子分别包含SEQ ID NO：72、110、112、114和122的多核苷酸序列。and (d) a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 114, wherein the polynucleotide sequence encodes a polypeptide comprising ER activity. In some embodiments, the polynucleotide sequence is at least 90% or 95% identical to SEQ ID NOs: 72, 110, 112, 114, and 122, respectively. In some embodiments, the nucleic acid molecule comprises the polynucleotide sequence of SEQ ID NOs: 72, 110, 112, 114, and 122, respectively.

本发明涉及分离的核酸分子，其选自下组：(a)含有编码多肽的多核苷酸序列的核酸分子，其中所述多肽包含与SEQ ID NO：73至少80％相同的氨基酸序列，并且所述多肽含有选自DH活性、ER活性及其组合的PUFA合酶活性；The present invention relates to an isolated nucleic acid molecule selected from the group consisting of: (a) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 73, and the polypeptide comprises a PUFA synthase activity selected from the group consisting of DH activity, ER activity, and combinations thereof;

(b)含有编码多肽的多核苷酸序列的核酸分子，其中所述多肽包含与SEQ ID NO：111至少80％相同的氨基酸序列，并且所述多肽含有DH活性；(b) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 111, and the polypeptide comprises DH activity;

(c)含有编码多肽的多核苷酸序列的核酸分子，其中所述多肽包含与SEQ ID NO：113至少80％相同的氨基酸序列，并且所述多肽含有DH活性；(c) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 113, and the polypeptide comprises DH activity;

以及(d)含有编码多肽的多核苷酸序列的核酸分子，其中所述多肽包含与SEQ ID NO：115至少80％相同的氨基酸序列，并且所述多肽含有ER活性。在一些实施方式中，所述氨基酸序列分别与SEQ ID NO：73、111、113和115至少90％或者95％相同。在一些实施方式中，所述多肽分别包含SEQ ID NO：73、111、113和115的氨基酸序列。and (d) a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence at least 80% identical to SEQ ID NO: 115, and the polypeptide comprises ER activity. In some embodiments, the amino acid sequence is at least 90% or 95% identical to SEQ ID NOs: 73, 111, 113, and 115, respectively. In some embodiments, the polypeptide comprises the amino acid sequence of SEQ ID NOs: 73, 111, 113, and 115, respectively.

本发明涉及包含编码多肽的多核苷酸序列的分离的核酸分子，所述多肽包含选自KS活性、MAT活性、ACP活性、KR活性、CLF活性、AT活性、ER活性、DH活性及其组合的PUFA合酶活性，其中所述多核苷酸在严谨条件下与上述任何多核苷酸序列的互补序列杂交。The present invention relates to an isolated nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide comprising a PUFA synthase activity selected from the group consisting of KS activity, MAT activity, ACP activity, KR activity, CLF activity, AT activity, ER activity, DH activity, and combinations thereof, wherein the polynucleotide hybridizes under stringent conditions to the complement of any of the above polynucleotide sequences.

本发明涉及含有多核苷酸序列的分离的核酸分子，所述多核苷酸序列与上述任何多核苷酸序列完全互补。The present invention relates to an isolated nucleic acid molecule comprising a polynucleotide sequence that is fully complementary to any of the polynucleotide sequences described above.
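The notion of a polynucleotide that is "fully complementary" to another can be sketched in a few lines of Python. The sketch assumes plain DNA strings over the alphabet A, C, G, T written 5' to 3', and the helper names are hypothetical. It models only exact Watson-Crick pairing; hybridization under stringent conditions is an experimental property that tolerates some mismatch and is not captured by this check.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3' orientation)."""
    seq = seq.upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    # Complement every base, then reverse to restore 5'->3' reading direction.
    return seq.translate(_COMPLEMENT)[::-1]


def is_fully_complementary(seq_a: str, seq_b: str) -> bool:
    """True if seq_b pairs with seq_a at every position with no mismatches."""
    return seq_a.upper() == reverse_complement(seq_b)


if __name__ == "__main__":
    # Toy sequence for illustration only.
    print(reverse_complement("ATGGCC"))
    print(is_fully_complementary("ATGGCC", "GGCCAT"))
```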

本发明涉及包含上述任何核酸分子及其组合以及转录控制序列的重组核酸分子。在一些实施方式中，所述重组核酸分子是重组载体。The present invention relates to a recombinant nucleic acid molecule comprising any of the above nucleic acid molecules and combinations thereof and a transcription control sequence. In some embodiments, the recombinant nucleic acid molecule is a recombinant vector.

本发明涉及表达上述任何核酸分子、上述任何重组核酸分子及其组合的宿主细胞。在一些实施方式中，所述宿主细胞选自植物细胞、微生物细胞和动物细胞。在一些实施方式中，所述微生物细胞是细菌。在一些实施方式中，所述细菌是大肠杆菌(E.coli)。在一些实施方式中，所述细菌是海洋细菌。在一些实施方式中，所述微生物细胞是破囊壶菌。在一些实施方式中，所述破囊壶菌是裂殖壶菌属(Schizochytrium)。在一些实施方式中，所述破囊壶菌是破囊壶菌属(Thraustochytrium)。在一些实施方式中，所述破囊壶菌是吾肯氏壶菌属(Ulkenia)。The present invention relates to host cells expressing any of the above-mentioned nucleic acid molecules, any of the above-mentioned recombinant nucleic acid molecules, and combinations thereof. In some embodiments, the host cell is selected from plant cells, microbial cells, and animal cells. In some embodiments, the microbial cell is a bacterium. In some embodiments, the bacterium is Escherichia coli (E. coli). In some embodiments, the bacterium is a marine bacterium. In some embodiments, the microbial cell is a thraustochytrid. In some embodiments, the thraustochytrid is a Schizochytrium. In some embodiments, the thraustochytrid is a Thraustochytrium. In some embodiments, the thraustochytrid is an Ulkenia.

本发明涉及产生至少一种PUFA的方法，包括：在能有效产生PUFA的条件下，于宿主细胞中表达PUFA合酶基因，其中所述PUFA合酶基因包含任何上述分离的核酸分子，任何上述重组核酸分子或其组合，以及产生至少一种PUFA。在本实施方式的一个方面，所述宿主细胞选自植物细胞、分离的动物细胞和微生物细胞。在本实施方式的另一方面，至少一种PUFA包含二十二碳六烯酸(DHA)或二十碳五烯酸(EPA)。The present invention relates to a method for producing at least one PUFA, comprising: expressing a PUFA synthase gene in a host cell under conditions effective to produce the PUFA, wherein the PUFA synthase gene comprises any of the above-described isolated nucleic acid molecules, any of the above-described recombinant nucleic acid molecules, or a combination thereof, and producing the at least one PUFA. In one aspect of this embodiment, the host cell is selected from a plant cell, an isolated animal cell, and a microbial cell. In another aspect of this embodiment, the at least one PUFA comprises docosahexaenoic acid (DHA) or eicosapentaenoic acid (EPA).

本发明涉及一种产生富含DHA、EPA或其组合的脂质的方法，包括：在能有效产生脂质的条件下，于宿主细胞中表达PUFA合酶基因，其中所述PUFA合酶基因包含任何上述分离的核酸分子，任何上述重组核酸分子或其组合，并且产生富含DHA、EPA或其组合的脂质。本发明涉及产生重组载体的方法，包括将任何一种上述分离的核酸分子插入载体。The present invention relates to a method for producing lipids enriched in DHA, EPA, or a combination thereof, comprising: expressing a PUFA synthase gene in a host cell under conditions effective to produce the lipids, wherein the PUFA synthase gene comprises any of the above-described isolated nucleic acid molecules, any of the above-described recombinant nucleic acid molecules, or a combination thereof, and producing lipids enriched in DHA, EPA, or a combination thereof. The present invention also relates to a method for producing a recombinant vector, comprising inserting any of the above-described isolated nucleic acid molecules into a vector.

本发明涉及产生重组宿主细胞的方法，包括将上述重组载体引入宿主细胞。在一些实施方式中，所述宿主细胞选自植物细胞、分离的动物细胞和微生物细胞。The present invention relates to a method for producing a recombinant host cell, comprising introducing the above-mentioned recombinant vector into a host cell. In some embodiments, the host cell is selected from plant cells, isolated animal cells and microbial cells.

本发明涉及上述任何多核苷酸序列编码的分离多肽。The present invention relates to isolated polypeptides encoded by any of the above-mentioned polynucleotide sequences.

本发明涉及分离多肽，其选自下组：(a)含有与SEQ ID NO：2至少80％相同的氨基酸序列的多肽，其中所述多肽包含选自KS活性、MAT活性、ACP活性、KR活性、DH活性及其组合的PUFA合酶活性；The present invention relates to an isolated polypeptide selected from the group consisting of: (a) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 2, wherein the polypeptide comprises a PUFA synthase activity selected from the group consisting of KS activity, MAT activity, ACP activity, KR activity, DH activity, and combinations thereof;

(b)含有与SEQ ID NO：8至少80％相同的氨基酸序列的多肽，其中所述多肽含有KS活性；(b) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 8, wherein the polypeptide comprises KS activity;

(c)含有与SEQ ID NO：10至少80％相同的氨基酸序列的多肽，其中所述多肽含有MAT活性；(c) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 10, wherein the polypeptide comprises MAT activity;

(d)含有与SEQ ID NO：14、16、18、20或24中任一项至少80％相同的氨基酸序列的多肽，其中所述多肽含有ACP活性；(d) a polypeptide comprising an amino acid sequence that is at least 80% identical to any one of SEQ ID NOs: 14, 16, 18, 20, or 24, wherein the polypeptide comprises ACP activity;

(e)含有与SEQ ID NO：12至少80％相同的氨基酸序列的多肽，其中所述多肽含有ACP活性；(e) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 12, wherein the polypeptide comprises ACP activity;

(f)含有与SEQ ID NO：26至少80％相同的氨基酸序列的多肽，其中所述多肽含有KR活性；(f) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 26, wherein the polypeptide comprises KR activity;

以及(g)含有与SEQ ID NO：28至少80％相同的氨基酸序列的多肽，其中所述多肽含有DH活性。在一些实施方式中，所述氨基酸序列分别与SEQ ID NO：2、8、10、12、14、16、18、20、22、24、26和28至少90％或者95％相同。在一些实施方式中，所述多肽分别包含SEQ ID NO：2、8、10、12、14、16、18、20、22、24、26和28的氨基酸序列。and (g) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 28, wherein the polypeptide comprises DH activity. In some embodiments, the amino acid sequence is at least 90% or 95% identical to SEQ ID NOs: 2, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, and 28, respectively. In some embodiments, the polypeptide comprises the amino acid sequence of SEQ ID NOs: 2, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, and 28, respectively.

本发明涉及分离多肽，其选自下组：(a)含有与SEQ ID NO：4至少80％相同的氨基酸序列的多肽，其中所述多肽包含选自KS活性、CLF活性、AT活性、ER活性及其组合的PUFA合酶活性；The present invention relates to an isolated polypeptide selected from the group consisting of: (a) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 4, wherein the polypeptide comprises a PUFA synthase activity selected from the group consisting of KS activity, CLF activity, AT activity, ER activity, and combinations thereof;

(b)含有与SEQ ID NO：30至少80％相同的氨基酸序列的多肽，其中所述多肽含有KS活性；(b) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 30, wherein the polypeptide comprises KS activity;

(c)含有与SEQ ID NO：32至少80％相同的氨基酸序列的多肽，其中所述多肽含有CLF活性；(c) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 32, wherein the polypeptide comprises CLF activity;

(d)含有与SEQ ID NO：34至少80％相同的氨基酸序列的多肽，其中所述多肽含有AT活性；(d) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 34, wherein the polypeptide comprises AT activity;

以及(e)含有与SEQ ID NO：36至少80％相同的氨基酸序列的多肽，其中所述多肽含有ER活性。在一些实施方式中，所述氨基酸序列分别与SEQ ID NO：4、30、32、34和36至少90％或者95％相同。在一些实施方式中，所述多肽分别包含SEQ ID NO：4、30、32、34和36的氨基酸序列。and (e) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 36, wherein the polypeptide comprises ER activity. In some embodiments, the amino acid sequence is at least 90% or 95% identical to SEQ ID NOs: 4, 30, 32, 34, and 36, respectively. In some embodiments, the polypeptide comprises the amino acid sequence of SEQ ID NOs: 4, 30, 32, 34, and 36, respectively.

本发明涉及分离多肽，其选自下组：(a)含有与SEQ ID NO：6至少80％相同的氨基酸序列的多肽，其中所述多肽包含选自DH活性、ER活性及其组合的PUFA合酶活性；The present invention relates to an isolated polypeptide selected from the group consisting of: (a) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 6, wherein the polypeptide comprises a PUFA synthase activity selected from the group consisting of DH activity, ER activity, and combinations thereof;

(b)含有与SEQ ID NO：38至少80％相同的氨基酸序列的多肽，其中所述多肽含有DH活性；(b) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 38, wherein the polypeptide comprises DH activity;

(c)含有与SEQ ID NO：40至少80％相同的氨基酸序列的多肽，其中所述多肽含有DH活性；(c) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 40, wherein the polypeptide comprises DH activity;

以及(d)含有与SEQ ID NO：42至少80％相同的氨基酸序列的多肽，其中所述多肽含有ER活性。在一些实施方式中，所述氨基酸序列分别与SEQ ID NO：6、38、40和42至少90％或者95％相同。在一些实施方式中，所述多肽分别包含SEQ ID NO：6、38、40和42的氨基酸序列。and (d) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 42, wherein the polypeptide comprises ER activity. In some embodiments, the amino acid sequence is at least 90% or 95% identical to SEQ ID NOs: 6, 38, 40, and 42, respectively. In some embodiments, the polypeptide comprises the amino acid sequence of SEQ ID NOs: 6, 38, 40, and 42, respectively.

本发明涉及分离多肽，其选自下组：(a)含有与SEQ ID NO：69至少80％相同的氨基酸序列的多肽，其中所述多肽包含选自KS活性、MAT活性、ACP活性、KR活性、DH活性及其组合的PUFA合酶活性；The present invention relates to an isolated polypeptide selected from the group consisting of: (a) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 69, wherein the polypeptide comprises a PUFA synthase activity selected from the group consisting of KS activity, MAT activity, ACP activity, KR activity, DH activity, and combinations thereof;

(b)含有与SEQ ID NO：75至少80％相同的氨基酸序列的多肽，其中所述多肽含有KS活性；(b) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 75, wherein the polypeptide comprises KS activity;

(c)含有与SEQ ID NO：77至少80％相同的氨基酸序列的多肽，其中所述多肽含有MAT活性；(c) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 77, wherein the polypeptide comprises MAT activity;

(d)含有与SEQ ID NO：81、83、85、87、89、91、93、95、97或99中任一项至少80％相同的氨基酸序列的多肽，其中所述多肽含有ACP活性；(d) a polypeptide comprising an amino acid sequence that is at least 80% identical to any one of SEQ ID NOs: 81, 83, 85, 87, 89, 91, 93, 95, 97, or 99, wherein the polypeptide comprises ACP activity;

(e)含有与SEQ ID NO：79至少80％相同的氨基酸序列的多肽，其中所述多肽含有ACP活性；(e) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 79, wherein the polypeptide comprises ACP activity;

(f)含有与SEQ ID NO：101至少80％相同的氨基酸序列的多肽，其中所述多肽含有KR活性；(f) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 101, wherein the polypeptide comprises KR activity;

以及(g)含有与SEQ ID NO：119至少80％相同的氨基酸序列的多肽，其中所述多肽含有DH活性。在一些实施方式中，所述氨基酸序列分别与SEQ ID NO：69、75、77、79、81、83、85、87、89、91、93、95、97、99、101和119至少90％或者95％相同。在一些实施方式中，所述多肽包含SEQ ID NO：69、75、77、79、81、83、85、87、89、91、93、95、97、99、101和119的氨基酸序列。and (g) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 119, wherein the polypeptide comprises DH activity. In some embodiments, the amino acid sequence is at least 90% or 95% identical to SEQ ID NOs: 69, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, and 119, respectively. In some embodiments, the polypeptide comprises the amino acid sequence of SEQ ID NOs: 69, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, and 119.

本发明涉及分离多肽，其选自下组：(a)含有与SEQ ID NO：71至少80％相同的氨基酸序列的多肽，其中所述多肽包含选自KS活性、CLF活性、AT活性、ER活性及其组合的PUFA合酶活性；The present invention relates to an isolated polypeptide selected from the group consisting of: (a) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 71, wherein the polypeptide comprises a PUFA synthase activity selected from the group consisting of KS activity, CLF activity, AT activity, ER activity, and combinations thereof;

(b)含有与SEQ ID NO：103至少80％相同的氨基酸序列的多肽，其中所述多肽含有KS活性；(b) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 103, wherein the polypeptide comprises KS activity;

(c)含有与SEQ ID NO：105至少80％相同的氨基酸序列的多肽，其中所述多肽含有CLF活性；(c) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 105, wherein the polypeptide comprises CLF activity;

(d)含有与SEQ ID NO：107至少80％相同的氨基酸序列的多肽，其中所述多肽含有AT活性；(d) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 107, wherein the polypeptide comprises AT activity;

以及(e)含有与SEQ ID NO：109至少80％相同的氨基酸序列的多肽，其中所述多肽含有ER活性。在一些实施方式中，所述氨基酸序列分别与SEQ ID NO：71、103、105、107和109至少90％或者95％相同。在一些实施方式中，所述多肽分别包含SEQ ID NO：71、103、105、107和109的氨基酸序列。and (e) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 109, wherein the polypeptide comprises ER activity. In some embodiments, the amino acid sequence is at least 90% or 95% identical to SEQ ID NOs: 71, 103, 105, 107, and 109, respectively. In some embodiments, the polypeptide comprises the amino acid sequence of SEQ ID NOs: 71, 103, 105, 107, and 109, respectively.

本发明涉及分离多肽，其选自下组：(a)含有与SEQ ID NO：73至少80％相同的氨基酸序列的多肽，其中所述多肽包含选自DH活性、ER活性及其组合的PUFA合酶活性；The present invention relates to an isolated polypeptide selected from the group consisting of: (a) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 73, wherein the polypeptide comprises a PUFA synthase activity selected from the group consisting of DH activity, ER activity, and combinations thereof;

(b)含有与SEQ ID NO：111至少80％相同的氨基酸序列的多肽，其中所述多肽含有DH活性；(b) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 111, wherein the polypeptide comprises DH activity;

(c)含有与SEQ ID NO：113至少80％相同的氨基酸序列的多肽，其中所述多肽含有DH活性；(c) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 113, wherein the polypeptide comprises DH activity;

以及(d)含有与SEQ ID NO：115至少80％相同的氨基酸序列的多肽，其中所述多肽含有ER活性。在一些实施方式中，所述氨基酸序列分别与SEQ ID NO：73、111、113和115至少90％或者95％相同。在一些实施方式中，所述多肽分别包含SEQ ID NO：73、111、113和115的氨基酸序列。and (d) a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 115, wherein the polypeptide comprises ER activity. In some embodiments, the amino acid sequence is at least 90% or 95% identical to SEQ ID NOs: 73, 111, 113, and 115, respectively. In some embodiments, the polypeptide comprises the amino acid sequence of SEQ ID NOs: 73, 111, 113, and 115, respectively.

在一些实施方式中，任何本发明的分离多肽可以是融合多肽。In some embodiments, any of the isolated polypeptides of the present invention can be a fusion polypeptide.

本发明涉及包含上述任何多肽和生物学可接受运载体的组合物。The present invention relates to a composition comprising any of the above-mentioned polypeptides and a biologically acceptable carrier.

本发明涉及一种在具有PUFA合酶活性的有机体中增加DHA、EPA或其组合产量的方法，包括：在可有效产生DHA、EPA或其组合的条件下，于有机体中，表达上述任何分离的核酸分子，上述任何的重组核酸分子，或其组合，其中PUFA合酶活性在有机体中取代失活或缺失的活性、引入新活性或增强已有活性，其中有机体中DHA、EPA或其组合的产量是增加的。The present invention relates to a method for increasing the production of DHA, EPA or a combination thereof in an organism having PUFA synthase activity, comprising: expressing any of the above-described isolated nucleic acid molecules, any of the above-described recombinant nucleic acid molecules, or a combination thereof in the organism under conditions effective to produce DHA, EPA or a combination thereof, wherein the PUFA synthase activity replaces an inactivated or deleted activity, introduces a new activity, or enhances an existing activity in the organism, wherein the production of DHA, EPA or a combination thereof in the organism is increased.

本发明涉及从宿主细胞中分离脂质的方法：包括(a)在能有效产生脂质的条件下，于宿主细胞中表达PUFA合酶基因，其中所述PUFA合酶基因包含任何上述分离的核酸分子、任何上述重组核酸分子或其组合，和(b)从所述宿主细胞中分离脂质。在一些实施方式中，所述宿主细胞选自植物细胞、分离的动物细胞和微生物细胞。在一些实施方式中，所述脂质包括DHA、EPA或其组合。The present invention relates to a method for isolating lipids from a host cell, comprising (a) expressing a PUFA synthase gene in the host cell under conditions effective to produce the lipids, wherein the PUFA synthase gene comprises any of the above-described isolated nucleic acid molecules, any of the above-described recombinant nucleic acid molecules, or a combination thereof, and (b) isolating the lipids from the host cell. In some embodiments, the host cell is selected from a plant cell, an isolated animal cell, and a microbial cell. In some embodiments, the lipids comprise DHA, EPA, or a combination thereof.

附图简要说明Brief Description of the Drawings

图1显示本发明中裂殖壶菌(Schizochytrium sp.)ATCC PTA-9695 PUFA合酶的基因结构。FIG. 1 shows the gene structure of the Schizochytrium sp. ATCC PTA-9695 PUFA synthase of the present invention.

图2显示本发明中破囊壶菌(Thraustochytrium sp.)ATCC PTA-10212 PUFA合酶的基因结构。FIG. 2 shows the gene structure of the Thraustochytrium sp. ATCC PTA-10212 PUFA synthase of the present invention.

图3显示本发明中裂殖壶菌(Schizochytrium sp.)ATCC PTA-9695和破囊壶菌(Thraustochytrium sp.)ATCC PTA-10212 PUFA合酶，以及来自裂殖壶菌(Schizochytrium sp.)ATCC 20888、破囊壶菌(Thraustochytrium sp.)ATCC 20892、金黄色破囊壶菌(Thraustochytrium aureum)和SAM2179的合酶的结构域结构。FIG. 3 shows the domain structures of the Schizochytrium sp. ATCC PTA-9695 and Thraustochytrium sp. ATCC PTA-10212 PUFA synthases of the present invention, as well as of the synthases from Schizochytrium sp. ATCC 20888, Thraustochytrium sp. ATCC 20892, Thraustochytrium aureum, and SAM2179.

图4显示本发明裂殖壶菌(Schizochytrium sp.)ATCC PTA-9695 Pfa1p氨基酸序列(SEQ ID NO：2)和破囊壶菌(Thraustochytrium sp)ATCC PTA-10212 Pfa1p氨基酸序列(SEQ ID NO：69)与来自裂殖壶菌(Schizochytrium sp.)ATCC 20888(SEQ ID NO：54)和破囊壶菌(Thraustochytrium sp)ATCC 20892(SEQ ID NO：56)的OrfA序列以及来自金黄色破囊壶菌(Thraustochytrium aureum)的ORF A序列(SEQ ID NO：55)的比对。Figure 4 shows an alignment of the Schizochytrium sp. ATCC PTA-9695 Pfa1p amino acid sequence (SEQ ID NO: 2) and the Thraustochytrium sp. ATCC PTA-10212 Pfa1p amino acid sequence (SEQ ID NO: 69) of the present invention with the OrfA sequences from Schizochytrium sp. ATCC 20888 (SEQ ID NO: 54) and Thraustochytrium sp. ATCC 20892 (SEQ ID NO: 56), and the ORF A sequence from Thraustochytrium aureum (SEQ ID NO: 55).

图5显示本发明裂殖壶菌(Schizochytrium sp.)ATCC PTA-9695 Pfa2p氨基酸序列(SEQ ID NO：4)和破囊壶菌(Thraustochytrium sp)ATCC PTA-10212 Pfa2p氨基酸序列(SEQ ID NO：71)与来自裂殖壶菌(Schizochytrium sp.)ATCC 20888(SEQ ID NO：57)与破囊壶菌(Thraustochytrium sp)ATCC 20892(SEQ ID NO：58)的OrfB序列以及来自金黄色破囊壶菌(Thraustochytrium aureum)的ORF B序列(SEQ ID NO：59)的比对。Figure 5 shows an alignment of the Schizochytrium sp. ATCC PTA-9695 Pfa2p amino acid sequence (SEQ ID NO: 4) and the Thraustochytrium sp. ATCC PTA-10212 Pfa2p amino acid sequence (SEQ ID NO: 71) of the present invention with the OrfB sequences from Schizochytrium sp. ATCC 20888 (SEQ ID NO: 57) and Thraustochytrium sp. ATCC 20892 (SEQ ID NO: 58), and the ORF B sequence from Thraustochytrium aureum (SEQ ID NO: 59).

图6显示本发明裂殖壶菌(Schizochytrium sp.)ATCC PTA-9695 Pfa3p氨基酸序列(SEQ ID NO：6)和破囊壶菌(Thraustochytrium sp)ATCC PTA-10212 Pfa3p氨基酸序列(SEQ ID NO：73)与来自裂殖壶菌(Schizochytrium sp.)ATCC 20888(SEQ ID NO：61)与破囊壶菌(Thraustochytrium sp)ATCC 20892(SEQ ID NO：60)的ORF C序列的比对。Figure 6 shows an alignment of the Schizochytrium sp. ATCC PTA-9695 Pfa3p amino acid sequence (SEQ ID NO: 6) and the Thraustochytrium sp. ATCC PTA-10212 Pfa3p amino acid sequence (SEQ ID NO: 73) of the present invention with the ORF C sequence from Schizochytrium sp. ATCC 20888 (SEQ ID NO: 61) and Thraustochytrium sp. ATCC 20892 (SEQ ID NO: 60).

图7显示裂殖壶菌(Schizochytrium sp.)ATCC PTA-9695 PFA1的多核苷酸序列(SEQ ID NO：1)。FIG. 7 shows the polynucleotide sequence of Schizochytrium sp. ATCC PTA-9695 PFA1 (SEQ ID NO: 1).

图8显示裂殖壶菌(Schizochytrium sp.)ATCC PTA-9695 Pfa1p的氨基酸序列(SEQ ID NO：2)。FIG. 8 shows the amino acid sequence of Schizochytrium sp. ATCC PTA-9695 Pfa1p (SEQ ID NO: 2).

图9显示裂殖壶菌(Schizochytrium sp.)ATCC PTA-9695 PFA2的多核苷酸序列(SEQ ID NO：3)。FIG. 9 shows the polynucleotide sequence of Schizochytrium sp. ATCC PTA-9695 PFA2 (SEQ ID NO: 3).

图10显示裂殖壶菌(Schizochytrium sp.)ATCC PTA-9695 Pfa2p的氨基酸序列(SEQ ID NO：4)。FIG. 10 shows the amino acid sequence of Schizochytrium sp. ATCC PTA-9695 Pfa2p (SEQ ID NO: 4).

图11显示裂殖壶菌(Schizochytrium sp.)ATCC PTA-9695 PFA3的多核苷酸序列(SEQ ID NO：5)。FIG. 11 shows the polynucleotide sequence of Schizochytrium sp. ATCC PTA-9695 PFA3 (SEQ ID NO: 5).

图12显示裂殖壶菌(Schizochytrium sp.)ATCC PTA-9695 Pfa3p的氨基酸序列(SEQ ID NO：6)。FIG. 12 shows the amino acid sequence of Schizochytrium sp. ATCC PTA-9695 Pfa3p (SEQ ID NO: 6).

图13显示破囊壶菌(Thraustochytrium sp.)ATCC PTA-10212 PFA1的多核苷酸序列(SEQ ID NO：68)。FIG. 13 shows the polynucleotide sequence of Thraustochytrium sp. ATCC PTA-10212 PFA1 (SEQ ID NO: 68).

图14显示为在裂殖壶菌中表达进行密码子优化的破囊壶菌(Thraustochytrium sp.)ATCC PTA-10212 PFA1的多核苷酸序列(SEQ ID NO：120)。FIG. 14 shows the polynucleotide sequence of Thraustochytrium sp. ATCC PTA-10212 PFA1 (SEQ ID NO: 120), codon-optimized for expression in Schizochytrium.

图15显示破囊壶菌(Thraustochytrium sp.)ATCC PTA-10212 Pfa1p的氨基酸序列(SEQ ID NO：69)。FIG. 15 shows the amino acid sequence of Thraustochytrium sp. ATCC PTA-10212 Pfa1p (SEQ ID NO: 69).

图16显示破囊壶菌(Thraustochytrium sp)ATCC PTA-10212 PFA2的多核苷酸序列(SEQ ID NO：70)。FIG. 16 shows the polynucleotide sequence of Thraustochytrium sp. ATCC PTA-10212 PFA2 (SEQ ID NO: 70).

图17显示为在裂殖壶菌中表达进行密码子优化的破囊壶菌(Thraustochytrium sp.)ATCC PTA-10212 PFA2的多核苷酸序列(SEQ ID NO：121)。FIG. 17 shows the polynucleotide sequence of Thraustochytrium sp. ATCC PTA-10212 PFA2 (SEQ ID NO: 121), codon-optimized for expression in Schizochytrium.

图18显示破囊壶菌(Thraustochytrium sp.)ATCC PTA-10212 Pfa2p的氨基酸序列(SEQ ID NO：71)。FIG. 18 shows the amino acid sequence of Thraustochytrium sp. ATCC PTA-10212 Pfa2p (SEQ ID NO: 71).

图19显示破囊壶菌(Thraustochytrium sp)ATCC PTA-10212 PFA3的多核苷酸序列(SEQ ID NO：72)。FIG. 19 shows the polynucleotide sequence of Thraustochytrium sp. ATCC PTA-10212 PFA3 (SEQ ID NO: 72).

图20显示为在裂殖壶菌中表达进行密码子优化的破囊壶菌(Thraustochytrium sp.)ATCC PTA-10212 PFA3的多核苷酸序列(SEQ ID NO：122)。FIG. 20 shows the polynucleotide sequence of Thraustochytrium sp. ATCC PTA-10212 PFA3 (SEQ ID NO: 122), codon-optimized for expression in Schizochytrium.

图21显示破囊壶菌(Thraustochytrium sp.)ATCC PTA-10212 Pfa3p的氨基酸序列(SEQ ID NO：73)。FIG. 21 shows the amino acid sequence of Thraustochytrium sp. ATCC PTA-10212 Pfa3p (SEQ ID NO: 73).

图22显示裂殖壶菌的密码子使用表。Figure 22 shows a codon usage table for Schizochytrium.
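Codon optimization against a host codon usage table, as mentioned for the sequences of SEQ ID NOs: 120, 121, and 122, can be sketched as follows. This is a minimal illustration that assumes the simplest possible strategy, choosing the most frequent codon for every amino acid; the optimization actually used for those sequences may differ. The table `TOY_USAGE` and its frequencies are hypothetical placeholders and are not the values of the Schizochytrium codon usage table.

```python
# Hypothetical codon frequencies per amino acid (one-letter code).
# These numbers are placeholders, NOT the Schizochytrium usage table.
TOY_USAGE = {
    "M": {"ATG": 1.0},
    "A": {"GCC": 0.60, "GCT": 0.20, "GCG": 0.15, "GCA": 0.05},
    "K": {"AAG": 0.90, "AAA": 0.10},
    "L": {"CTC": 0.50, "CTG": 0.30, "CTT": 0.20},
    "*": {"TAA": 0.70, "TGA": 0.20, "TAG": 0.10},
}


def optimize_codons(protein: str, usage: dict) -> str:
    """Back-translate a protein using the most frequent codon per residue."""
    codons = []
    for residue in protein.upper():
        if residue not in usage:
            raise KeyError(f"no codon usage data for residue {residue!r}")
        table = usage[residue]
        # Pick the codon with the highest relative frequency in the host.
        codons.append(max(table, key=table.get))
    return "".join(codons)


if __name__ == "__main__":
    # Toy peptide for illustration only.
    print(optimize_codons("MAKL*", TOY_USAGE))
```

Always choosing the single most frequent codon keeps the sketch short; schemes that sample codons in proportion to their frequency are a common alternative.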

发明详述Detailed Description of the Invention

本发明涉及参与PUFA产生的多不饱和脂肪酸(PUFA)合酶的分离的核酸分子和多肽，所述PUFA包括富含二十二碳六烯酸(DHA)、二十碳五烯酸(EPA)及其组合的PUFA。本发明涉及包含所述核酸分子和所述核酸分子编码的多肽的载体和宿主细胞，包含所述核酸分子或多肽的组合物及其制备方法和用途。The present invention relates to isolated nucleic acid molecules and polypeptides of polyunsaturated fatty acid (PUFA) synthases involved in the production of PUFAs, including PUFAs enriched in docosahexaenoic acid (DHA), eicosapentaenoic acid (EPA), and combinations thereof. The present invention also relates to vectors and host cells comprising the nucleic acid molecules, polypeptides encoded by the nucleic acid molecules, compositions comprising the nucleic acid molecules or polypeptides, and methods of making and uses thereof.

PUFA合酶PUFA synthase

如本文所用术语“PUFA合酶”指参与多不饱和脂肪酸产生的酶。参见例如，Metz等，Science 293：290-293(2001).As used herein, the term "PUFA synthase" refers to an enzyme involved in the production of polyunsaturated fatty acids. See, for example, Metz et al., Science 293:290-293 (2001).

本发明部分涉及三个名为Pfa1p(SEQ ID NO：2或SEQ ID NO：69)、Pfa2p(SEQ ID NO：4或SEQ ID NO：71)和Pfa3p(SEQ ID NO：6或SEQ ID NO：73)的PUFA合酶亚基，以及编码名为PFA1(SEQ ID NO：1、SEQ ID NO：68或SEQ ID NO：120)、PFA2(SEQ ID NO：3、SEQ ID NO：70或SEQ ID NO：121)和PFA3(SEQ ID NO：5、SEQ ID NO：72或SEQ ID NO：122)亚基的基因。参见图1-3和7-21。其他破囊壶菌的PUFA合酶分别记作ORF 1、ORF 2和ORF 3，或OrfA、OrfB、和OrfC。参见美国专利号7,247,461和7,256,022中的裂殖壶菌(Schizochytrium sp.)(ATCC 20888)和破囊壶菌(Thraustochytrium sp.)(ATCC 20892)中记作orfA、orfB和orfC基因以及对应的OrfA、OrfB和OrfC蛋白，以及美国专利号7,368,552中的金黄色破囊壶菌(Thraustochytrium aureum)(ATCC 34304)中记作ORF A、ORF B和ORF C基因和蛋白质。参见WO/2005/097982中的SAM2179菌株，记作ORF 1、ORF 2和ORF 3基因和蛋白质。The present invention relates, in part, to three PUFA synthase subunits designated Pfa1p (SEQ ID NO: 2 or SEQ ID NO: 69), Pfa2p (SEQ ID NO: 4 or SEQ ID NO: 71), and Pfa3p (SEQ ID NO: 6 or SEQ ID NO: 73), as well as the genes encoding these subunits, designated PFA1 (SEQ ID NO: 1, SEQ ID NO: 68, or SEQ ID NO: 120), PFA2 (SEQ ID NO: 3, SEQ ID NO: 70, or SEQ ID NO: 121), and PFA3 (SEQ ID NO: 5, SEQ ID NO: 72, or SEQ ID NO: 122). See FIGS. 1-3 and 7-21. PUFA synthases in other thraustochytrids have been designated ORF 1, ORF 2, and ORF 3, or OrfA, OrfB, and OrfC, respectively. See, for example, U.S. Patent Nos. 7,247,461 and 7,256,022, which refer to the orfA, orfB, and orfC genes and the corresponding OrfA, OrfB, and OrfC proteins of Schizochytrium sp. (ATCC 20888) and Thraustochytrium sp. (ATCC 20892), and U.S. Patent No. 7,368,552, which refers to the ORF A, ORF B, and ORF C genes and proteins of Thraustochytrium aureum (ATCC 34304). See also WO 2005/097982, which refers to the ORF 1, ORF 2, and ORF 3 genes and proteins of strain SAM2179.
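The subunit, gene, and SEQ ID NO assignments listed above can be collected into a small lookup table, which may help when cross-referencing the figures and the sequence listing. The Python sketch below records only the numbers stated in the preceding paragraph and in the figure descriptions; the dictionary layout, key names, and the helper `seq_ids_for` are hypothetical choices made for illustration.

```python
# SEQ ID NO assignments as stated in the text, keyed by subunit and source strain.
# "codon_optimized_gene" is the sequence optimized for expression in Schizochytrium;
# None means no such sequence is listed for that strain.
SEQ_IDS = {
    "Pfa1p": {
        "PTA-9695": {"gene": 1, "protein": 2, "codon_optimized_gene": None},
        "PTA-10212": {"gene": 68, "protein": 69, "codon_optimized_gene": 120},
    },
    "Pfa2p": {
        "PTA-9695": {"gene": 3, "protein": 4, "codon_optimized_gene": None},
        "PTA-10212": {"gene": 70, "protein": 71, "codon_optimized_gene": 121},
    },
    "Pfa3p": {
        "PTA-9695": {"gene": 5, "protein": 6, "codon_optimized_gene": None},
        "PTA-10212": {"gene": 72, "protein": 73, "codon_optimized_gene": 122},
    },
}


def seq_ids_for(subunit: str, strain: str) -> dict:
    """Return the SEQ ID NOs recorded for one subunit in one deposited strain."""
    try:
        return SEQ_IDS[subunit][strain]
    except KeyError as exc:
        raise KeyError(f"no entry for {subunit} in ATCC {strain}") from exc


if __name__ == "__main__":
    print(seq_ids_for("Pfa2p", "PTA-10212"))
```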

Nucleic acid molecules

The present invention relates to isolated nucleic acid molecules comprising polynucleotide sequences of PUFA synthase genes and domains derived from an isolated microorganism that is the subject of co-pending U.S. Application No. 12/407,687, filed on March 19, 2009, which is incorporated herein by reference in its entirety. The microorganism was deposited under the Budapest Treaty at the American Type Culture Collection, Patent Depository, 10801 University Boulevard, Manassas, VA 20110-2209, on January 7, 2009, under ATCC Accession No. PTA-9695, and is also referred to as Schizochytrium sp. ATCC PTA-9695. Expression of these genes produces a unique fatty acid profile characterized in part by high levels of omega-3 fatty acids, in particular high levels of DHA.

The present invention also relates to isolated nucleic acid molecules comprising polynucleotide sequences of PUFA synthase genes and domains derived from an isolated microorganism that is the subject of U.S. Application No. 61/296,456, filed on January 19, 2010, which is incorporated herein by reference in its entirety. The microorganism was deposited under the Budapest Treaty at the American Type Culture Collection, Patent Depository, 10801 University Boulevard, Manassas, VA 20110-2209, on July 14, 2009, under ATCC Accession No. PTA-10212, and is also referred to as Thraustochytrium sp. ATCC PTA-10212. Expression of these genes produces a unique fatty acid profile characterized in part by high levels of omega-3 fatty acids, in particular high levels of DHA, EPA, and combinations thereof.

As used herein, a "polynucleotide" can comprise a conventional phosphodiester bond or a non-conventional bond (e.g., an amide bond, such as that found in peptide nucleic acids (PNA)). A polynucleotide can contain the nucleotide sequence of a full-length cDNA sequence, including the untranslated 5' and 3' sequences and the coding sequences, as well as fragments, epitopes, domains, and variants of the nucleic acid sequence. A polynucleotide can be composed of any polyribonucleotide or polydeoxyribonucleotide, which can be unmodified RNA or DNA or modified RNA or DNA. For example, polynucleotides can be composed of single- and double-stranded DNA, DNA that is a mixture of single- and double-stranded regions, single- and double-stranded RNA, RNA that is a mixture of single- and double-stranded regions, and hybrid molecules comprising DNA and RNA that can be single-stranded or, more typically, double-stranded or a mixture of single- and double-stranded regions. In addition, polynucleotides can be composed of triple-stranded regions comprising RNA or DNA or both RNA and DNA. Polynucleotides can contain ribonucleosides (adenosine, guanosine, uridine, or cytidine; "RNA molecules") or deoxyribonucleosides (deoxyadenosine, deoxyguanosine, deoxythymidine, or deoxycytidine; "DNA molecules"), or any phosphoester analogs thereof, such as phosphorothioates and thioesters. Polynucleotides can also contain one or more modified bases, or DNA or RNA backbones modified for stability or for other reasons. "Modified" bases include, for example, tritylated bases and unusual bases such as inosine. A variety of modifications can be made to DNA and RNA; thus, "polynucleotide" embraces chemically, enzymatically, or metabolically modified forms. The term nucleic acid molecule refers only to the primary and secondary structure of the molecule and does not limit it to any particular tertiary form. Thus, the term includes double-stranded DNA found, inter alia, in linear or circular DNA molecules (e.g., restriction fragments), plasmids, and chromosomes. In discussing the structure of particular double-stranded DNA molecules, sequences can be described herein according to the normal convention of giving only the sequence in the 5' to 3' direction along the non-transcribed strand of DNA (i.e., the strand having a sequence homologous to the mRNA).

The term "isolated" nucleic acid molecule refers to a nucleic acid molecule, DNA or RNA, that has been removed from its native environment. Further examples of isolated nucleic acid molecules include nucleic acid molecules comprising recombinant polynucleotides maintained in heterologous host cells, or purified (partially or substantially) polynucleotides in solution. Isolated RNA molecules include in vivo or in vitro RNA transcripts of polynucleotides of the present invention. Isolated nucleic acid molecules according to the present invention further include such molecules produced synthetically. In addition, a nucleic acid molecule or polynucleotide can include a regulatory element such as a promoter, ribosome binding site, or transcription terminator.

A "gene" refers to an assembly of nucleotides that encodes a polypeptide, and includes cDNA and genomic DNA nucleic acids. "Gene" also refers to a nucleic acid fragment that expresses a specific protein, including intervening sequences (introns) between individual coding segments (exons), as well as regulatory sequences preceding (5' non-coding sequences) and following (3' non-coding sequences) the coding sequence. "Native gene" refers to a gene as found in nature with its own regulatory sequences.

The present invention relates to isolated nucleic acid molecules comprising polynucleotide sequences at least 80% identical to the polynucleotide sequences of Schizochytrium sp. ATCC PTA-9695 PFA1 (SEQ ID NO: 1), Schizochytrium sp. ATCC PTA-9695 PFA2 (SEQ ID NO: 3), Schizochytrium sp. ATCC PTA-9695 PFA3 (SEQ ID NO: 5), Thraustochytrium sp. ATCC PTA-10212 PFA1 (SEQ ID NO: 68 or SEQ ID NO: 120), Thraustochytrium sp. ATCC PTA-10212 PFA2 (SEQ ID NO: 70 or SEQ ID NO: 121), Thraustochytrium sp. ATCC PTA-10212 PFA3 (SEQ ID NO: 72 or SEQ ID NO: 122), and combinations thereof, wherein the polynucleotides encode polypeptides comprising one or more PUFA synthase activities.
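To make the "at least 80% identical" threshold concrete, the following minimal sketch computes percent identity over two sequences that have already been aligned. It is an illustration under stated assumptions only: the inputs are assumed to be pre-aligned strings of equal length with "-" marking gaps, gapped columns are counted as mismatches, and the function names are hypothetical. The way percent identity is actually determined for the claimed sequences is governed by the definition given elsewhere in this document, not by this sketch.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Assumes pre-aligned, equal-length inputs with '-' as the gap character.
    Columns that are gaps in both sequences are ignored; a gap opposite a
    residue counts as a mismatch (an assumption made for this illustration).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # shared gap column carries no information
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the two aligned sequences are at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

With toy inputs such as `"ATGCATGCAT"` and `"ATGCATGCTT"` (nine matching columns out of ten), the check against the 80% threshold passes.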

The PUFA synthase activities are associated with one or more domains in each synthase polypeptide, wherein the domains can be identified by their conserved structural or functional motifs based on their homology to known motifs, and can also be identified based upon their specific biochemical activities. See, e.g., U.S. Patent No. 7,217,856, which is incorporated herein by reference in its entirety. Examples of PUFA synthase domains include: the β-ketoacyl-ACP synthase (KS) domain, malonyl-CoA:ACP acyltransferase (MAT) domain, acyl carrier protein (ACP) domains, ketoreductase (KR) domain, and β-hydroxyacyl-ACP dehydratase (DH) domain in Pfa1p; the KS domain, chain length factor (CLF) domain, acyltransferase (AT) domain, and enoyl-ACP reductase (ER) domain in Pfa2p; and the DH domains and the ER domain in Pfa3p.
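The domain organization just described can be summarized as a simple mapping from subunit to domain types. This is a sketch for orientation only: the variable and function names are hypothetical, each domain type is listed once even where a subunit carries several copies (for example, the multiple ACP domains of Pfa1p), and the list order is not meant to assert the physical order of domains along the polypeptide.

```python
# Domain types per subunit, as enumerated in the paragraph above.
# Names are illustrative; copy number and physical order are not modeled.
DOMAINS_BY_SUBUNIT = {
    "Pfa1p": ["KS", "MAT", "ACP", "KR", "DH"],
    "Pfa2p": ["KS", "CLF", "AT", "ER"],
    "Pfa3p": ["DH", "ER"],
}


def subunits_with(domain: str) -> list:
    """Return the subunits (in dictionary order) that carry the given domain type."""
    wanted = domain.upper()
    return [name for name, domains in DOMAINS_BY_SUBUNIT.items() if wanted in domains]
```

Such a table makes it easy to see, for example, that DH domains occur in both Pfa1p and Pfa3p, whereas the CLF domain is listed only for Pfa2p.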

A polypeptide or a domain of a polypeptide having β-ketoacyl-ACP synthase (KS) biological activity (function) has previously been shown to be capable of carrying out the initial step of the fatty acid elongation reaction cycle. The term "β-ketoacyl-ACP synthase" is used interchangeably with the terms "3-ketoacyl-ACP synthase" and "ketoacyl-ACP synthase." In other systems, it has been shown that the acyl group for elongation is linked to a cysteine residue at the active site of KS by a thioester bond, and the acyl-KS undergoes condensation with malonyl-ACP to form ketoacyl-ACP, CO₂, and unbound ("free") KS. In such systems, KS has been shown to possess greater substrate specificity than other polypeptides of the reaction cycle. Polypeptides (or domains of polypeptides) can be readily identified as belonging to the KS family by homology to known KS sequences.

A polypeptide or a domain of a polypeptide having malonyl-CoA:ACP acyltransferase (MAT) activity has previously been shown to be capable of transferring the malonyl moiety from malonyl-CoA to ACP. The term "malonyl-CoA:ACP acyltransferase" is used interchangeably with "malonyl acyltransferase." In addition to the active site motif (GxSxG), MATs possess an extended motif (R and Q amino acids in key positions). Polypeptides (or domains of polypeptides) can be readily identified as belonging to the MAT family by their homology to known MAT sequences and by their extended motif structure.

A polypeptide or a domain of a polypeptide having acyl carrier protein (ACP) activity has previously been shown to be capable of functioning as a carrier for growing fatty acyl chains via a thioester linkage to a covalently bound cofactor. ACPs are typically about 80 to about 100 amino acids long and are converted from inactive apo-forms to functional holo-forms by transfer of the phosphopantetheinyl moiety of CoA to a highly conserved serine residue of the ACP. Acyl groups are attached to ACPs by a thioester linkage at the free terminus of the phosphopantetheinyl moiety. The presence of variations of an active site motif (LGIDS*) has also been recognized as a signature of ACPs. The functionality of the active site serine (S*) has been demonstrated in a bacterial PUFA synthase (Jiang et al., J. Am. Chem. Soc. 130:6336-7 (2008)). Polypeptides (or domains of polypeptides) can be readily identified as belonging to the ACP family by labeling with radioactive pantetheine and by sequence homology to known ACPs.

A polypeptide or a domain of a polypeptide having dehydrase or dehydratase (DH) activity has previously been shown to be capable of catalyzing a dehydration reaction. Reference to DH activity typically refers to FabA-like β-hydroxyacyl-ACP dehydratase biological activity. FabA-like β-hydroxyacyl-ACP dehydratase biological activity removes HOH from a β-ketoacyl-ACP and initially produces a trans double bond in the carbon chain. The term "FabA-like β-hydroxyacyl-ACP dehydratase" is used interchangeably with the terms "β-hydroxyacyl-ACP dehydratase" and "dehydratase." The DH domains of PUFA synthase systems have previously been shown to be homologous to the bacterial DH enzymes associated with FAS systems, rather than to the DH domains of other PKS systems. See, e.g., U.S. Patent No. 7,217,856, which is incorporated herein by reference in its entirety. A subset of bacterial DHs, the FabA-like DHs, possesses cis-trans isomerase activity (Heath et al., J. Biol. Chem., 271, 27795 (1996)). Based on homology to the FabA-like DH proteins, one or all of the PUFA synthase system DH domains can be responsible for insertion of cis double bonds in the PUFA synthase products. A polypeptide or domain can also have non-FabA-like DH activity, or non-FabA-like β-hydroxyacyl-ACP dehydratase (DH) activity. More specifically, a conserved active site motif of about 13 amino acids in length has been identified in PUFA synthase DH domains: LxxHxxxGxxxxP (the L position in the motif can also be an I). See, e.g., U.S. Patent No. 7,217,856, and Donadio S, Katz L., Gene 111(1):51-60 (1992), each of which is incorporated herein by reference in its entirety. This conserved motif is found in a similar region of all known PUFA synthase sequences and could be responsible for a non-FabA-like dehydration.
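The short motifs quoted in this section (GxSxG for MAT, LGIDS* for ACP, and LxxHxxxGxxxxP for DH, where the leading L may also be I) can be located in a protein sequence with ordinary regular expressions. The sketch below is illustrative only: the function name and dictionary keys are assumptions, "x" is read as "any single residue," only the exact LGIDS form is searched (variants of that motif would need a looser pattern), and a motif hit on its own does not establish that a domain has the corresponding activity.

```python
import re
from typing import Dict, List

# Motifs as quoted in the text; "." stands for the "x" (any residue) positions.
MOTIF_PATTERNS = {
    "MAT GxSxG": r"G.S.G",
    "ACP LGIDS": r"LGIDS",               # the final S is the active-site serine (S*)
    "DH LxxHxxxGxxxxP": r"[LI]..H...G....P",  # 13 residues; first position L or I
}


def find_motifs(protein: str) -> Dict[str, List[int]]:
    """Return 1-based start positions of every (possibly overlapping) motif hit."""
    seq = protein.upper()
    hits = {}
    for name, pattern in MOTIF_PATTERNS.items():
        # A zero-width lookahead reports overlapping occurrences as well.
        regex = re.compile("(?=(" + pattern + "))")
        hits[name] = [match.start() + 1 for match in regex.finditer(seq)]
    return hits
```

In practice such a scan would be applied to a translated open reading frame and used only as a first-pass filter ahead of full homology comparison against known domain sequences.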

A polypeptide or a domain of a polypeptide having β-ketoacyl-ACP reductase (KR) activity has previously been shown to be capable of catalyzing the pyridine-nucleotide-dependent reduction of 3-ketoacyl forms of ACP. The term "β-ketoacyl-ACP reductase" is used interchangeably with the terms "ketoreductase," "3-ketoacyl-ACP reductase," and "ketoacyl ACP reductase." It has been determined in other systems that KR function involves the first reductive step in the de novo fatty acid biosynthesis elongation cycle. Polypeptides (or domains of polypeptides) can be readily identified as belonging to the KR family by sequence homology to known PUFA synthase KRs.

A polypeptide or a domain of a polypeptide having chain length factor (CLF) activity has previously been defined as having one or more of the following activities or characteristics:

(1) it can determine the number of elongation cycles and hence the chain length of the end product,

(2) it has homology to KS but lacks the KS active site cysteine,

(3) it can heterodimerize with KS,

(4) it can provide the initial acyl group to be elongated,

or (5) it can decarboxylate malonate (as malonyl-ACP), thus forming an acetate group that can be transferred to the KS active site and that can act as a "priming" molecule that undergoes the initial elongation (condensation) reaction. A CLF domain is found in all PUFA synthase systems identified to date, and in each case it is found as part of a multidomain protein. Polypeptides (or domains of polypeptides) can be readily identified as belonging to the CLF family by sequence homology to known PUFA synthase CLFs.

A polypeptide or a domain of a polypeptide having acyltransferase (AT) activity has previously been defined as having one or more of the following activities or characteristics:

(1) it can transfer the fatty acyl group from the ACP domain(s) to water (i.e., act as a thioesterase), releasing the fatty acyl group as a free fatty acid,

(2) it can transfer a fatty acyl group to an acceptor such as CoA,

(3) it can transfer the acyl group among the various ACP domains,

or (4) it can transfer the fatty acyl group to a lipophilic acceptor molecule (e.g., to lysophosphatidic acid). Polypeptides (or domains of polypeptides) can be readily identified as belonging to the AT family by sequence homology to known PUFA synthase ATs.

A polypeptide or a domain of a polypeptide having enoyl-ACP reductase (ER) biological activity has previously been shown to be capable of reducing the trans double bond (introduced by the DH activity) in the fatty acyl-ACP, resulting in saturation of the associated carbons. The ER domain in PUFA synthase systems has previously been shown to have homology to a family of ER enzymes (Heath et al., Nature 406:145-146 (2000), incorporated by reference herein in its entirety), and an ER homolog has been shown to function as an enoyl-ACP reductase in vitro (Bumpus et al., J. Am. Chem. Soc., 130:11614-11616 (2008), incorporated by reference herein in its entirety). The term "enoyl-ACP reductase" is used interchangeably with "enoyl reductase" and "enoyl ACP-reductase." Polypeptides (or domains of polypeptides) can be readily identified as belonging to the ER family by sequence homology to known PUFA synthase ERs.

In some embodiments, the present invention relates to nucleic acid molecules comprising a polynucleotide sequence at least 80% identical to a polynucleotide sequence within PFA1 (SEQ ID NO: 1, SEQ ID NO: 68, or SEQ ID NO: 120) that encodes one or more PUFA synthase domains. In some embodiments, the nucleic acid molecule comprises a polynucleotide sequence at least 80% identical to a polynucleotide sequence within PFA1 (SEQ ID NO: 1, SEQ ID NO: 68, or SEQ ID NO: 120) that encodes one or more PUFA synthase domains, such as a KS domain (SEQ ID NO: 7 or SEQ ID NO: 74), a MAT domain (SEQ ID NO: 9 or SEQ ID NO: 76), an ACP domain (such as any one of SEQ ID NOs: 13, 15, 17, 19, 21, 23, 80, 82, 84, 86, 88, 90, 92, 94, 96, or 98), a combination of two or more ACP domains, such as 2, 3, 4, 5, 6, 7, 8, 9, or 10 ACP domains, including tandem domains (SEQ ID NO: 11 or SEQ ID NO: 78, and portions thereof), a KR domain (SEQ ID NO: 25 or SEQ ID NO: 100), a DH domain (SEQ ID NO: 27 or SEQ ID NO: 118), and combinations thereof. In some embodiments, the nucleic acid molecule comprises two or more polynucleotide sequences, wherein each of the at least two or more polynucleotide sequences is 80% identical to a polynucleotide sequence within PFA1 (SEQ ID NO: 1, SEQ ID NO: 68, or SEQ ID NO: 120) that encodes one or more PUFA synthase domains. In some embodiments, the at least two or more polynucleotide sequences are 80% identical to the same polynucleotide sequence within SEQ ID NO: 1, SEQ ID NO: 68, or SEQ ID NO: 120 that encodes one or more PUFA synthase domains. In some embodiments, the at least two or more polynucleotide sequences are 80% identical to different polynucleotide sequences within SEQ ID NO: 1, SEQ ID NO: 68, or SEQ ID NO: 120 that each encode one or more PUFA synthase domains. In some embodiments, the at least two or more polynucleotide sequences are 80% identical to different polynucleotide sequences within SEQ ID NO: 1, SEQ ID NO: 68, or SEQ ID NO: 120, wherein the at least two or more polynucleotide sequences are located in the same order or a different order in the nucleic acid molecule as compared to the order of the corresponding sequences within SEQ ID NO: 1, SEQ ID NO: 68, or SEQ ID NO: 120. In some embodiments, each of the at least two or more polynucleotide sequences is at least 80% identical to a polynucleotide sequence within PFA1 (SEQ ID NO: 1, SEQ ID NO: 68, or SEQ ID NO: 120) that encodes one or more PUFA synthase domains, such as a KS domain (SEQ ID NO: 7 or SEQ ID NO: 74), a MAT domain (SEQ ID NO: 9 or SEQ ID NO: 76), an ACP domain (such as any one of SEQ ID NOs: 13, 15, 17, 19, 21, 23, 80, 82, 84, 86, 88, 90, 92, 94, 96, or 98), a combination of two or more ACP domains, such as 2, 3, 4, 5, 6, 7, 8, 9, or 10 ACP domains, including tandem domains (SEQ ID NO: 11 or SEQ ID NO: 78, and portions thereof), a KR domain (SEQ ID NO: 25 or SEQ ID NO: 100), a DH domain (SEQ ID NO: 27 or SEQ ID NO: 118), and combinations thereof. In some embodiments, the nucleic acid molecule comprises one or more polynucleotide sequences within PFA1 (SEQ ID NO: 1, SEQ ID NO: 68, or SEQ ID NO: 120) that encode one or more PUFA synthase domains, including one or more copies of any individual domain in combination with one or more copies of any other individual domain.

In some embodiments, the present invention relates to nucleic acid molecules comprising a polynucleotide sequence at least 80% identical to a polynucleotide sequence within PFA2 (SEQ ID NO: 3, SEQ ID NO: 70, or SEQ ID NO: 121) that encodes one or more PUFA synthase domains. In some embodiments, the nucleic acid molecule comprises a polynucleotide sequence at least 80% identical to a polynucleotide sequence within PFA2 (SEQ ID NO: 3, SEQ ID NO: 70, or SEQ ID NO: 121) that encodes one or more PUFA synthase domains, such as a KS domain (SEQ ID NO: 29 or SEQ ID NO: 102), a CLF domain (SEQ ID NO: 31 or SEQ ID NO: 104), an AT domain (SEQ ID NO: 33 or SEQ ID NO: 106), an ER domain (SEQ ID NO: 35 or SEQ ID NO: 108), and combinations thereof. In some embodiments, the nucleic acid molecule comprises two or more polynucleotide sequences, wherein the at least two or more polynucleotide sequences are 80% identical to a polynucleotide sequence within PFA2 (SEQ ID NO: 3, SEQ ID NO: 70, or SEQ ID NO: 121) that encodes one or more PUFA synthase domains. In some embodiments, the at least two or more polynucleotide sequences are 80% identical to the same polynucleotide sequence within SEQ ID NO: 3, SEQ ID NO: 70, or SEQ ID NO: 121 that encodes one or more PUFA synthase domains. In some embodiments, the at least two or more polynucleotide sequences are 80% identical to different polynucleotide sequences within SEQ ID NO: 3, SEQ ID NO: 70, or SEQ ID NO: 121 that each encode one or more PUFA synthase domains. In some embodiments, the at least two or more polynucleotide sequences are 80% identical to different polynucleotide sequences within SEQ ID NO: 3, SEQ ID NO: 70, or SEQ ID NO: 121, wherein the at least two or more polynucleotide sequences are located in the same order or a different order in the nucleic acid molecule as compared to the order of the corresponding sequences within SEQ ID NO: 3, SEQ ID NO: 70, or SEQ ID NO: 121. In some embodiments, each of the at least two or more polynucleotide sequences is 80% identical to a polynucleotide sequence within PFA2 (SEQ ID NO: 3, SEQ ID NO: 70, or SEQ ID NO: 121) that encodes one or more PUFA synthase domains, such as a KS domain (SEQ ID NO: 29 or SEQ ID NO: 102), a CLF domain (SEQ ID NO: 31 or SEQ ID NO: 104), an AT domain (SEQ ID NO: 33 or SEQ ID NO: 106), an ER domain (SEQ ID NO: 35 or SEQ ID NO: 108), and combinations thereof. In some embodiments, the nucleic acid molecule comprises one or more polynucleotide sequences within PFA2 (SEQ ID NO: 3, SEQ ID NO: 70, or SEQ ID NO: 121) that encode one or more PUFA synthase domains, including one or more copies of any individual domain in combination with one or more copies of any other individual domain.

In some embodiments, the present invention relates to nucleic acid molecules comprising a polynucleotide sequence at least 80% identical to a polynucleotide sequence within PFA3 (SEQ ID NO: 5, SEQ ID NO: 72, or SEQ ID NO: 122) that encodes one or more PUFA synthase domains. In some embodiments, the nucleic acid molecule comprises a polynucleotide sequence at least 80% identical to a polynucleotide sequence within PFA3 (SEQ ID NO: 5, SEQ ID NO: 72, or SEQ ID NO: 122) that encodes one or more PUFA synthase domains, such as a DH domain (SEQ ID NO: 37, SEQ ID NO: 39, SEQ ID NO: 110, or SEQ ID NO: 112), an ER domain (SEQ ID NO: 41 or SEQ ID NO: 114), and combinations thereof. In some embodiments, the nucleic acid molecule comprises two or more polynucleotide sequences, wherein the at least two or more polynucleotide sequences are 80% identical to a polynucleotide sequence within PFA3 (SEQ ID NO: 5, SEQ ID NO: 72, or SEQ ID NO: 122) that encodes one or more PUFA synthase domains. In some embodiments, the at least two or more polynucleotide sequences are 80% identical to the same polynucleotide sequence within SEQ ID NO: 5, SEQ ID NO: 72, or SEQ ID NO: 122 that encodes one or more PUFA synthase domains. In some embodiments, the at least two or more polynucleotide sequences are 80% identical to different polynucleotide sequences within SEQ ID NO: 5, SEQ ID NO: 72, or SEQ ID NO: 122 that each encode one or more PUFA synthase domains. In some embodiments, the at least two or more polynucleotide sequences are 80% identical to different polynucleotide sequences within SEQ ID NO: 5, SEQ ID NO: 72, or SEQ ID NO: 122, wherein the at least two or more polynucleotide sequences are located in the same order or a different order in the nucleic acid molecule as compared to the order of the corresponding sequences within SEQ ID NO: 5, SEQ ID NO: 72, or SEQ ID NO: 122. In some embodiments, each of the at least two or more polynucleotide sequences is 80% identical to a polynucleotide sequence within PFA3 (SEQ ID NO: 5, SEQ ID NO: 72, or SEQ ID NO: 122) that encodes one or more PUFA synthase domains, such as a DH domain (SEQ ID NO: 37, SEQ ID NO: 39, SEQ ID NO: 110, or SEQ ID NO: 112), an ER domain (SEQ ID NO: 41 or SEQ ID NO: 114), and combinations thereof. In some embodiments, the nucleic acid molecule comprises one or more polynucleotide sequences within PFA3 (SEQ ID NO: 5, SEQ ID NO: 72, or SEQ ID NO: 122) that encode one or more PUFA synthase domains, including one or more copies of any individual domain in combination with one or more copies of any other individual domain.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 1, SEQ ID NO: 68, or SEQ ID NO: 120, wherein the polynucleotide sequence encodes a polypeptide comprising a PUFA synthase activity selected from the group consisting of KS activity, MAT activity, ACP activity, KR activity, DH activity, and combinations thereof.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 7 or SEQ ID NO: 74, wherein the polynucleotide sequence encodes a polypeptide comprising KS activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence at least 80% identical to SEQ ID NO: 9 or SEQ ID NO: 76, wherein the polynucleotide sequence encodes a polypeptide comprising MAT activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence that is at least 80% identical to any one of SEQ ID NO: 13, 15, 17, 19, 21, 23, 80, 82, 84, 86, 88, 90, 92, 94, 96, or 98, wherein the polynucleotide sequence encodes a polypeptide comprising ACP activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence that is at least 80% identical to SEQ ID NO: 11 or SEQ ID NO: 78, wherein the polynucleotide sequence encodes a polypeptide comprising ACP activity.

In some embodiments, the nucleic acid molecule comprises a polynucleotide sequence that is at least 80% identical to a polynucleotide sequence within SEQ ID NO: 11 that encodes one, two, three, four, five, or six ACP domains, wherein the polynucleotide sequence encodes a polypeptide comprising ACP activity associated with one or more ACP domains. SEQ ID NOs: 13, 15, 17, 19, 21, and 23 are representative polynucleotide sequences within SEQ ID NO: 11 that each encode a single ACP domain.

In some embodiments, the nucleic acid molecule comprises a polynucleotide sequence that is at least 80% identical to a polynucleotide sequence within SEQ ID NO: 78 that encodes one, two, three, four, five, six, seven, eight, nine, or ten ACP domains, wherein the polynucleotide sequence encodes a polypeptide comprising ACP activity associated with one or more ACP domains. SEQ ID NOs: 80, 82, 84, 86, 88, 90, 92, 94, 96, and 98 are representative polynucleotide sequences within SEQ ID NO: 78 that each encode a single ACP domain.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence that is at least 80% identical to SEQ ID NO: 25 or SEQ ID NO: 100, wherein the polynucleotide sequence encodes a polypeptide comprising KR activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence that is at least 80% identical to SEQ ID NO: 27 or SEQ ID NO: 118, wherein the polynucleotide sequence encodes a polypeptide comprising DH activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence that is at least 80% identical to SEQ ID NO: 3, SEQ ID NO: 70, or SEQ ID NO: 121, wherein the polynucleotide sequence encodes a polypeptide comprising a PUFA synthase activity selected from the group consisting of KS activity, CLF activity, AT activity, ER activity, and combinations thereof.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence that is at least 80% identical to SEQ ID NO: 29 or SEQ ID NO: 102, wherein the polynucleotide sequence encodes a polypeptide comprising KS activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence that is at least 80% identical to SEQ ID NO: 31 or SEQ ID NO: 104, wherein the polynucleotide sequence encodes a polypeptide comprising CLF activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence that is at least 80% identical to SEQ ID NO: 33 or SEQ ID NO: 106, wherein the polynucleotide sequence encodes a polypeptide comprising AT activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence that is at least 80% identical to SEQ ID NO: 35 or SEQ ID NO: 108, wherein the polynucleotide sequence encodes a polypeptide comprising ER activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence that is at least 80% identical to any one of SEQ ID NO: 5, SEQ ID NO: 72, or SEQ ID NO: 122, wherein the polynucleotide sequence encodes a polypeptide comprising a PUFA synthase activity selected from the group consisting of DH activity, ER activity, and combinations thereof.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence that is at least 80% identical to SEQ ID NO: 37, wherein the polynucleotide sequence encodes a polypeptide comprising DH activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence that is at least 80% identical to SEQ ID NO: 39, wherein the polynucleotide sequence encodes a polypeptide comprising DH activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence that is at least 80% identical to SEQ ID NO: 110, wherein the polynucleotide sequence encodes a polypeptide comprising DH activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence that is at least 80% identical to SEQ ID NO: 112, wherein the polynucleotide sequence encodes a polypeptide comprising DH activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence that is at least 80% identical to SEQ ID NO: 41 or SEQ ID NO: 114, wherein the polynucleotide sequence encodes a polypeptide comprising ER activity.

The present invention relates to an isolated nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to the amino acid sequence of Pfa1p (SEQ ID NO: 2 or SEQ ID NO: 69), Pfa2p (SEQ ID NO: 4 or SEQ ID NO: 71), or Pfa3p (SEQ ID NO: 6 or SEQ ID NO: 73), and wherein the polynucleotide sequence encodes a polypeptide comprising one or more PUFA synthase activities.

The present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to the amino acid sequence of one or more PUFA synthase domains of a PUFA synthase of the present invention.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to an amino acid sequence within Pfa1p (SEQ ID NO: 2 or SEQ ID NO: 69) comprising one or more PUFA synthase domains. In some embodiments, the polypeptide comprises an amino acid sequence that is at least 80% identical to an amino acid sequence within Pfa1p (SEQ ID NO: 2 or SEQ ID NO: 69) comprising one or more PUFA synthase domains, such as a KS domain (SEQ ID NO: 8 or SEQ ID NO: 75), a MAT domain (SEQ ID NO: 10 or SEQ ID NO: 77), an ACP domain (such as any one of SEQ ID NOs: 14, 16, 18, 20, 22, 24, 81, 83, 85, 87, 89, 91, 93, 95, 97, or 99), a combination of two or more ACP domains, such as a combination of 2, 3, 4, 5, 6, 7, 8, 9, or 10 ACP domains, including tandem domains (SEQ ID NO: 12 or SEQ ID NO: 79, and portions thereof), a KR domain (SEQ ID NO: 26 or SEQ ID NO: 101), a DH domain (SEQ ID NO: 28 or SEQ ID NO: 119), and combinations thereof. In some embodiments, the polypeptide comprises two or more amino acid sequences, wherein each of the at least two or more amino acid sequences is 80% identical to an amino acid sequence within Pfa1p (SEQ ID NO: 2 or SEQ ID NO: 69) comprising one or more PUFA synthase domains. In some embodiments, the at least two or more amino acid sequences are 80% identical to the same amino acid sequence within Pfa1p (SEQ ID NO: 2 or SEQ ID NO: 69) comprising one or more PUFA synthase domains. In some embodiments, the at least two or more amino acid sequences are 80% identical to different amino acid sequences within Pfa1p (SEQ ID NO: 2 or SEQ ID NO: 69) that each comprise one or more PUFA synthase domains. In some embodiments, the at least two or more amino acid sequences are 80% identical to different amino acid sequences within Pfa1p (SEQ ID NO: 2 or SEQ ID NO: 69), and the at least two or more amino acid sequences are arranged in the polypeptide in the same or a different order compared to the order of the corresponding domains within Pfa1p (SEQ ID NO: 2 or SEQ ID NO: 69). In some embodiments, the at least two or more amino acid sequences are 80% identical to an amino acid sequence within Pfa1p (SEQ ID NO: 2 or SEQ ID NO: 69) comprising one or more PUFA synthase domains, such as a KS domain (SEQ ID NO: 8 or SEQ ID NO: 75), a MAT domain (SEQ ID NO: 10 or SEQ ID NO: 77), an ACP domain (such as any one of SEQ ID NOs: 14, 16, 18, 20, 22, 24, 81, 83, 85, 87, 89, 91, 93, 95, 97, or 99), a combination of two or more ACP domains, such as a combination of 2, 3, 4, 5, 6, 7, 8, 9, or 10 ACP domains, including tandem domains (SEQ ID NO: 12 or SEQ ID NO: 79, and portions thereof), a KR domain (SEQ ID NO: 26 or SEQ ID NO: 101), a DH domain (SEQ ID NO: 28 or SEQ ID NO: 119), and combinations thereof. In some embodiments, the polypeptide comprises one or more amino acid sequences within Pfa1p (SEQ ID NO: 2 or SEQ ID NO: 69) comprising one or more PUFA synthase domains, including one or more copies of any individual domain in combination with one or more copies of any other individual domain.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to an amino acid sequence within Pfa2p (SEQ ID NO: 4 or SEQ ID NO: 71) comprising one or more PUFA synthase domains. In some embodiments, the polypeptide comprises an amino acid sequence that is at least 80% identical to an amino acid sequence within Pfa2p (SEQ ID NO: 4 or SEQ ID NO: 71) comprising one or more PUFA synthase domains, such as a KS domain (SEQ ID NO: 30 or SEQ ID NO: 103), a CLF domain (SEQ ID NO: 32 or SEQ ID NO: 105), an AT domain (SEQ ID NO: 34 or SEQ ID NO: 107), an ER domain (SEQ ID NO: 36 or SEQ ID NO: 109), and combinations thereof. In some embodiments, the polypeptide comprises two or more amino acid sequences, wherein each of the at least two or more amino acid sequences is 80% identical to an amino acid sequence within Pfa2p (SEQ ID NO: 4 or SEQ ID NO: 71) comprising one or more PUFA synthase domains. In some embodiments, the at least two or more amino acid sequences are 80% identical to the same amino acid sequence within Pfa2p (SEQ ID NO: 4 or SEQ ID NO: 71). In some embodiments, the at least two or more amino acid sequences are 80% identical to different amino acid sequences within Pfa2p (SEQ ID NO: 4 or SEQ ID NO: 71) that each comprise one or more PUFA synthase domains. In some embodiments, the at least two or more amino acid sequences are 80% identical to different amino acid sequences within Pfa2p (SEQ ID NO: 4 or SEQ ID NO: 71), and the at least two or more amino acid sequences are arranged in the polypeptide in the same or a different order compared to the order of the corresponding domains within Pfa2p (SEQ ID NO: 4 or SEQ ID NO: 71). In some embodiments, the at least two or more amino acid sequences are 80% identical to an amino acid sequence within Pfa2p (SEQ ID NO: 4 or SEQ ID NO: 71) comprising one or more PUFA synthase domains, such as a KS domain (SEQ ID NO: 30 or SEQ ID NO: 103), a CLF domain (SEQ ID NO: 32 or SEQ ID NO: 105), an AT domain (SEQ ID NO: 34 or SEQ ID NO: 107), an ER domain (SEQ ID NO: 36 or SEQ ID NO: 109), and combinations thereof. In some embodiments, the polypeptide comprises one or more amino acid sequences within Pfa2p (SEQ ID NO: 4 or SEQ ID NO: 71) comprising one or more PUFA synthase domains, including one or more copies of any individual domain in combination with one or more copies of any other individual domain.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to an amino acid sequence within Pfa3p (SEQ ID NO: 6 or SEQ ID NO: 73) comprising one or more PUFA synthase domains. In some embodiments, the polypeptide comprises an amino acid sequence that is at least 80% identical to an amino acid sequence within Pfa3p (SEQ ID NO: 6 or SEQ ID NO: 73) comprising one or more PUFA synthase domains, such as a DH domain (SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 111, or SEQ ID NO: 113), an ER domain (SEQ ID NO: 42 or SEQ ID NO: 115), and combinations thereof. In some embodiments, the polypeptide comprises two or more amino acid sequences, wherein each of the at least two or more amino acid sequences is 80% identical to an amino acid sequence within Pfa3p (SEQ ID NO: 6 or SEQ ID NO: 73) comprising one or more PUFA synthase domains. In some embodiments, the at least two or more amino acid sequences are 80% identical to the same amino acid sequence within Pfa3p (SEQ ID NO: 6 or SEQ ID NO: 73) comprising one or more PUFA synthase domains. In some embodiments, the at least two or more amino acid sequences are 80% identical to different amino acid sequences within Pfa3p (SEQ ID NO: 6 or SEQ ID NO: 73) that each comprise one or more PUFA synthase domains. In some embodiments, the at least two or more amino acid sequences are 80% identical to different amino acid sequences within Pfa3p (SEQ ID NO: 6 or SEQ ID NO: 73), and the at least two or more amino acid sequences are arranged in the polypeptide in the same or a different order compared to the order of the corresponding domains within Pfa3p (SEQ ID NO: 6 or SEQ ID NO: 73). In some embodiments, the at least two or more amino acid sequences are 80% identical to an amino acid sequence within Pfa3p (SEQ ID NO: 6 or SEQ ID NO: 73) comprising one or more PUFA synthase domains, such as a DH domain (SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 111, or SEQ ID NO: 113), an ER domain (SEQ ID NO: 42 or SEQ ID NO: 115), and combinations thereof. In some embodiments, the polypeptide comprises one or more amino acid sequences within Pfa3p (SEQ ID NO: 6 or SEQ ID NO: 73) comprising one or more PUFA synthase domains, including one or more copies of any individual domain in combination with one or more copies of any other individual domain.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 2 or SEQ ID NO: 69, and the polypeptide comprises a PUFA synthase activity selected from the group consisting of KS activity, MAT activity, ACP activity, KR activity, DH activity, and combinations thereof.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 8 or SEQ ID NO: 75, and the polypeptide comprises KS activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 10 or SEQ ID NO: 77, and the polypeptide comprises MAT activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to any one of SEQ ID NO: 14, 16, 18, 20, 22, 24, 81, 83, 87, 89, 91, 93, 95, 97, or 99, and the polypeptide comprises ACP activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 12 or SEQ ID NO: 79, and the polypeptide comprises ACP activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 12, and the polypeptide comprises ACP activity. In some embodiments, the amino acid sequence is at least 80% identical to an amino acid sequence within SEQ ID NO: 12 comprising one, two, three, four, five, or six ACP domains, wherein the polypeptide comprises ACP activity associated with one or more ACP domains. SEQ ID NOs: 14, 16, 18, 20, 22, and 24 are representative amino acid sequences within SEQ ID NO: 12 that each comprise a single ACP domain.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 79, and the polypeptide comprises ACP activity. In some embodiments, the amino acid sequence is at least 80% identical to an amino acid sequence within SEQ ID NO: 79 comprising one, two, three, four, five, six, seven, eight, nine, or ten ACP domains, wherein the polypeptide comprises ACP activity associated with one or more ACP domains. SEQ ID NOs: 81, 83, 85, 87, 89, 91, 93, 95, 97, and 99 are representative amino acid sequences within SEQ ID NO: 79 that each comprise a single ACP domain.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 26 or SEQ ID NO: 101, and the polypeptide comprises KR activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 28 or SEQ ID NO: 119, and the polypeptide comprises DH activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 4 or SEQ ID NO: 71, and the polypeptide comprises a PUFA synthase activity selected from the group consisting of KS activity, CLF activity, AT activity, ER activity, and combinations thereof.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 30 or SEQ ID NO: 103, and the polypeptide comprises KS activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 32 or SEQ ID NO: 105, and the polypeptide comprises CLF activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 34 or SEQ ID NO: 107, and the polypeptide comprises AT activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 36 or SEQ ID NO: 109, and the polypeptide comprises ER activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 6 or SEQ ID NO: 73, and the polypeptide comprises a PUFA synthase activity selected from the group consisting of DH activity, ER activity, and combinations thereof.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 38, and the polypeptide comprises DH activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 40, and the polypeptide comprises DH activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 111, and the polypeptide comprises DH activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 113, and the polypeptide comprises DH activity.

In some embodiments, the present invention relates to a nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide, wherein the polypeptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 42 or SEQ ID NO: 115, and the polypeptide comprises ER activity.

In some embodiments, the nucleic acid molecule comprises a polynucleotide sequence that is at least about 80%, 85%, or 90% identical to the polynucleotide sequences reported herein, or at least about 95%, 96%, 97%, 98%, 99%, or 100% identical to the polynucleotide sequences reported herein. The term "percent identity," as known in the art, is a relationship between two or more amino acid sequences or two or more polynucleotide sequences, as determined by comparing the sequences. In the art, "identity" also means the degree of sequence relatedness between amino acid or polynucleotide sequences, as the case may be, as determined by the match between strings of such sequences.

For example, a nucleic acid molecule having a polynucleotide sequence that is 95% "identical" to a reference polynucleotide sequence of the present invention has a polynucleotide sequence that is identical to the reference sequence, except that the polynucleotide sequence can differ from the reference polynucleotide sequence at up to five nucleotides per 100 nucleotides. In other words, to obtain a polynucleotide sequence that is at least 95% identical to a reference polynucleotide sequence, up to 5% of the nucleotides in the reference sequence can be deleted or substituted with other nucleotides, or a number of nucleotides up to 5% of the total nucleotides in the reference sequence can be inserted into the reference sequence.
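The arithmetic behind the 95% example can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length with `-` marking gaps, and it assumes identity is counted as matching columns divided by the total number of alignment columns, which is only one of several conventions in use and is not stated in this description. The function names `percent_identity` and `meets_threshold` are hypothetical.

```python
def percent_identity(aligned_query: str, aligned_reference: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumption (not from the source): the denominator is the number of
    alignment columns, so substitutions, deletions, and insertions all
    count as differences.
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_query:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for q, r in zip(aligned_query.upper(), aligned_reference.upper())
        if q == r and q != "-"  # a gap never counts as a match
    )
    return 100.0 * matches / len(aligned_query)


def meets_threshold(aligned_query: str, aligned_reference: str,
                    threshold: float = 80.0) -> bool:
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_query, aligned_reference) >= threshold


if __name__ == "__main__":
    reference = "ACGT" * 25          # 100 nucleotides
    query = "NNNNN" + reference[5:]  # 5 substituted positions out of 100
    print(percent_identity(query, reference))
    print(meets_threshold(query, reference, threshold=95.0))
```

With five differing positions per 100 aligned nucleotides, the pair sits exactly at the 95% boundary described in the text.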

As a practical matter, whether any particular polynucleotide sequence or amino acid sequence is at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical to a polynucleotide sequence or amino acid sequence of the present invention can generally be determined using known computer programs. The best overall match between a query sequence (a sequence of the present invention) and a subject sequence can be determined by aligning the sequences and calculating an identity score. Alignments are performed using the computer program AlignX, a component of the Vector NTI suite from Invitrogen (www.invitrogen.com). Amino acid and polynucleotide sequences are aligned using ClustalW alignment software (Thompson, J.D., et al. Nucl. Acids Res. 22:4673-4680 (1994)). The default scoring matrices Blosum62mt2 and swgapdnamt are used for amino acid and polynucleotide sequence alignments, respectively. For amino acid sequences, the default gap opening penalty is 10 and the gap extension penalty is 0.1. For polynucleotide sequences, the default gap opening penalty is 15 and the gap extension penalty is 6.66.
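To show how the default penalties above differ in effect, the following Python sketch records the stated values and evaluates an affine gap cost. The cost formula (opening penalty plus extension penalty times gap length) is an assumed, commonly used convention; the exact formula applied by AlignX or ClustalW is not given in this description and may differ. The names `AlignmentDefaults`, `DEFAULTS`, and `affine_gap_penalty` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignmentDefaults:
    scoring_matrix: str
    gap_open: float
    gap_extend: float


# Values taken from the text; the dictionary keys are illustrative labels.
DEFAULTS = {
    "protein": AlignmentDefaults("Blosum62mt2", gap_open=10.0, gap_extend=0.1),
    "dna": AlignmentDefaults("swgapdnamt", gap_open=15.0, gap_extend=6.66),
}


def affine_gap_penalty(gap_length: int, params: AlignmentDefaults) -> float:
    """Cost of one contiguous gap under an assumed affine model:
    gap_open + gap_extend * gap_length (zero for a gap of length 0)."""
    if gap_length < 0:
        raise ValueError("gap_length must be non-negative")
    if gap_length == 0:
        return 0.0
    return params.gap_open + params.gap_extend * gap_length


if __name__ == "__main__":
    for kind, params in DEFAULTS.items():
        print(kind, params.scoring_matrix, affine_gap_penalty(3, params))
```

Under this assumed model, lengthening a gap costs far more with the polynucleotide defaults than with the amino acid defaults, where nearly all of the cost comes from opening the gap.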

The present invention relates to an isolated nucleic acid molecule comprising a polynucleotide encoding a polypeptide comprising PUFA synthase activity selected from the group consisting of KS activity, MAT activity, ACP activity, KR activity, CLF activity, AT activity, ER activity, DH activity, and combinations thereof, wherein the polynucleotide hybridizes under stringent conditions to the complement of any of the above polynucleotide sequences.

A nucleic acid molecule is "hybridizable" to another nucleic acid molecule (e.g., cDNA, genomic DNA, or RNA) when a single-stranded form of the molecule can anneal to the other nucleic acid molecule under appropriate conditions of temperature and solution ionic strength. Hybridization and wash conditions are well known and readily exemplified. See, e.g., Sambrook J. and Russell D. 2001. Molecular cloning: A laboratory manual, 3rd edition. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York. The conditions of temperature and ionic strength determine the "stringency" of the hybridization. Stringency conditions can be adjusted to screen for moderately similar fragments, such as homologous fragments from distantly related organisms, through to highly similar fragments, such as genes that duplicate functional enzymes from closely related organisms. Post-hybridization washes determine the stringency conditions. One set of conditions uses a series of washes starting with 6X SSC, 0.5% SDS at room temperature for 15 min, followed by a repeat wash with 2X SSC, 0.5% SDS at 45°C for 30 min, and then two washes with 0.2X SSC, 0.5% SDS at 50°C for 30 min each. For more stringent conditions, the washes are performed at higher temperatures: the conditions are identical to those above, except that the temperature of the final two 30-min washes in 0.2X SSC, 0.5% SDS is increased to 60°C. Another set of highly stringent conditions uses two final washes in 0.1X SSC, 0.1% SDS at 65°C. A further set of highly stringent conditions is hybridization at 0.1X SSC, 0.1% SDS, 65°C, followed by washes with 2X SSC, 0.1% SDS and then 0.1X SSC, 0.1% SDS.
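To make the relationship between the final wash conditions concrete, they can be written down as data and compared. The sketch below is illustrative only. It assumes the common rule of thumb that a wash is at least as stringent as another when its salt concentration is no higher and its temperature is no lower; the names are hypothetical and the comparison ignores every other variable of the protocol.

```python
from typing import NamedTuple


class FinalWash(NamedTuple):
    ssc_x: float        # SSC concentration as a multiple of 1X
    sds_percent: float  # SDS concentration in percent
    temp_c: float       # wash temperature in degrees Celsius


# Final wash conditions transcribed from the text above.
FIRST_SET = FinalWash(ssc_x=0.2, sds_percent=0.5, temp_c=50.0)
MORE_STRINGENT = FinalWash(ssc_x=0.2, sds_percent=0.5, temp_c=60.0)
HIGH_STRINGENCY = FinalWash(ssc_x=0.1, sds_percent=0.1, temp_c=65.0)


def at_least_as_stringent(a: FinalWash, b: FinalWash) -> bool:
    """Heuristic (assumption): lower salt and higher temperature raise stringency."""
    return a.ssc_x <= b.ssc_x and a.temp_c >= b.temp_c
```

Under this heuristic, the three sets of final washes rank in the order in which the text introduces them.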

The present invention relates to isolated nucleic acid molecules containing polynucleotide sequences that are completely complementary to any of the polynucleotide sequences described above. The term "complementary" is used to describe the relationship between nucleotide bases that are capable of hybridizing to each other. For example, when referring to DNA, adenine is complementary to thymine, and cytosine is complementary to guanine.
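The base-pairing rule stated above (A with T, C with G) is enough to compute the fully complementary strand of a DNA sequence. The following is a minimal sketch with hypothetical function names; it handles only the four unambiguous DNA bases and treats two strands as fully complementary when one equals the reverse complement of the other.

```python
# Translation table for the four unambiguous DNA bases (both cases).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


def is_fully_complementary(strand_a: str, strand_b: str) -> bool:
    """True if strand_b pairs with strand_a at every position."""
    return strand_a.upper() == reverse_complement(strand_b).upper()
```

For instance, `reverse_complement("ATGC")` gives `"GCAT"`, and the two strands are reported as fully complementary.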

In certain embodiments, the polynucleotide or nucleic acid is DNA. In the case of DNA, a nucleic acid molecule comprising a polynucleotide sequence that encodes a polypeptide normally includes a promoter and/or other transcription or translation control elements operably linked to one or more coding regions. An operable linkage exists when a coding region for a gene product, such as a polypeptide, is associated with one or more regulatory sequences in such a way as to place expression of the gene product under the influence or control of the regulatory sequences. Two DNA fragments (such as a polypeptide-coding region and a promoter associated with it) are "operably linked" if induction of promoter function results in transcription of mRNA encoding the desired gene product, and if the nature of the linkage between the two DNA fragments does not interfere with the ability of the expression regulatory sequences to direct expression of the gene product or with the ability of the DNA template to be transcribed. Thus, a promoter region is operably linked to a polynucleotide sequence encoding a polypeptide if the promoter is capable of effecting transcription of that polynucleotide sequence. The promoter can be a cell-specific promoter that directs substantial transcription of the DNA only in predetermined cells. In general, a coding region is located 3' to a promoter. Promoters can be derived in their entirety from a native gene, be composed of different elements derived from different promoters found in nature, or comprise synthetic DNA segments. Those skilled in the art understand that different promoters can direct gene expression in different tissues or cell types, at different stages of development, or in response to different environmental or physiological conditions. Promoters that cause a gene to be expressed in most cell types at most times are commonly referred to as "constitutive promoters." It is further recognized that, since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths can have identical promoter activity. A promoter is generally bounded at its 3' terminus by the transcription initiation site and extends upstream (5' direction) to include the minimum number of bases or elements necessary to initiate transcription at levels detectable above background. Within the promoter will be found a transcription initiation site (conveniently defined, for example, by mapping with nuclease S1), as well as protein binding domains (consensus sequences) responsible for the binding of RNA polymerase.

Suitable regulatory regions include nucleic acid regions located upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of a coding region, and which influence the transcription, RNA processing or stability, or translation of the associated coding region. Regulatory regions include promoters, translation leader sequences, RNA processing sites, effector binding sites, and stem-loop structures. In addition to promoters, other transcription control elements, such as enhancers, operators, repressors, and transcription termination signals, can be operably linked to the polynucleotide to direct cell-specific transcription. The boundaries of the coding sequence are defined by a translation start codon at the 5' (amino) terminus and a translation stop codon at the 3' (carboxyl) terminus. Coding regions include, but are not limited to, prokaryotic regions, cDNA from mRNA, genomic DNA molecules, synthetic DNA molecules, or RNA molecules. If the coding region is expressed in eukaryotic cells, a polyadenylation signal and a transcription termination sequence are often present 3' of the coding region.
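The statement that a coding sequence is bounded by a start codon at its 5' end and a stop codon at its 3' end can be illustrated with a small scanner. This is a simplified sketch under stated assumptions: it uses the standard start codon ATG and the standard stop codons TAA, TAG, and TGA, looks only at the first ATG on the given strand, and ignores introns and alternative genetic codes. The function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def find_coding_region(dna: str, start_codon: str = "ATG"):
    """Return (start, end) of the first start-to-stop reading frame, or None.

    `start` is the index of the first base of the start codon and `end` is
    one past the last base of the in-frame stop codon.
    """
    seq = dna.upper()
    start = seq.find(start_codon)
    if start == -1:
        return None
    # Walk codon by codon, staying in the frame set by the start codon.
    for i in range(start + 3, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return start, i + 3
    return None
```

For the toy input `"CCATGAAATAGGG"`, the sketch reports the region covering `ATGAAATAG`.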

In certain aspects of the present invention, polynucleotide sequences that are at least 20 bases, at least 30 bases, or at least 50 bases long and that hybridize to a polynucleotide sequence of the present invention can be used as PCR primers. Typically, in PCR-type amplification techniques, the primers have different sequences and are not complementary to each other. Depending on the desired test conditions, the primer sequences should be designed to provide efficient and faithful replication of the target nucleic acid. Methods of PCR primer design are common and well known in the art. Generally, two short segments of the sequences can be used in polymerase chain reaction (PCR) protocols to amplify longer nucleic acid fragments encoding homologous genes from DNA or RNA. The polymerase chain reaction can also be performed on a library of cloned nucleic acid fragments, where the sequence of one primer is derived from the nucleic acid fragments and the sequence of the other primer takes advantage of the presence of the polyadenylic acid tracts at the 3' end of the mRNA precursor encoding microbial genes. Alternatively, the second primer sequence can be based on sequences derived from the cloning vector.
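The primer criteria mentioned above (a minimum length of 20 bases, two different sequences, and no complementarity between the two primers) can be expressed as a simple pre-check. The sketch below is illustrative and deliberately narrow: it only tests for full-length complementarity, whereas practical primer design also considers partial 3' overlap, melting temperature, and secondary structure. All names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    return sequence.upper().translate(_COMPLEMENT)[::-1]


def primer_pair_passes_basic_checks(forward: str, reverse: str,
                                    min_length: int = 20) -> bool:
    """Check length, distinctness, and absence of full complementarity."""
    fwd, rev = forward.upper(), reverse.upper()
    if len(fwd) < min_length or len(rev) < min_length:
        return False
    if fwd == rev:
        return False
    # Reject pairs that would anneal to each other along their whole length.
    return rev != reverse_complement(fwd)
```

A pair whose second primer is simply the reverse complement of the first is rejected, as is any primer shorter than the chosen minimum.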

In addition, specific primers can be designed and used to amplify a portion or the full length of the sequence. The resulting amplification products can be labeled directly during the amplification reaction, or labeled after the amplification reaction, and the isolated full-length DNA fragments can be detected under conditions of appropriate stringency.

Thus, the nucleic acid molecules of the present invention can be used to isolate genes encoding homologous proteins from the same or other species or bacterial species. Isolation of homologous genes using sequence-dependent protocols is well known in the art. Examples of sequence-dependent protocols include, but are not limited to, methods of nucleic acid hybridization and methods of DNA and RNA amplification, as exemplified by various uses of nucleic acid amplification technologies (e.g., polymerase chain reaction (PCR), Mullis et al., U.S. Patent No. 4,683,202; ligase chain reaction (LCR), Tabor, S. et al., Proc. Natl. Acad. Sci. USA 82:1074 (1985); or strand displacement amplification (SDA), Walker et al., Proc. Natl. Acad. Sci. U.S.A. 89:392 (1992)).

In some embodiments, the isolated nucleic acid molecules of the present invention are used to isolate homologous nucleic acid molecules from other organisms in order to identify PUFA synthases that produce similar or improved PUFA profiles. In some embodiments, the isolated nucleic acid molecules of the present invention are used to isolate homologous nucleic acid molecules from other organisms that are involved in producing large amounts of DHA.

The nucleic acid molecules of the present invention further comprise a polynucleotide sequence encoding a PUFA synthase gene, a PUFA synthase gene domain, or a PUFA synthase gene fragment fused in frame to a marker sequence that allows detection of the polypeptides of the present invention. Marker sequences include auxotrophic or dominant markers known to those of ordinary skill in the art, such as ZEO (zeocin), NEO (G418), hygromycin, arsenite, HPH, NAT, and the like.

The present invention also encompasses variants of PUFA synthase genes. Variants can contain alterations in the coding regions, the non-coding regions, or both. Examples of polynucleotide variants include those containing alterations that produce silent substitutions, additions, or deletions but do not alter the properties or activities of the encoded polypeptide. In certain embodiments, polynucleotide sequence variants are produced by silent substitutions due to the degeneracy of the genetic code. In some embodiments, polynucleotide sequence variants are produced for a variety of reasons, for example, to optimize codon usage for expression in a particular host (e.g., changing codons in the thraustochytrid sequence to those preferred by other organisms, such as E. coli or Saccharomyces cerevisiae).
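A silent substitution changes the nucleotide sequence without changing the encoded amino acids, which is what makes codon optimization for a different host possible. The sketch below checks this property for two short DNA sequences. It assumes the standard genetic code (organisms with variant codes would need a different table), uses hypothetical function names, and the example sequences are invented for illustration rather than taken from any SEQ ID NO.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence; trailing partial codons are ignored."""
    seq = dna.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def is_silent_variant(reference: str, variant: str) -> bool:
    """True if the DNA differs but the encoded amino acid sequence does not."""
    return reference.upper() != variant.upper() and translate(reference) == translate(variant)
```

For example, `ATGCTTGGA` and `ATGCTGGGC` differ at two nucleotides but both encode Met-Leu-Gly, so the second is a silent variant of the first.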

The present invention also provides allelic variants, orthologs, and/or species homologs. Using the sequence information disclosed herein, procedures known in the art can be used to obtain full-length genes, allelic variants, splice variants, full-length coding portions, orthologs, and/or species homologs of the genes described herein. For example, allelic variants and/or species homologs can be isolated and identified by making suitable probes or primers from the sequences provided herein and screening a suitable nucleic acid source for allelic variants and/or the desired homolog.

The present invention relates to a recombinant nucleic acid molecule comprising any of the above nucleic acid molecules or a combination thereof and a transcription control sequence. In some embodiments, the recombinant nucleic acid molecule is a recombinant vector.

The present invention relates to a method for producing a recombinant vector comprising inserting one or more of the isolated nucleic acid molecules described herein into a vector.

The vector of the present invention can be, for example, a cloning vector or an expression vector. The vector can be, for example, in the form of a plasmid, a viral particle, a phage, or the like.

The polynucleotide sequences of the present invention can be included in any expression vector for expressing a polypeptide. Such vectors include chromosomal, non-chromosomal, and synthetic DNA or RNA sequences, such as SV40 derivatives, bacterial plasmids, and yeast plasmids. However, any other suitable vector known to those of ordinary skill in the art can be used.

The appropriate DNA sequence can be inserted into the vector by a variety of procedures. In general, the DNA sequence is inserted into an appropriate restriction endonuclease cleavage site by methods known in the art. Such procedures and others are considered to be within the knowledge of those skilled in the art.

The present invention also includes recombinant constructs containing one or more of the above-mentioned polynucleotide sequences. The constructs include vectors, such as plasmids or viral vectors, into which one or more sequences of the present invention are inserted in either a forward or reverse orientation. In one aspect of the invention, the constructs also include regulatory sequences, including, for example, promoters operably linked to the sequences. A large number of suitable vectors and promoters are known to those skilled in the art and are commercially available.

Polypeptides

The present invention relates to isolated polypeptides comprising the amino acid sequences of PUFA synthase proteins and domains from the isolated microorganisms deposited under ATCC Accession Nos. PTA-9695 and PTA-10212.

As used herein, the term "polypeptide" includes both singular and plural forms, encompassing molecules composed of monomers (amino acids) linearly linked by amide bonds (also known as peptide bonds). The term "polypeptide" refers to any chain or chains of two or more amino acids, rather than a product of a specific length. Therefore, peptides, dipeptides, tripeptides, oligopeptides, "proteins," "amino acid chains," or any other terms used to refer to one or more chains of two or more amino acids are included in the definition of "polypeptide," and the term "polypeptide" can be used instead of, or interchangeably with, any of these terms. The term "polypeptide" also refers to the products of post-expression modifications of the polypeptide, including but not limited to glycosylation, acetylation, phosphorylation, amination, derivatization by known protecting/blocking groups, protease cleavage, or modification with non-natural amino acids.

The polypeptides described herein include, but are not limited to, fragments, variants, or derivative molecules. The terms "fragment," "variant," "derivative," and "analog," when referring to polypeptides, include any polypeptide that retains at least some biological activity. Polypeptide fragments can include any proteolytic fragments, deletion fragments, and fragments that more easily reach the site of action when delivered to an animal. Polypeptide fragments also include any portion of a polypeptide that comprises an antigenic or immunogenic epitope of the native polypeptide, including linear as well as three-dimensional epitopes. Polypeptide fragments can comprise variant regions, including the fragments described above, as well as polypeptides whose amino acid sequence is altered by amino acid substitutions, deletions, or insertions. Variants can occur naturally, such as allelic variants. By "allelic variant" is meant an alternative form of a gene occupying a given locus on a chromosome of an organism. Non-naturally occurring variants can be produced using mutagenesis techniques known in the art. Polypeptide fragments of the present invention can comprise conservative or non-conservative amino acid substitutions, deletions, or additions. Polypeptide variants are also referred to herein as "polypeptide analogs." Polypeptide fragments of the present invention include derivative molecules. As used herein, a "derivative" of a polypeptide or polypeptide fragment refers to a subject polypeptide having one or more residues chemically derivatized by reaction of a functional side group. "Derivatives" also include peptides that contain one or more naturally occurring amino acid derivatives of the 20 standard amino acids. For example, 4-hydroxyproline can be substituted for proline; 5-hydroxylysine can be substituted for lysine; 3-methylhistidine can be substituted for histidine; homoserine can be substituted for serine; and ornithine can be substituted for lysine.

The polypeptides of the present invention can be encoded by any nucleic acid molecule of the present invention.

The present invention relates to an isolated polypeptide comprising an amino acid sequence that is at least 80% identical to the amino acid sequence of Pfa1p (SEQ ID NO: 2 or SEQ ID NO: 69), Pfa2p (SEQ ID NO: 4 or SEQ ID NO: 71), Pfa3p (SEQ ID NO: 6 or SEQ ID NO: 73), and combinations thereof, wherein the polypeptide comprises one or more PUFA synthase activities.

The present invention relates to polypeptides comprising an amino acid sequence that is at least 80% identical to the amino acid sequence of one or more PUFA synthase domains of a PUFA synthase of the present invention.

In some embodiments, the invention relates to a polypeptide comprising an amino acid sequence that is at least 80% identical to an amino acid sequence in Pfa1p (SEQ ID NO: 2 or SEQ ID NO: 69) comprising one or more PUFA synthase domains. In some embodiments, the polypeptide comprises an amino acid sequence that is at least 80% identical to an amino acid sequence in Pfa1p (SEQ ID NO: 2 or SEQ ID NO: 69) comprising one or more PUFA synthase domains, such as a KS domain (SEQ ID NO: 8 or SEQ ID NO: 75), a MAT domain (SEQ ID NO: 10 or SEQ ID NO: 77), an ACP domain (such as any one of SEQ ID NOs: 14, 16, 18, 20, 22, 24, 81, 83, 85, 87, 89, 91, 93, 95, 97, or 99), a combination of two or more ACP domains, such as a combination of 2, 3, 4, 5, 6, 7, 8, 9, or 10 ACP domains, including tandem domains (SEQ ID NO: 12 or SEQ ID NO: 79, and portions thereof), a KR domain (SEQ ID NO: 26 or SEQ ID NO: 101), a DH domain (SEQ ID NO: 28 or SEQ ID NO: 119), and combinations thereof. In some embodiments, the polypeptide comprises two or more amino acid sequences, wherein each of the at least two or more amino acid sequences is 80% identical to an amino acid sequence in Pfa1p (SEQ ID NO: 2 or SEQ ID NO: 69) comprising one or more PUFA synthase domains. In some embodiments, the at least two or more amino acid sequences are 80% identical to the same amino acid sequence in Pfa1p (SEQ ID NO: 2 or SEQ ID NO: 69) comprising one or more PUFA synthase domains. In some embodiments, the at least two or more amino acid sequences are 80% identical to different amino acid sequences in Pfa1p (SEQ ID NO: 2 or SEQ ID NO: 69) comprising one or more PUFA synthase domains. In some embodiments, the at least two or more amino acid sequences are 80% identical to different amino acid sequences in Pfa1p (SEQ ID NO: 2 or SEQ ID NO: 69), and the at least two or more amino acid sequences are arranged in the polypeptide in the same or a different order compared to the order of the corresponding domains in Pfa1p (SEQ ID NO: 2 or SEQ ID NO: 69). In some embodiments, the at least two or more amino acid sequences are at least 80% identical to an amino acid sequence in Pfa1p (SEQ ID NO: 2 or SEQ ID NO: 69) comprising one or more PUFA synthase domains, such as a KS domain (SEQ ID NO: 8 or SEQ ID NO: 75), a MAT domain (SEQ ID NO: 10 or SEQ ID NO: 77), an ACP domain (such as any one of SEQ ID NOs: 14, 16, 18, 20, 22, 24, 81, 83, 85, 87, 89, 91, 93, 95, 97, or 99), a combination of two or more ACP domains, such as a combination of 2, 3, 4, 5, 6, 7, 8, 9, or 10 ACP domains, including tandem domains (SEQ ID NO: 12 or SEQ ID NO: 79, and portions thereof), a KR domain (SEQ ID NO: 26 or SEQ ID NO: 101), a DH domain (SEQ ID NO: 28 or SEQ ID NO: 119), and combinations thereof. In some embodiments, the polypeptide comprises one or more amino acid sequences in Pfa1p (SEQ ID NO: 2 or SEQ ID NO: 69) comprising one or more PUFA synthase domains, including one or more copies of any individual domain in combination with one or more copies of any other individual domain.

In some embodiments, the present invention relates to a polypeptide comprising an amino acid sequence that is at least 80% identical to an amino acid sequence in Pfa2p (SEQ ID NO: 4 or SEQ ID NO: 71) comprising one or more PUFA synthase domains. In some embodiments, the polypeptide comprises an amino acid sequence that is at least 80% identical to an amino acid sequence in Pfa2p (SEQ ID NO: 4 or SEQ ID NO: 71) comprising one or more PUFA synthase domains, such as a KS domain (SEQ ID NO: 30 or SEQ ID NO: 103), a CLF domain (SEQ ID NO: 32 or SEQ ID NO: 105), an AT domain (SEQ ID NO: 34 or SEQ ID NO: 107), an ER domain (SEQ ID NO: 36 or SEQ ID NO: 109), and combinations thereof. In some embodiments, the polypeptide comprises two or more amino acid sequences, wherein each of the at least two or more amino acid sequences is 80% identical to an amino acid sequence in Pfa2p (SEQ ID NO: 4 or SEQ ID NO: 71) comprising one or more PUFA synthase domains. In some embodiments, the at least two or more amino acid sequences are 80% identical to the same amino acid sequence in Pfa2p (SEQ ID NO: 4 or SEQ ID NO: 71). In some embodiments, the at least two or more amino acid sequences are 80% identical to different amino acid sequences in Pfa2p (SEQ ID NO: 4 or SEQ ID NO: 71), each comprising one or more PUFA synthase domains. In some embodiments, the at least two or more amino acid sequences are 80% identical to different amino acid sequences in Pfa2p (SEQ ID NO: 4 or SEQ ID NO: 71), and the at least two or more amino acid sequences are arranged in the polypeptide in the same or a different order compared to the order of the corresponding domains in Pfa2p (SEQ ID NO: 4 or SEQ ID NO: 71). In some embodiments, the at least two or more amino acid sequences are 80% identical to an amino acid sequence in Pfa2p (SEQ ID NO: 4 or SEQ ID NO: 71) comprising one or more PUFA synthase domains, such as a KS domain (SEQ ID NO: 30 or SEQ ID NO: 103), a CLF domain (SEQ ID NO: 32 or SEQ ID NO: 105), an AT domain (SEQ ID NO: 34 or SEQ ID NO: 107), an ER domain (SEQ ID NO: 36 or SEQ ID NO: 109), and combinations thereof. In some embodiments, the polypeptide comprises one or more amino acid sequences in Pfa2p (SEQ ID NO: 4 or SEQ ID NO: 71) comprising one or more PUFA synthase domains, including one or more copies of any individual domain in combination with one or more copies of any other individual domain.

In some embodiments, the present invention relates to a polypeptide comprising an amino acid sequence that is at least 80% identical to an amino acid sequence in Pfa3p (SEQ ID NO: 6 or SEQ ID NO: 73) comprising one or more PUFA synthase domains. In some embodiments, the polypeptide comprises an amino acid sequence that is at least 80% identical to an amino acid sequence in Pfa3p (SEQ ID NO: 6 or SEQ ID NO: 73) comprising one or more PUFA synthase domains, such as a DH domain (SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 111, or SEQ ID NO: 113), an ER domain (SEQ ID NO: 42 or SEQ ID NO: 115), and combinations thereof. In some embodiments, the polypeptide comprises two or more amino acid sequences, wherein each of the at least two or more amino acid sequences is 80% identical to an amino acid sequence in Pfa3p (SEQ ID NO: 6 or SEQ ID NO: 73) comprising one or more PUFA synthase domains. In some embodiments, the at least two or more amino acid sequences are 80% identical to the same amino acid sequence in Pfa3p (SEQ ID NO: 6 or SEQ ID NO: 73) comprising one or more PUFA synthase domains. In some embodiments, the at least two or more amino acid sequences are 80% identical to different amino acid sequences in Pfa3p (SEQ ID NO: 6 or SEQ ID NO: 73), each comprising one or more PUFA synthase domains. In some embodiments, the at least two or more amino acid sequences are 80% identical to different amino acid sequences in Pfa3p (SEQ ID NO: 6 or SEQ ID NO: 73), and the at least two or more amino acid sequences are arranged in the polypeptide in the same or a different order compared to the order of the corresponding domains in Pfa3p (SEQ ID NO: 6 or SEQ ID NO: 73). In some embodiments, the at least two or more amino acid sequences are 80% identical to an amino acid sequence in Pfa3p (SEQ ID NO: 6 or SEQ ID NO: 73) comprising one or more PUFA synthase domains, such as a DH domain (SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 111, or SEQ ID NO: 113), an ER domain (SEQ ID NO: 42 or SEQ ID NO: 115), and combinations thereof. In some embodiments, the polypeptide comprises one or more amino acid sequences in Pfa3p (SEQ ID NO: 6 or SEQ ID NO: 73) comprising one or more PUFA synthase domains, including one or more copies of any individual domain in combination with one or more copies of any other individual domain.
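Because the three preceding paragraphs repeat the same domain-to-sequence assignments several times, it can help to see them once as a lookup table. The sketch below is only a reading aid: the dictionary layout and function name are hypothetical, and the SEQ ID NO values are transcribed from the first listing of each domain in the paragraphs above rather than from the sequence listing itself.

```python
# Subunit -> domain -> SEQ ID NOs, transcribed from the text above.
PUFA_SYNTHASE_DOMAINS = {
    "Pfa1p": {
        "full_length": (2, 69),
        "KS": (8, 75),
        "MAT": (10, 77),
        "ACP": (14, 16, 18, 20, 22, 24, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99),
        "ACP_tandem": (12, 79),
        "KR": (26, 101),
        "DH": (28, 119),
    },
    "Pfa2p": {
        "full_length": (4, 71),
        "KS": (30, 103),
        "CLF": (32, 105),
        "AT": (34, 107),
        "ER": (36, 109),
    },
    "Pfa3p": {
        "full_length": (6, 73),
        "DH": (38, 40, 111, 113),
        "ER": (42, 115),
    },
}


def subunits_with_domain(domain: str) -> list:
    """Names of the subunits for which the text lists the given domain type."""
    return [
        subunit
        for subunit, domains in PUFA_SYNTHASE_DOMAINS.items()
        if domain in domains
    ]
```

For instance, `subunits_with_domain("ER")` returns the two subunits for which an ER domain is listed, and `subunits_with_domain("KS")` does the same for the KS domain.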

In some embodiments, the invention relates to a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 2 or SEQ ID NO: 69, wherein the polypeptide comprises a PUFA synthase activity selected from the group consisting of KS activity, MAT activity, ACP activity, KR activity, DH activity, and combinations thereof.

In some embodiments, the invention relates to a polypeptide comprising an amino acid sequence that is at least 80% identical to SEQ ID NO: 8 or SEQ ID NO: 75, wherein the polypeptide comprises KS activity.

In some embodiments, the invention relates to a polypeptide comprising an amino acid sequence that is at least 80% identical to SEQ ID NO: 10 or SEQ ID NO: 77, wherein the polypeptide comprises MAT activity.

In some embodiments, the invention relates to a polypeptide comprising an amino acid sequence that is at least 80% identical to any one of SEQ ID NOs: 14, 16, 18, 20, 22, 24, 81, 83, 87, 89, 91, 93, 95, 97, or 99, wherein the polypeptide comprises ACP activity.

In some embodiments, the invention relates to a polypeptide comprising an amino acid sequence that is at least 80% identical to SEQ ID NO: 12 or SEQ ID NO: 79, wherein the polypeptide comprises ACP activity.

In some embodiments, the invention relates to a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 12, wherein the polypeptide comprises ACP activity. In some embodiments, the amino acid sequence is at least 80% identical to an amino acid sequence within SEQ ID NO: 12 comprising one, two, three, four, five, or six ACP domains, wherein the polypeptide comprises ACP activity associated with one or more ACP domains. SEQ ID NOs: 14, 16, 18, 20, 22, and 24 are representative amino acid sequences within SEQ ID NO: 12 that each comprise a single ACP domain.

In some embodiments, the invention relates to a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 79, wherein the polypeptide comprises ACP activity. In some embodiments, the amino acid sequence is at least 80% identical to an amino acid sequence within SEQ ID NO: 79 comprising one, two, three, four, five, six, seven, eight, nine, or ten ACP domains, wherein the polypeptide comprises ACP activity associated with one or more ACP domains. SEQ ID NOs: 81, 83, 85, 87, 89, 91, 93, 95, 97, and 99 are representative amino acid sequences within SEQ ID NO: 79 that each comprise a single ACP domain.

In some embodiments, the invention relates to a polypeptide comprising an amino acid sequence that is at least 80% identical to SEQ ID NO: 26 or SEQ ID NO: 101, wherein the polypeptide comprises KR activity.

In some embodiments, the invention relates to a polypeptide comprising an amino acid sequence that is at least 80% identical to SEQ ID NO: 28 or SEQ ID NO: 119, wherein the polypeptide comprises DH activity.

In some embodiments, the invention relates to a polypeptide comprising an amino acid sequence that is at least 80% identical to SEQ ID NO: 4 or SEQ ID NO: 71, wherein the polypeptide comprises a PUFA synthase activity selected from the group consisting of KS activity, CLF activity, AT activity, ER activity, and combinations thereof.

In some embodiments, the invention relates to a polypeptide comprising an amino acid sequence that is at least 80% identical to SEQ ID NO: 30 or SEQ ID NO: 103, wherein the polypeptide comprises KS activity.

In some embodiments, the invention relates to a polypeptide comprising an amino acid sequence that is at least 80% identical to SEQ ID NO: 32 or SEQ ID NO: 105, wherein the polypeptide comprises CLF activity.

In some embodiments, the invention relates to a polypeptide comprising an amino acid sequence that is at least 80% identical to SEQ ID NO: 34 or SEQ ID NO: 107, wherein the polypeptide comprises AT activity.

In some embodiments, the invention relates to a polypeptide comprising an amino acid sequence that is at least 80% identical to SEQ ID NO: 36 or SEQ ID NO: 109, wherein the polypeptide comprises ER activity.

In some embodiments, the invention relates to a polypeptide comprising an amino acid sequence at least 80% identical to SEQ ID NO: 6 or SEQ ID NO: 73, wherein the polypeptide comprises a PUFA synthase activity selected from the group consisting of DH activity, ER activity, and combinations thereof.

In some embodiments, the invention relates to a polypeptide comprising an amino acid sequence that is at least 80% identical to SEQ ID NO: 38, wherein the polypeptide comprises DH activity.

In some embodiments, the invention relates to a polypeptide comprising an amino acid sequence that is at least 80% identical to SEQ ID NO: 40, wherein the polypeptide comprises DH activity.

In some embodiments, the invention relates to a polypeptide comprising an amino acid sequence that is at least 80% identical to SEQ ID NO: 111, wherein the polypeptide comprises DH activity.

In some embodiments, the invention relates to a polypeptide comprising an amino acid sequence that is at least 80% identical to SEQ ID NO: 113, wherein the polypeptide comprises DH activity.

In some embodiments, the invention relates to a polypeptide comprising an amino acid sequence that is at least 80% identical to SEQ ID NO: 42 or SEQ ID NO: 115, wherein the polypeptide comprises ER activity.

In some embodiments, the polypeptide comprises an amino acid sequence that is at least about 80%, 85%, or 90% identical, or at least about 95%, 96%, 97%, 98%, 99%, or 100% identical, to an amino acid sequence reported herein.

A polypeptide having an amino acid sequence at least, for example, 95% "identical" to a query amino acid sequence means that the amino acid sequence of the subject polypeptide is identical to the query sequence, except that the subject polypeptide sequence may include up to five amino acid changes per 100 amino acids of the query amino acid sequence. In other words, to obtain a polypeptide comprising an amino acid sequence at least 95% identical to a query amino acid sequence, up to 5% of the amino acids in the subject sequence may be inserted, deleted (indels), or substituted with another amino acid. These changes relative to the reference sequence can occur at the amino or carboxy terminus of the reference amino acid sequence or anywhere between those termini, either interspersed individually among residues of the reference sequence or in one or more contiguous groups within the reference sequence.
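The counting rule in this definition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two sequences are taken to be already aligned (for example by ClustalW) with `-` marking gaps, every non-identical column is counted as one change, and the function names `count_changes` and `meets_identity` are hypothetical helpers, not part of any alignment program.

```python
def count_changes(aligned_query: str, aligned_subject: str) -> int:
    """Count alignment columns where the subject differs from the query.

    A substitution, an insertion (gap in the query) and a deletion
    (gap in the subject) each count as one change.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    return sum(1 for q, s in zip(aligned_query, aligned_subject) if q != s)


def meets_identity(aligned_query: str, aligned_subject: str, min_pct: int) -> bool:
    """True if the subject has at most (100 - min_pct) changes per 100 query residues."""
    query_len = sum(1 for q in aligned_query if q != "-")
    if query_len == 0:
        raise ValueError("query sequence is empty")
    changes = count_changes(aligned_query, aligned_subject)
    # Integer arithmetic avoids floating-point rounding at the threshold.
    return changes * 100 <= query_len * (100 - min_pct)


# One substitution in a 10-residue query: 90% identical, but not 95%.
query = "MKTAYIAKQR"
subject = "MKTAYIAKQK"
print(meets_identity(query, subject, 90), meets_identity(query, subject, 95))
```

Other conventions for the denominator exist (alignment length, shorter sequence length), so reported values can differ between programs; the query-length convention above simply follows the wording of the definition.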

As a practical matter, whether any particular polypeptide has an amino acid sequence that is at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical to an amino acid sequence of the invention can be determined conventionally using known computer programs. As described above, the best overall match between a query sequence (a sequence of the invention) and a subject sequence can be determined by aligning the sequences and calculating an identity score. Alignments are performed with the computer program AlignX, a component of the Vector NTI suite from Invitrogen (www.invitrogen.com), using a ClustalW alignment (J. Thompson et al., Nucleic Acids Res. 22(22): 4673-4680 (1994)).

The default scoring matrix Blosum62mt2 is used. The default gap opening penalty is 10 and the default gap extension penalty is 0.1.

In other aspects of the invention, nucleic acid molecules having a polynucleotide sequence that is at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical to a polynucleotide sequence disclosed herein encode a polypeptide having one or more PUFA synthase activities. Polypeptides having one or more PUFA synthase activities exhibit one or more activities that are similar, but not necessarily identical, to one or more activities of a PUFA synthase of the invention.

Of course, due to the degeneracy of the genetic code, one of ordinary skill in the art will immediately recognize that most nucleic acid molecules having a polynucleotide sequence at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical to a polynucleotide sequence described herein will encode a polypeptide "having PUFA synthase functional activity." In fact, because degenerate variants of any of these polynucleotides all encode the same polypeptide, in many instances one skilled in the art can predict which polypeptides will exhibit activity based on knowledge of conservative substitutions and of conserved functional domains. In certain aspects of the invention, the polypeptides and polynucleotides of the invention are provided in isolated form, e.g., purified to homogeneity. Alternatively, the polypeptides and polynucleotides of the invention can be produced synthetically by conventional synthesizers.
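The effect of codon degeneracy can be illustrated with a short Python sketch. It assumes the standard genetic code (NCBI translation table 1); the two nine-nucleotide sequences are made-up toy inputs rather than sequences from this disclosure, and `translate` is a hypothetical helper.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product("TCAG", repeat=3), _AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon ('*' marks a stop codon)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two toy sequences that differ at two wobble positions.
variant_a = "ATGGCTAAA"
variant_b = "ATGGCCAAG"
print(translate(variant_a), translate(variant_b))
```

Here the two polynucleotides are only seven-ninths identical at the nucleotide level, yet both encode the same tripeptide, which is why a polynucleotide identity threshold below 100% does not by itself imply any change in the encoded polypeptide.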

As known in the art, "similarity" between two polypeptides is determined by comparing the amino acid sequence of one polypeptide, together with its conservative amino acid substitutions, to the sequence of a second polypeptide.
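A minimal sketch of how similarity differs from identity follows. The grouping of amino acids into conservative-substitution classes below is an assumption chosen for illustration (several groupings are in common use, and scoring-matrix approaches are an alternative); the function name `identity_and_similarity` is hypothetical.

```python
# Illustrative conservative-substitution groups (an assumption, not a standard).
CONSERVATIVE_GROUPS = ["AVLIM", "FYW", "KRH", "DE", "ST", "NQ"]
_GROUP_OF = {aa: i for i, group in enumerate(CONSERVATIVE_GROUPS) for aa in group}


def identity_and_similarity(aligned_a: str, aligned_b: str) -> tuple[float, float]:
    """Return (percent identity, percent similarity) over all alignment columns.

    A column counts toward similarity if the residues are identical or
    belong to the same conservative-substitution group.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    identical = similar = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gap columns count toward neither score
        if a == b:
            identical += 1
            similar += 1
        elif a in _GROUP_OF and _GROUP_OF[a] == _GROUP_OF.get(b):
            similar += 1
    columns = len(aligned_a)
    return 100.0 * identical / columns, 100.0 * similar / columns


print(identity_and_similarity("MKVLD", "MRILE"))
```

In this toy pair only two of five positions are identical, but the remaining three differ by conservative substitutions, so the similarity score is higher than the identity score.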

In some embodiments, the polypeptides of the present invention are fusion polypeptides.

As used herein, "fusion polypeptide" refers to a polypeptide comprising a first polypeptide linearly connected to a second polypeptide by a peptide bond. The first polypeptide and the second polypeptide may be the same or different, and may be connected directly or through a peptide linker. The terms "linked," "fused," and "fusion" are used interchangeably herein. These terms refer to the joining together of two or more elements or components by some means, such as chemical conjugation or recombination. An "in-frame fusion" refers to the joining of two or more open reading frames to form a continuous, longer open reading frame in a manner that maintains the correct reading frame of the original open reading frames. The resulting recombinant fusion protein is thus a single protein containing two or more segments that correspond to the polypeptides encoded by the original open reading frames (segments that are not normally so joined in nature). Although the reading frame is made continuous throughout the fused segments, the segments can still be physically or spatially separated by, for example, an in-frame linker sequence. A "linker" sequence is a series of one or more amino acids separating two polypeptide coding regions in a fusion protein.
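The reading-frame requirement for an in-frame fusion can be sketched as follows. This is an illustration only: the function `fuse_in_frame` and the toy sequences are hypothetical, the standard stop codons are assumed, and a real construct would also need promoters, restriction sites, and other elements not modeled here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def _codons(seq: str) -> list[str]:
    """Split a sequence into consecutive three-nucleotide codons."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def fuse_in_frame(first_orf: str, second_orf: str, linker: str = "") -> str:
    """Join two open reading frames, optionally through a linker, keeping the frame.

    The stop codon of the first ORF (if present) is removed so that
    translation continues through the linker into the second ORF.
    """
    parts = {"first_orf": first_orf, "linker": linker, "second_orf": second_orf}
    for name, seq in parts.items():
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} length is not a multiple of 3; frame would shift")
    first = _codons(first_orf.upper())
    if first and first[-1] in STOP_CODONS:
        first = first[:-1]
    fused = first + _codons(linker.upper()) + _codons(second_orf.upper())
    # Any stop codon before the last codon would truncate the fusion protein.
    if any(codon in STOP_CODONS for codon in fused[:-1]):
        raise ValueError("premature stop codon inside the fused reading frame")
    return "".join(fused)


# Toy example: two short ORFs joined through a 15-nucleotide (five-codon) linker.
fusion = fuse_in_frame("ATGGCTAAATAA", "ATGGATGAATAA", linker="GGTGGCGGTGGCTCT")
print(fusion, len(fusion) % 3)
```

A linker whose length is not a multiple of three is rejected because it would shift the second open reading frame out of frame, which is the situation the definition of "in-frame fusion" excludes.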

The present invention relates to compositions comprising one or more polypeptides of the present invention and a biologically acceptable carrier.

In some embodiments, the composition comprises a biologically acceptable "excipient," wherein the excipient, which includes carriers, is a component or mixture of components used in a composition of the present invention to give the composition desirable characteristics. "Biologically acceptable" refers to a compound, material, composition, salt, and/or dosage form that is, within the scope of sound medical judgment, suitable for contact with the tissues of living cells without excessive toxicity, irritation, inflammatory response, or other problematic complications over the desired duration of contact, commensurate with a reasonable benefit/risk ratio. Various excipients can be used. In some embodiments, the excipient can be, but is not limited to, an alkaline agent, a stabilizer, an antioxidant, an adhesion agent, a separating agent, a coating agent, an exterior phase component, a controlled-release component, a solvent, a surfactant, a humectant, a buffering agent, a filler, an emollient, or combinations thereof. In addition to the excipients described herein, excipients can also include, but are not limited to, those listed in Remington: The Science and Practice of Pharmacy, 21st ed. (2005). Inclusion of an excipient in a particular classification herein (e.g., "solvent") is intended to illustrate rather than limit the role of the excipient. A particular excipient can fall within multiple classifications.

The present invention also relates to fragments, variants, derivatives, or analogs of any of the polypeptides disclosed herein.

The polypeptide of the present invention can be a recombinant polypeptide, a natural polypeptide, or a synthetic polypeptide.

Host Cells

The present invention relates to host cells expressing any of the nucleic acid molecules described above, any of the recombinant nucleic acid molecules described above, and combinations thereof.

As used herein, the term "expression" refers to the process by which a gene produces a biochemical, for example an RNA or a polypeptide. The process includes any manifestation of the functional presence of the gene within the cell, including, but not limited to, gene knockout as well as both transient expression and stable expression. It includes, but is not limited to, transcription of the gene into messenger RNA (mRNA), transfer RNA (tRNA), small hairpin RNA (shRNA), small interfering RNA (siRNA), or any other RNA product, and the translation of such mRNA into a polypeptide. If the final desired product is a biochemical, expression includes the creation of that biochemical and any precursors.

To produce one or more desired polyunsaturated fatty acids, a host cell can be genetically modified to introduce a PUFA synthase system of the present invention into the host cell.

When an organism is genetically modified to express a PUFA synthase system according to the present invention, some host organisms may endogenously express the accessory proteins that are needed to work with the PUFA synthase system to produce PUFAs. However, even when an organism endogenously produces a homologous accessory protein, it may be necessary to transform the organism with nucleic acid molecules encoding one or more accessory proteins in order to enable or enhance production of PUFAs by the organism. Some heterologous accessory proteins may operate more effectively or efficiently with the transformed PUFA synthase proteins than the host cell's endogenous accessory proteins.

Accessory proteins are defined herein as proteins that are not part of the core PUFA synthase system (i.e., not part of the PUFA synthase enzyme complex itself), but that may be necessary for PUFA production, or for efficient PUFA production, using the core PUFA synthase enzyme complex of the present invention. For example, to produce PUFAs, a PUFA synthase system must work with an accessory protein that transfers a 4'-phosphopantetheinyl moiety from coenzyme A to the acyl carrier protein (ACP) domains. Therefore, a PUFA synthase system can be considered to include at least one 4'-phosphopantetheinyl transferase (PPTase) domain, or such a domain can be considered an accessory domain or protein of the PUFA synthase system. Structural and functional characteristics of PPTases are described in U.S. Patent Application Publication Nos. 2002/0194641, 2004/0235127, and 2005/0100995.

A domain or protein having 4'-phosphopantetheinyl transferase (PPTase) biological activity (function) is characterized as an enzyme that transfers a 4'-phosphopantetheinyl moiety from coenzyme A to an acyl carrier protein (ACP). This transfer to an invariant serine residue of the ACP activates the inactive apo-form to the active holo-form. In both polyketide and fatty acid synthesis, the phosphopantetheine group forms thioesters with the growing acyl chains. The PPTases are a family of enzymes that have been well characterized in fatty acid synthesis, polyketide synthesis, and non-ribosomal peptide synthesis. The sequences of many PPTases are known, crystal structures have been determined (e.g., Reuter K., et al., EMBO J. 18(23):6823-31 (1999)), and mutational analysis has identified amino acid residues that are critical for activity (Mofid M.R. et al., Biochemistry 43(14):4128-36 (2004)).

One heterologous PPTase previously shown to recognize Schizochytrium ACP domains as substrates is the Het I protein of Nostoc sp. PCC 7120 (formerly called Anabaena sp. PCC 7120). Het I is present in a cluster of genes in Nostoc known to be responsible for the synthesis of long-chain hydroxy fatty acids that are a component of the glycolipid layer present in the heterocysts of that organism (Black and Wolk, J. Bacteriol. 176:2282-2292 (1994); Campbell et al., Arch. Microbiol. 167:251-258 (1997)). Het I is likely to activate the ACP domains of Hgl E, a protein present in that cluster. Sequences and constructs containing Het I are described, for example, in U.S. Patent Application Publication No. 2007/0244192, which is incorporated by reference herein in its entirety.

Another heterologous PPTase previously demonstrated to recognize Schizochytrium ACP domains is Sfp, derived from Bacillus subtilis. Sfp has been well characterized and is widely used because of its ability to recognize a broad range of substrates. Based on published sequence information (Nakana et al., Molecular and General Genetics 232:313-321 (1992)), an expression vector for Sfp was previously produced by cloning the coding region, along with defined upstream and downstream flanking DNA sequences, into a pACYC-184 cloning vector. This construct encodes a functional PPTase, as demonstrated by its ability, when co-expressed with Schizochytrium Orfs in E. coli, to result in the accumulation of DHA in those cells under appropriate conditions (see U.S. Application Publication No. 2004/0235127, incorporated by reference herein in its entirety).

Host cells include microbial cells, animal cells, plant cells, and insect cells. Representative examples of suitable hosts include bacterial cells; thermophilic or mesophilic bacteria; marine bacteria; thraustochytrids; fungal cells such as yeast; plant cells; insect cells; and isolated animal cells. Host cells can be either untransfected cells or cells that are already transfected with at least one other recombinant nucleic acid molecule. Host cells can also include transgenic cells that have been engineered to express a PUFA synthase. Those skilled in the art will be able to select a suitable host from the teachings herein.

Host cells include any microorganism of the order Thraustochytriales, such as microorganisms of genera including, but not limited to, Thraustochytrium, Labyrinthuloides, Japonochytrium, and Schizochytrium. Species within these genera include, but are not limited to: any Schizochytrium species, including Schizochytrium aggregatum, Schizochytrium limacinum, and Schizochytrium minutum; any Thraustochytrium species (including former Ulkenia species such as U. visurgensis, U. amoeboida, U. sarkariana, U. profunda, U. radiata, U. minuta, and Ulkenia sp. BP-5601), including Thraustochytrium striatum, Thraustochytrium aureum, and Thraustochytrium roseum; and any Japonochytrium species. Thraustochytrid strains include, but are not limited to: Schizochytrium sp. (S31) (ATCC 20888), Schizochytrium sp. (S8) (ATCC 20889), Schizochytrium sp. (LC-RM) (ATCC 18915), Schizochytrium sp. (SR21), Schizochytrium aggregatum (Goldstein et Belsky) (ATCC 28209), Schizochytrium limacinum (Honda et Yokochi) (IFO 32693), Thraustochytrium sp. (23B) (ATCC 20891), Thraustochytrium striatum (Schneider) (ATCC 24473), Thraustochytrium aureum (Goldstein) (ATCC 34304), Thraustochytrium roseum (Goldstein) (ATCC 28210), and Japonochytrium sp. (L1) (ATCC 28207). Other examples of host microorganisms suitable for genetic modification include, but are not limited to, yeast, including Saccharomyces cerevisiae, Saccharomyces carlsbergensis, or other yeast such as Candida and Kluyveromyces, and other fungi, for example filamentous fungi such as Aspergillus, Neurospora, and Penicillium. Bacterial cells can also be used as hosts, including Escherichia coli, which can be useful in fermentation processes. Alternatively, a host such as a Lactobacillus species or a Bacillus species can be used.

Plant host cells include, but are not limited to, cells of any higher plant, including both dicotyledonous and monocotyledonous plants, and of consumable plants, including crop plants and plants used for their oils. Such plants can include, for example: canola, soybean, rapeseed, linseed, corn, safflower, sunflower, and tobacco. Other preferred plants include plants known to produce compounds used as pharmaceutical agents, flavoring agents, nutraceutical agents, functional food ingredients, or cosmetically active agents, or plants genetically engineered to produce such compounds or agents. Thus, any plant species or plant cell can be selected. Plants and plant cells, and the plants from which they are derived, include, but are not limited to, plants and plant cells obtainable from: canola (Brassica rapa L.); the canola cultivars NQC02CNX12 (ATCC PTA-6011), NQC02CNX21 (ATCC PTA-6644), and NQC02CNX25 (ATCC PTA-6012), as well as cultivars, breeding cultivars, and plant parts derived from canola cultivars NQC02CNX12, NQC02CNX21, and NQC02CNX25 (see U.S. Patents 7,355,100, 7,456,340, and 7,348,473, respectively); soybean (Glycine max); rapeseed (Brassica spp.); linseed/flax (Linum usitatissimum); maize (corn) (Zea mays); safflower (Carthamus tinctorius); sunflower (Helianthus annuus); tobacco (Nicotiana tabacum); Arabidopsis thaliana; Brazil nut (Bertholletia excelsa); castor bean (Ricinus communis); coconut (Cocos nucifera); coriander (Coriandrum sativum); cotton (Gossypium spp.); groundnut (Arachis hypogaea); jojoba (Simmondsia chinensis); mustard (Brassica spp. and Sinapis alba); oil palm (Elaeis guineensis); olive (Olea europaea); rice (Oryza sativa); squash (Cucurbita maxima); barley (Hordeum vulgare); wheat (Triticum aestivum); and duckweed (Lemnaceae sp.). Plant lines from these and other plants can be produced, selected, or optimized for a desirable trait, such as, or associated with, but not limited to, seed yield, lodging resistance, emergence, disease resistance or tolerance, maturity, late-season plant intactness, plant height, shattering resistance, ease of plant transformation, oil content, or oil profile. Plant lines can be selected through plant breeding, such as pedigree breeding, recurrent selection breeding, intercross and backcross breeding, as well as through methods such as marker-assisted breeding and tilling. See, for example, U.S. Patent No. 7,348,473.

Animal cells include any isolated animal cell.

The present invention relates to host cells that express one or more nucleic acid molecules or recombinant nucleic acid molecules, including vectors, of the present invention.

The present invention relates to a method for making a recombinant host cell, comprising introducing a recombinant vector into a host cell.

Host cells can be genetically engineered (transduced, transformed, or transfected) with the vectors of the present invention, which can be, for example, a cloning vector or an expression vector. The vector can be, for example, in the form of a plasmid, a viral particle, a phage, and the like. A vector containing a polynucleotide sequence described herein, together with an appropriate promoter or control sequence, can be used to transform an appropriate host so that the host expresses the polypeptide encoded by the polynucleotide sequence. Genetic modification of the host cell can also include optimization of genes for preferred or optimal host codon usage.
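One simple way to picture optimization for host codon usage is sketched below: the most frequent codon for each amino acid is tallied from a set of host coding sequences, and a protein is then back-translated using those codons. This is a deliberately naive illustration under stated assumptions; the helper names `preferred_codons` and `recode` are hypothetical, the host sequence is a toy input, the standard genetic code is assumed, and practical codon optimization typically weighs additional factors (for example GC content and mRNA secondary structure) that are not modeled here.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product("TCAG", repeat=3), _AMINO_ACIDS)
}


def preferred_codons(host_coding_sequences: list[str]) -> dict[str, str]:
    """Map each amino acid (and '*') to its most frequent codon in the host sequences."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for cds in host_coding_sequences:
        cds = cds.upper()
        usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = cds[i:i + 3]
            counts[CODON_TABLE[codon]][codon] += 1
    return {aa: tally.most_common(1)[0][0] for aa, tally in counts.items()}


def recode(protein: str, preferred: dict[str, str]) -> str:
    """Back-translate a protein using the preferred codon for each residue."""
    return "".join(preferred[aa] for aa in protein)


# Toy host sequence: GCC is used more often than GCT, and AAG more often than AAA.
table = preferred_codons(["ATGGCCGCCGCTAAGAAGAAATAA"])
print(recode("MAK*", table))
```

The recoded sequence encodes the same polypeptide as the input, so the change affects only which synonymous codons the host must read.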

The engineered host cells can be cultured in conventional nutrient media modified as appropriate for activating promoters, selecting transformants, or amplifying the genes of the present invention. The culture conditions, such as temperature, pH, and the like, are those previously used with the host cell selected for expression, and are known to those of ordinary skill in the art.

In some embodiments, the present invention relates to genetically modifying a plant or plant part to express a PUFA synthase system described herein, which includes at least a core PUFA synthase enzyme complex. A "part of a plant" or "plant part," as defined herein, includes any part of a plant, such as, but not limited to, seeds (immature or mature), oils, pollen, embryos, flowers, fruits, shoots, leaves, roots, stems, explants, and the like. In some embodiments, the genetically modified plant or plant part produces one or more PUFAs, such as EPA, DHA, DPA (n-3 or n-6), ARA, GLA, SDA, other PUFAs, and combinations thereof. Plants are not known to contain an endogenous PUFA synthase system; therefore, the PUFA synthase systems of the present invention can be used to engineer plants with unique fatty acid production capabilities. In further embodiments, the plant or plant part is further genetically modified to express at least one PUFA synthase accessory protein (e.g., a PPTase). In some embodiments, the plant is an oilseed plant, wherein the oilseeds and/or the oil in the oilseeds contain PUFAs produced by the PUFA synthase system. In some embodiments, the genetically modified plants, plant parts, oilseeds, and/or oils in the oilseeds contain a detectable amount of at least one PUFA that is the product of the PUFA synthase system. In further embodiments, such plants, plant parts, oilseeds, and/or oils in the oilseeds are substantially free of intermediates or byproducts that are neither the primary PUFA products of the introduced PUFA synthase system nor naturally produced by the endogenous FAS system of the wild-type plant. Although wild-type plants produce some shorter- or medium-chain PUFAs, such as 18-carbon PUFAs, via the FAS system, new or additional PUFAs will be produced in the plant, plant parts, oilseeds, and/or oils in the oilseeds as a result of genetic modification with a PUFA synthase system described herein.

Plants can be genetically modified using classical strain development and/or molecular genetic techniques. See U.S. Application Publication No. 2007/0244192. Methods for producing transgenic plants, in which a recombinant nucleic acid molecule encoding a desired amino acid sequence is inserted into the plant genome, are known in the art. For example, transgenic plants can be produced using viral vectors, such as by transforming monocots with viral vectors as described in U.S. Patent Nos. 5,569,597; 5,589,367; and 5,316,931. Methods for genetically engineering or modifying plants by transformation, including both biological and physical transformation protocols, are well known in the art. See, for example, B.L. Miki et al., "Procedures for Introducing Foreign DNA into Plants," in METHODS IN PLANT MOLECULAR BIOLOGY AND BIOTECHNOLOGY 67-88 (Glick, B.R. and Thompson, J.E., eds., CRC Press, Inc., Boca Raton, 1993). In addition, vectors and in vitro culture methods for plant cell or tissue transformation and for plant regeneration are available. See, for example, M.Y. Gruber et al., "Vectors for Plant Transformation," in METHODS IN PLANT MOLECULAR BIOLOGY AND BIOTECHNOLOGY 89-119 (Glick, B.R. and Thompson, J.E., eds., CRC Press, Inc., Boca Raton, 1993).

A widely used method for introducing expression vectors into plants is based on the natural transformation system of Agrobacterium. See, for example, Horsch et al., Science 227:1229 (1985) and U.S. Patent No. 6,051,757. Agrobacterium tumefaciens and A. rhizogenes are plant pathogenic soil bacteria that can genetically transform plant cells. The Ti and Ri plasmids of A. tumefaciens and A. rhizogenes, respectively, carry the genes responsible for genetic transformation of plants. See, for example, Kado, C.I., Crit. Rev. Plant. Sci. 10:1 (1991). Descriptions of Agrobacterium vector systems and methods of Agrobacterium-mediated gene transfer are provided in several references, including Gruber et al., supra; Miki et al., supra; Moloney et al., Plant Cell Reports 8:238 (1989); U.S. Patent Nos. 5,177,010; 5,104,310; 5,149,645; 5,469,976; 5,464,763; 4,940,838; 4,693,976; 5,591,616; 5,231,019; 5,463,174; 4,762,785; 5,004,863; and 5,159,135; and European Patent Application Nos. 0131624, 120516, 159418, 176112, 116718, 290799, 320500, 604662, 627752, 0267159, and 0292435.

其它用于植物转化的方法包括微弹介导的转化，其中DNA携带于微弹的表面。用生物射弹设备将微粒加速到足以穿透植物细胞壁和细胞膜的速度，以便将表达载体引入植物组织。参见，Sanford等，Part.Sci.Technol.5：27(1987)，Sanford，J.C.，TrendsBiotech.6：299(1988)，Sanford，J.C.，Physiol.Plant 79：206(1990)，Klein等，Biotechnology 10：268(1992)以及美国专利号5,015,580和5,322,783。还描述了将遗传物质包被的微弹加速导入细胞的技术，如美国专利号4,945,050和5,141,141所述。借助物理途径将DNA递送给植物的另一种方法是超声处理靶细胞。参见例如，Zhang等，Bio/Technology 9：996(1991)。或者，利用脂质体或原生质球融合将表达载体引入植物中。参见例如Deshayes等，EMBO J.，4：2731(1985)，Christou等，Proc Natl.Acad.Sci.USA 84：3962(1987)。利用CaCl₂沉淀、DNA注射、聚乙烯醇或聚-L-鸟氨酸将DNA直接摄入原生质体已见诸报道。参见例如Hain等，Mol.Gen.Genet.199：161(1985)和Draper等，Plant CellPhysiol.23：451(1982)。对原生质体以及完整的细胞和组织进行电穿孔也已见诸报道。参见例如Donn等，第七届国际植物细胞和组织培养会议摘要(Abstracts of VIIthInternational Congress on Plant Cell and Tissue Culture)IAPTC，A2-38，第53页(1990)；D′Halluin等，Plant Cell 4：1495-1505(1992)和Spencer等，Plant Mol.Biol.24：51-61(1994)；国际申请公开号WO 87/06614、WO 92/09696和WO 93/21335；以及美国专利号5,472,869和5,384,253。其它转化技术包括晶须技术，参见例如，美国专利号5,302,523和5,464,765。Other methods for plant transformation include microprojectile-mediated transformation, in which the DNA is carried on the surface of the microprojectile. Biolistic devices are used to accelerate the microparticles to a velocity sufficient to penetrate plant cell walls and membranes in order to introduce the expression vector into plant tissue. See, Sanford et al., Part. Sci. Technol. 5:27 (1987), Sanford, JC, Trends Biotech. 6:299 (1988), Sanford, JC, Physiol. Plant 79:206 (1990), Klein et al., Biotechnology 10:268 (1992), and U.S. Patents 5,015,580 and 5,322,783. Techniques for accelerating genetic material-coated microprojectiles into cells have also been described, such as those described in U.S. Patents 4,945,050 and 5,141,141. Another method for delivering DNA to plants by physical means is sonication of target cells. See, for example, Zhang et al., Bio/Technology 9:996 (1991). Alternatively, expression vectors can be introduced into plants using liposomes or spheroplast fusion. 
See, for example, Deshayes et al., EMBO J. 4:2731 (1985) and Christou et al., Proc. Natl. Acad. Sci. USA 84:3962 (1987). Direct uptake of DNA into protoplasts using CaCl₂ precipitation, DNA injection, polyvinyl alcohol, or poly-L-ornithine has been reported. See, for example, Hain et al., Mol. Gen. Genet. 199:161 (1985) and Draper et al., Plant Cell Physiol. 23:451 (1982). Electroporation of protoplasts and of intact cells and tissues has also been reported. See, for example, Donn et al., Abstracts of VIIth International Congress on Plant Cell and Tissue Culture IAPTC, A2-38, p. 53 (1990); D'Halluin et al., Plant Cell 4:1495-1505 (1992) and Spencer et al., Plant Mol. Biol. 24:51-61 (1994); International Application Publication Nos. WO 87/06614, WO 92/09696, and WO 93/21335; and U.S. Patent Nos. 5,472,869 and 5,384,253. Other transformation techniques include the whisker technique; see, for example, U.S. Patent Nos. 5,302,523 and 5,464,765.

还可直接转化叶绿体或质体。因此，可产生重组植物，其中仅有叶绿体和质体DNA用上述任何核酸分子和重组核酸分子及其组合进行修饰。在叶绿体或质体中发挥功能的启动子为本领域所知。参见例如，Hanley-Bowden等，Trends in Biochemical Sciences 12：67-70(1987)。获得含插入异源DNA的叶绿体的细胞的方法和组合物可参见如美国专利号5,693,507和5,451,513。Chloroplasts or plastids can also be transformed directly. Thus, recombinant plants can be produced in which only chloroplast and plastid DNA is modified with any of the nucleic acid molecules and recombinant nucleic acid molecules described above, and combinations thereof. Promoters that function in chloroplasts or plastids are known in the art. See, for example, Hanley-Bowden et al., Trends in Biochemical Sciences 12:67-70 (1987). Methods and compositions for obtaining cells containing chloroplasts into which heterologous DNA has been inserted can be found, for example, in U.S. Patent Nos. 5,693,507 and 5,451,513.

还可使用其它有效转化的方法。Other methods for efficient transformation may also be used.

本领域熟悉用于植物转化的合适载体。参见例如，美国专利号6,495,738；7,271,315；7,348,473；7,355,100；7,456,340；5,571,698和5,625,033，及其公开的参考文献。Suitable vectors for plant transformation are well known in the art. See, for example, U.S. Patent Nos. 6,495,738; 7,271,315; 7,348,473; 7,355,100; 7,456,340; 5,571,698 and 5,625,033, and references therein.

表达载体可包含至少一种可操作地连接于调控元件(如启动子)的遗传标记物，从而使包含标记物的转化细胞通过负向选择(即抑制不含标记物基因细胞的生长)或正向选择(筛选遗传标记物编码的产物)回收。转化领域熟知数种常用选择性标记物基因，例如编码酶的基因，所述酶代谢解除化学试剂如抗生素或除草剂的毒性，或者编码改变靶点的基因，使其对于抑制剂不敏感。植物转化可用的选择性标记物包括但不限于：转座子Tn5(AphII)的氨基糖苷类磷酸转移酶基因，它编码卡那霉素、新霉素和G418抗性，以及草甘膦、潮霉素、甲氨蝶呤、膦蓖麻毒蛋白(bialophos)、咪唑啉、磺脲类三唑并嘧啶磺酰胺除草剂，如氯磺隆、溴苯、达拉朋等抗性或耐受性的编码基因。植物转化的一个常用选择性标记物基因是新霉素磷酸转移酶II(nptII)基因，该基因在植物调控信号的控制之下，产生卡那霉素抗性。参见例如Fraley等，Proc.Natl.Acad.Sci.U.S.A.80：4803(1983)。另外常用的选择性标记物基因是潮霉素磷酸转移酶基因，该基因产生对于抗生素潮霉素的抗性。参见例如，Vanden Elzen等，Plant Mol.Biol.5：299(1985)。其它来源于细菌的产生抗性的选择性标记物基因包括庆大霉素乙酰转移酶、链霉素磷酸转移酶、氨基糖苷类3′-腺苷酸转移、博莱霉素抗性决定因子。参见Hayford等，Plant Physiol.86：1216(1988)，Jones等，Mol.Gen.Genet.210：86(1987)，Svab等，Plant Mol.Biol.14：197(1990)，Hille等，PlantMol.Biol.7：171(1986)。其它选择性标记物基因产生对除草剂如草甘膦、草铵膦或溴苯腈的抗性。参见例如，Comai等，Nature 317：741-744(1985)，Gordon-Kamm等，Plant Cell 2：603-618(1990)和Stalker等，Science 242：419-423(1988)。植物转化的其它选择性标记物基因非细菌来源。这些基因包括例如，小鼠二氢叶酸还原酶、植物5-烯醇丙酮酰莽草酸-3-磷酸合酶和植物乙酰乳酸合酶。参见例如Eichholtz等，Somatic Cell Mol.Genet.13：67(1987)，Shah等，Science 233：478(1986)，Charest等，Plant Cell Rep.8：643(1990)。The expression vector may contain at least one genetic marker operably linked to a regulatory element (e.g., a promoter) so that transformed cells containing the marker can be recovered by negative selection (i.e., inhibiting the growth of cells lacking the marker gene) or positive selection (screening for the product encoded by the genetic marker). Several commonly used selectable marker genes are well known in the transformation art, such as genes encoding enzymes that metabolize and detoxify chemical agents such as antibiotics or herbicides, or genes that alter the target site, rendering it insensitive to the inhibitor. 
Selectable markers useful for plant transformation include, but are not limited to, the aminoglycoside phosphotransferase gene of transposon Tn5 (AphII), which encodes resistance to kanamycin, neomycin, and G418, as well as genes encoding resistance or tolerance to glyphosate, hygromycin, methotrexate, phosphinothricin (bialaphos), imidazolinones, sulfonylureas, and triazolopyrimidine sulfonamide herbicides such as chlorsulfuron, bromoxynil, dalapon, and the like. One commonly used selectable marker gene for plant transformation is the neomycin phosphotransferase II (nptII) gene, which, under the control of plant regulatory signals, confers kanamycin resistance. See, for example, Fraley et al., Proc. Natl. Acad. Sci. U.S.A. 80:4803 (1983). Another commonly used selectable marker gene is the hygromycin phosphotransferase gene, which confers resistance to the antibiotic hygromycin. See, for example, Vanden Elzen et al., Plant Mol. Biol. 5:299 (1985). Other resistance-conferring selectable marker genes of bacterial origin include gentamicin acetyltransferase, streptomycin phosphotransferase, aminoglycoside-3′-adenyltransferase, and the bleomycin resistance determinant. See Hayford et al., Plant Physiol. 86:1216 (1988), Jones et al., Mol. Gen. Genet. 210:86 (1987), Svab et al., Plant Mol. Biol. 14:197 (1990), and Hille et al., Plant Mol. Biol. 7:171 (1986). Other selectable marker genes confer resistance to herbicides such as glyphosate, glufosinate, or bromoxynil. See, for example, Comai et al., Nature 317:741-744 (1985), Gordon-Kamm et al., Plant Cell 2:603-618 (1990), and Stalker et al., Science 242:419-423 (1988). Other selectable marker genes for plant transformation are of non-bacterial origin. These genes include, for example, mouse dihydrofolate reductase, plant 5-enolpyruvylshikimate-3-phosphate synthase, and plant acetolactate synthase. See, for example, Eichholtz et al., Somatic Cell Mol. Genet. 13:67 (1987), Shah et al., Science 233:478 (1986), and Charest et al., Plant Cell Rep. 8:643 (1990).

Reporter genes can be used with or without a selectable marker. A reporter gene is a gene that is not normally present in the recipient organism and that typically encodes a protein producing some phenotypic change or enzymatic property. See, for example, K. Weising et al., Ann. Rev. Genetics 22:421 (1988). Reporter genes include, but are not limited to, β-glucuronidase (GUS), β-galactosidase, chloramphenicol acetyltransferase, green fluorescent protein, and luciferase genes. See, for example, Jefferson, R.A., Plant Mol. Biol. Rep. 5:387 (1987), Teeri et al., EMBO J. 8:343 (1989), Koncz et al., Proc. Natl. Acad. Sci. U.S.A. 84:131 (1987), DeBlock et al., EMBO J. 3:1681 (1984), and Chalfie et al., Science 263:802 (1994). Reporter gene expression can be assayed at a suitable time after the gene has been introduced into the recipient cells. One such assay uses the gene encoding β-glucuronidase (GUS) from the β-glucuronidase locus of Escherichia coli, as described in Jefferson et al., Biochem. Soc. Trans. 15:17-19 (1987).

多种来源的启动子调控元件可有效用于植物细胞以表达外源基因。例如，可使用细菌来源的启动子调控元件，如章鱼碱合酶启动子、胭脂氨酸合酶启动子、甘露碱合酶启动子，以及病毒来源的启动子，如花椰菜花叶病毒(35S和19S)，35T(这是一个重新工程改造的35S启动子，参见国际申请公开号WO97/13402)。植物启动子调控元件包括但不限于核酮糖-1，6-二磷酸(RUBP)羧基酶小亚基(ssu)、β-伴大豆球蛋白启动子、β-菜豆蛋白启动子、ADH启动子、热休克启动子和组织特异性启动子。基质连接区、支架连接区、内含子、增强子和聚腺苷酸化序列也可用于提高转录效率和DNA整合。可包含这类元件以在植物中获得转化DNA的理想性能。典型的元件包括但不限于：Adh-内含子1、Adh-内含子6、苜蓿花叶病毒包被蛋白前导序列、玉米条纹病毒包被蛋白质前导序列，以及本领域技术人员可用的那些。也可使用组成型启动子调控元件指导连续基因表达。组成型启动子包括但不限于：来自植物病毒的启动子，如CaMV的35S启动子(Odell等，Nature 313：810-812(1985))，以及如水稻肌动蛋白(McElroy等，PlantCell 2：163-171(1990))、泛素(Christensen等，Plant Mol.Biol.12：619-632(1989)和Christensen等，Plant Mol.Biol.18：675-689(1992))、pEMU(Last等，Theor.Appl.Genet.81：581-588(1991))、MAS(Velten等，EMBO J.3：2723-2730(1984))，玉米H3组蛋白(Lepetit等，Mol.Gen.Genetics 231：276-285(1992)和Atanassova等，PlantJournal 2(3)：291-300(1992))的启动子，和ALS启动子，甘蓝型油菜(Brassica napus)ALS3结构基因5′侧的Xba1/NcoI片段(与Xba1/NcoI片段相似的核苷酸序列)(国际申请公开号WO96/30530)。还可在特定细胞或组织类型，如叶或种子中使用组织特异性启动子调控元件表达基因(如玉米醇溶蛋白、油质蛋白、油菜籽蛋白、ACP、球蛋白等)。组织特异性或组织优选性启动子包括但不限于，根优选启动子，如来自菜豆蛋白基因的启动子(Murai等，Science 23：476-482(1983)，和Sengupta-Gopalan等，Proc.Natl.Acad.Sci.U.S.A.82：3320-3324(1985))；叶特异性和光诱导性启动子，如来自cab或核酮糖二磷酸羧化酶-加氧酶(rubisco)的启动子(Simpson等，EMBO J.4(11)：2723-2729(1985)和Timko等，Nature318：579-582(1985))；花药特异性启动子如来自LAT52的启动子(Twell等，Mol.Gen.Genetics 217：240-245(1989))；花粉特异性启动子，如来自Zm13的启动子(Guerrero等，Mol.Gen.Genetics 244：161-168(1993))；或小孢子优选启动子，如来自apg的启动子(Twell等，Sex.Plant Reprod.6：217-224(1993))。启动子调控元件可在植物发育的特定阶段以及植物组织和器官中激活，包括但不限于：花粉特异性、胚胎特异性、玉米穗丝特异性、棉花纤维特异性、根特异性以及种子胚乳特异性启动子调控元件。可使用诱导型启动子调控元件，这种调控元件对于特定信号应答从而使基因表达，这些信号例如：物理刺激(热休克基因)；光(RUBP羧基酶)；激素(Em)；代谢物；化学物质和应激。诱导型启动子包括但不限于：来自ACEI系统对于铜产生应答的启动子(Mett等，PNAS 90：4567-4571(1993))；来自玉米In2基因对于苯磺酰胺除草安全剂响应的启动子(Hershey等，Mol..Gen Genetics227：229-237(1991)以及Gatz等，Mol.Gen.Genetics 243：32-38(1994))，来自Tn10Tet抑制物的启动子(Gatz等，Mol.Gen.Genetics 227：229-237(1991))；以及来自甾体激素基因的启动子，其转录活性由糖皮质激素诱导(Schena等，Proc.Natl.Acad.Sci.U.S.A.88：0421(1991)。Promoter regulatory elements from a variety of sources can be effectively used in plant cells to express 
exogenous genes. For example, promoter regulatory elements of bacterial origin can be used, such as the octopine synthase promoter, the nopaline synthase promoter, and the mannopine synthase promoter, as can promoters of viral origin, such as those of cauliflower mosaic virus (35S and 19S) and 35T (a re-engineered 35S promoter; see International Application Publication No. WO97/13402). Plant promoter regulatory elements include, but are not limited to, the ribulose-1,5-bisphosphate (RUBP) carboxylase small subunit (ssu) promoter, the β-conglycinin promoter, the β-phaseolin promoter, the ADH promoter, heat shock promoters, and tissue-specific promoters. Matrix attachment regions, scaffold attachment regions, introns, enhancers, and polyadenylation sequences can also be used to improve transcription efficiency and DNA integration. Such elements can be included to obtain the desired performance of the transforming DNA in plants. Typical elements include, but are not limited to, Adh-intron 1, Adh-intron 6, the alfalfa mosaic virus coat protein leader sequence, the maize streak virus coat protein leader sequence, and others available to those skilled in the art. Constitutive promoter regulatory elements can also be used to direct continuous gene expression. Constitutive promoters include, but are not limited to, promoters from plant viruses, such as the 35S promoter of CaMV (Odell et al., Nature 313:810-812 (1985)), and promoters from genes such as rice actin (McElroy et al., Plant Cell 2:163-171 (1990)), ubiquitin (Christensen et al., Plant Mol. Biol. 12:619-632 (1989) and Christensen et al., Plant Mol. Biol. 18:675-689 (1992)), pEMU (Last et al., Theor. Appl. Genet. 81:581-588 (1991)), MAS (Velten et al., EMBO J. 3:2723-2730 (1984)), and maize H3 histone (Lepetit et al., Mol. Gen. Genetics 231:276-285 (1992) and Atanassova et al., Plant Journal 2(3):291-300 (1992)), as well as the ALS promoter, an XbaI/NcoI fragment 5′ of the Brassica napus ALS3 structural gene (or a nucleotide sequence similar to that XbaI/NcoI fragment) (International Application Publication No. WO96/30530). Tissue-specific promoter regulatory elements can also be used to express genes (e.g., zein, oleosin, napin, ACP, globulin, etc.) in specific cell or tissue types, such as leaves or seeds.
Tissue-specific or tissue-preferred promoters include, but are not limited to, root-preferred promoters, such as the promoter from the phaseolin gene (Murai et al., Science 23:476-482 (1983), and Sengupta-Gopalan et al., Proc. Natl. Acad. Sci. U.S.A. 82:3320-3324 (1985)); leaf-specific and light-inducible promoters, such as the promoters from cab or ribulose bisphosphate carboxylase-oxygenase (rubisco) (Simpson et al., EMBO J. 4(11):2723-2729 (1985) and Timko et al., Nature 318:579-582 (1985)); anther-specific promoters, such as the promoter from LAT52 (Twell et al., Mol. Gen. Genetics 217:240-245 (1989)); pollen-specific promoters, such as the promoter from Zm13 (Guerrero et al., Mol. Gen. Genetics 244:161-168 (1993)); and microspore-preferred promoters, such as the promoter from apg (Twell et al., Sex. Plant Reprod. 6:217-224 (1993)). Promoter regulatory elements can be active at specific stages of plant development as well as in specific plant tissues and organs; examples include, but are not limited to, pollen-specific, embryo-specific, corn silk-specific, cotton fiber-specific, root-specific, and seed endosperm-specific promoter regulatory elements. Inducible promoter regulatory elements can also be used; these drive gene expression in response to a specific signal, such as a physical stimulus (heat shock genes), light (RUBP carboxylase), a hormone (Em), metabolites, chemicals, or stress. Inducible promoters include, but are not limited to, a promoter from the ACE1 system that responds to copper (Mett et al., PNAS 90:4567-4571 (1993)); a promoter from the maize In2 gene that responds to benzenesulfonamide herbicide safeners (Hershey et al., Mol. Gen. Genetics 227:229-237 (1991) and Gatz et al., Mol. Gen. Genetics 243:32-38 (1994)); a promoter from the Tn10 Tet repressor (Gatz et al., Mol. Gen. Genetics 227:229-237 (1991)); and a promoter from a steroid hormone gene whose transcriptional activity is induced by glucocorticoids (Schena et al., Proc. Natl. Acad. Sci. U.S.A. 88:10421 (1991)).

可使用信号序列指导多肽至细胞器或亚细胞组分或分泌到质外体。参见例如Becker等，Plant Mol.Biol.20：49(1992)，Knox，C.，等，Plant Mol.Biol.9：3-17(1987)，Lerner等，Plant Physiol.91：124-129(1989)，Fontes等，Plant Cell 3：483-496(1991)，Matsuoka等，Proc.Natl.Acad.Sci.88：834(1991)，Gould等，J.Cell.Biol.108：1657(1989)，Creissen等，Plant J.2：129(1991)，Kalderon，等，Cell 39：499-509(1984)，andSteifel等，Plant Cell 2：785-793(1990)。此类靶向序列使需要表达的蛋白质转移到最有效发挥功能的细胞结构，或转移到期望的表型功能所需细胞过程最集中的细胞区域。A signal sequence may be used to direct the polypeptide to an organelle or subcellular fraction or to the apoplast for secretion. See, e.g., Becker et al., Plant Mol. Biol. 20:49 (1992), Knox, C., et al., Plant Mol. Biol. 9:3-17 (1987), Lerner et al., Plant Physiol. 91:124-129 (1989), Fontes et al., Plant Cell 3:483-496 (1991), Matsuoka et al., Proc. Natl. Acad. Sci. 88:834 (1991), Gould et al., J. Cell. Biol. 108:1657 (1989), Creissen et al., Plant J. 2:129 (1991), Kalderon, et al., Cell 39:499-509 (1984), and Steifel et al., Plant Cell 2:785-793 (1990). Such targeting sequences allow the desired protein to be translocated to the cellular structure where it functions most efficiently, or to the region of the cell where the cellular processes required for the desired phenotypic function are most concentrated.

在一些实施方式中，使用信号序列指导本发明的蛋白质到亚细胞组分，如质体或叶绿体。通过将基因产物与信号序列融合，可将基因产物，包括异源基因产物，靶向至质体或叶绿体，所述信号序列在叶绿体输入时被切割，从而产生成熟蛋白质。参见例如Comai等.，J.Biol.Chem.263：15104-15109(1988)，以及van den Broeck等，Nature 313：358-363(1985)。可从编码RUBISCO蛋白、CAB蛋白、EPSP合酶、GS2蛋白的cDNA中，或从任何自然产生的叶绿体靶向蛋白中分离合适信号序列的编码DNA，所述自然产生的叶绿体靶向蛋白包含指导靶蛋白到叶绿体的信号序列(也称作叶绿体转运肽(CTP))。这些叶绿体靶向蛋白是本领域众所周知的。这些叶绿体靶向蛋白以包含氨基端CTP的较大前体蛋白形式质合成，CTP指导所述前体至叶绿体输入机器。通常叶绿体细胞器内的特异性内切酶切割CTP，因此从前体释放靶向的成熟蛋白进入叶绿体环境，所述成熟蛋白包括活性蛋白质，如酶。适合将基因或基因产物靶向至叶绿体或质体的肽的编码序列的例子包括矮牵牛花EPSPS CTP、拟南芥EPSPS CTP2和内含子以及本领所知其它序列。CTP的具体例子包括但不限于：拟南芥核酮糖二磷酸羧化酶小亚基ats1A转运肽，拟南芥EPSPS的转运肽，和玉米核酮糖二磷酸羧化酶小亚基的转运肽。优化转运肽的描述参见如，Van den Broeck等，Nature 313：358-363(1985)。原核和真核信号肽序列参见如Michaelis等，Ann.Rev.Microbiol.36：425(1982)。本发明可用的转运肽的其它例子包括叶绿体转运肽，参见Von Heijne等，PlantMol.Biol.Rep.9：104-126(1991)；Mazur等，Plant Physiol.85：1110(1987)；Vorst等，Gene65：59(1988)；Chen和Jagendorf，J.Biol.Chem.268：2363-2367(1993)；来自烟草(Nicotiana plumbaginifolia)rbcS基因的转运肽(Poulsen等，Mol.Gen.Genet.205：193-200(1986))；以及来自甘蓝型油菜酰基ACP硫酯酶的转运肽(Loader等，PlantMol.Biol.23：769-778(1993)；Loader等，Plant Physiol.110：336-336(1995)。In some embodiments, a signal sequence is used to direct the proteins of the invention to subcellular compartments, such as plastids or chloroplasts. Gene products, including heterologous gene products, can be targeted to plastids or chloroplasts by fusing the gene product to a signal sequence that is cleaved upon chloroplast import, thereby producing the mature protein. See, for example, Comai et al., J. Biol. Chem. 263:15104-15109 (1988), and van den Broeck et al., Nature 313:358-363 (1985). DNA encoding a suitable signal sequence can be isolated from cDNA encoding RUBISCO protein, CAB protein, EPSP synthase, GS2 protein, or from any naturally occurring chloroplast-targeting protein that contains a signal sequence (also known as a chloroplast transit peptide (CTP)) that directs the target protein to the chloroplast. Such chloroplast-targeting proteins are well known in the art. 
These chloroplast-targeting proteins are synthesized as larger precursor proteins containing an amino-terminal CTP that directs the precursor to the chloroplast import machinery. A specific endoprotease within the chloroplast typically cleaves the CTP, thereby releasing the targeted mature protein, including active proteins such as enzymes, from the precursor into the chloroplast environment. Examples of coding sequences for peptides suitable for targeting genes or gene products to chloroplasts or plastids include the petunia EPSPS CTP, the Arabidopsis EPSPS CTP2 and introns, and other sequences known in the art. Specific examples of CTPs include, but are not limited to, the Arabidopsis ribulose bisphosphate carboxylase small subunit ats1A transit peptide, the Arabidopsis EPSPS transit peptide, and the maize ribulose bisphosphate carboxylase small subunit transit peptide. For a description of optimized transit peptides, see, e.g., Van den Broeck et al., Nature 313:358-363 (1985). Prokaryotic and eukaryotic signal peptide sequences are described, for example, in Michaelis et al., Ann. Rev. Microbiol. 36:425 (1982). Other examples of transit peptides that can be used in the present invention include the chloroplast transit peptides described in Von Heijne et al., Plant Mol. Biol. Rep. 9:104-126 (1991); Mazur et al., Plant Physiol. 85:1110 (1987); Vorst et al., Gene 65:59 (1988); and Chen and Jagendorf, J. Biol. Chem. 268:2363-2367 (1993); the transit peptide from the rbcS gene of tobacco (Nicotiana plumbaginifolia) (Poulsen et al., Mol. Gen. Genet. 205:193-200 (1986)); and the transit peptide from Brassica napus acyl-ACP thioesterase (Loader et al., Plant Mol. Biol. 23:769-778 (1993); Loader et al., Plant Physiol. 110:336-336 (1995)).

The genetically modified plants of the present invention can be further modified to delete or inactivate an endogenous fatty acid synthase, to reduce competition with the exogenous PUFA synthase system for malonyl-CoA, to increase malonyl-CoA levels, or any combination thereof. See, for example, U.S. Application Publication No. 2007/0245431.

遗传修饰的植物可培养于发酵培养基中，或在合适的培养基，如土壤中生长。适合高等植物的培养基包括任何植物生长培养基，例如但不限于，土壤、沙、支持根生长的任何其它特定培养基(如蛭石、珍珠岩等)或水培以及合适的光、水、优化高等植物生长的营养补充剂。通过从植物中抽提化合物的纯化工艺可将PUFA从遗传修饰的植物中回收。可通过收获植物以及收获植物的油(从油料种子中)来回收PUFA。植物可以天然状态消耗或者进一步加工为可消耗产品。在一些实施方式中，本发明涉及遗传修饰的植物，其中作为遗传修饰的结果，植物至少产生一种PUFA，其中作为植物遗传修饰的结果，植物或蓄积PUFA的植物部分中总脂肪酸分布包含可检测量的PUFA。在一些实施方式中，所述植物是油料种子植物。在一些实施方式中，油料种子植物在其成熟种子中产生PUFA或在其种子的油中包含PUFA。Genetically modified plants can be cultured in fermentation media or grown in a suitable culture medium, such as soil. Culture media suitable for higher plants include any plant growth medium, such as, but not limited to, soil, sand, any other specialized culture medium that supports root growth (such as vermiculite, perlite, etc.), or hydroponics, as well as suitable light, water, and nutritional supplements to optimize the growth of higher plants. PUFAs can be recovered from genetically modified plants by purification processes that extract the compounds from the plants. PUFAs can be recovered by harvesting the plants and harvesting the oil from the plants (from oilseeds). The plants can be consumed in their natural state or further processed into consumable products. In some embodiments, the present invention relates to genetically modified plants, wherein as a result of the genetic modification, the plant produces at least one PUFA, wherein as a result of the genetic modification of the plant, the total fatty acid profile of the plant or the part of the plant that accumulates the PUFA contains a detectable amount of PUFAs. In some embodiments, the plant is an oilseed plant. In some embodiments, the oilseed plant produces PUFAs in its mature seeds or contains PUFAs in the oil of its seeds.

也可利用各种哺乳动物细胞培养系统来表达重组蛋白。表达载体包含复制起始位点、合适的启动子和增强子，以及任何必需的核糖体结合位点、聚腺苷酸化位点、剪接供受体位点、转录终止序列和5′侧接非转录序列。Various mammalian cell culture systems can also be used to express recombinant proteins. The expression vector contains a replication origin, a suitable promoter and enhancer, as well as any necessary ribosome binding sites, polyadenylation sites, splice donor and acceptor sites, transcription termination sequences, and 5' flanking non-transcribed sequences.

异源表达方法Heterologous expression method

本发明涉及产生至少一种PUFA的方法，包括在有效产生PUFA的条件下，于宿主细胞中，表达PUFA合酶系统，其中所述PUFA合酶系统包含本文所述任何分离的核酸分子和重组核酸分子及其组合，其中产生至少一种PUFA。在一些实施方式中，所述至少一种PUFA包括DHA、EPA或其组合。在一些实施方式中，所述宿主细胞是植物细胞、分离的动物细胞或微生物细胞。在一些实施方式中，所述宿主细胞是破囊壶菌。The present invention relates to a method for producing at least one PUFA, comprising expressing a PUFA synthase system in a host cell under conditions effective to produce the PUFA, wherein the PUFA synthase system comprises any of the isolated and recombinant nucleic acid molecules described herein, and combinations thereof, wherein at least one PUFA is produced. In some embodiments, the at least one PUFA comprises DHA, EPA, or a combination thereof. In some embodiments, the host cell is a plant cell, an isolated animal cell, or a microbial cell. In some embodiments, the host cell is a thraustochytrid.

本发明涉及产生富含DHA、EPA或其组合的脂质的方法，包括在有效产生脂质的条件下，于宿主细胞中表达PUFA合酶基因，其中在宿主细胞中所述PUFA合酶基因包含本文所述任何分离的核酸分子和重组核酸分子及其组合，其中产生富含DHA、EPA或其组合的脂质。The present invention relates to a method for producing lipids enriched in DHA, EPA, or a combination thereof, comprising expressing a PUFA synthase gene in a host cell under conditions effective to produce the lipids, wherein the PUFA synthase gene in the host cell comprises any of the isolated nucleic acid molecules and recombinant nucleic acid molecules described herein, and combinations thereof, wherein lipids enriched in DHA, EPA, or a combination thereof are produced.

本发明涉及从宿主细胞中分离脂质的方法，包括在有效产生脂质的条件下，于宿主细胞中表达PUFA合酶基因，并从所述宿主细胞中分离脂质，其中所述宿主细胞的PUFA合酶系统包含本文所述任何分离的核酸分子和重组核酸分子及其组合。The present invention relates to a method for isolating lipids from a host cell, comprising expressing a PUFA synthase gene in the host cell under conditions effective to produce lipids, and isolating lipids from the host cell, wherein the PUFA synthase system of the host cell comprises any of the isolated nucleic acid molecules and recombinant nucleic acid molecules described herein, and combinations thereof.

In some embodiments, one or more lipid fractions containing PUFAs are isolated from the host cells. In some embodiments, the one or more fractions isolated from the host cells include a total fatty acid fraction, a sterol ester fraction, a triglyceride fraction, a free fatty acid fraction, a sterol fraction, a diacylglycerol fraction, a phospholipid fraction, or a combination thereof. In some embodiments, the PUFAs isolated from the host cells are enriched in ω-3 fatty acids, ω-6 fatty acids, or a combination thereof, depending on the composition of the PUFA synthase system introduced into the host cells. In some embodiments, the PUFAs are enriched in DHA, EPA, DPA n-6, ARA, or a combination thereof, depending on the composition of the PUFA synthase system introduced into the host cells. In some embodiments, the PUFAs are enriched in DHA, EPA, or a combination thereof. In some embodiments, the PUFA profile of the PUFAs isolated from the host cells includes a high concentration of DHA and a low concentration of EPA, ARA, DPA n-6, or a combination thereof. In some embodiments, the PUFA profile of the PUFAs isolated from the host cells includes a high concentration of DHA and EPA and a low concentration of ARA, DPA n-6, or a combination thereof. In some embodiments, the PUFA profile of the PUFAs isolated from the host cells includes a high concentration of EPA and a low concentration of DHA, ARA, DPA n-6, or a combination thereof.
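As an illustration only, a PUFA profile of the kind described above can be expressed as each fatty acid's share of the total fatty acids. The following Python sketch assumes hypothetical amounts (for example, peak areas from a fatty acid analysis); the numbers, the function names, and the grouping of DHA and EPA as ω-3 and of ARA and DPA n-6 as ω-6 for this calculation are illustrative assumptions, not data or methods from this document.

```python
# Hypothetical sketch: express a fatty acid profile as percent of total fatty acids.
# All amounts below are invented for illustration; they are not measured data.

OMEGA3 = {"DHA", "EPA"}        # omega-3 PUFAs considered in this sketch
OMEGA6 = {"ARA", "DPA n-6"}    # omega-6 PUFAs considered in this sketch


def pufa_profile(amounts):
    """Return each fatty acid as a percentage of the total amount."""
    total = sum(amounts.values())
    if total <= 0:
        raise ValueError("total fatty acid amount must be positive")
    return {name: 100.0 * value / total for name, value in amounts.items()}


def class_totals(profile):
    """Sum the percentages of the omega-3 and omega-6 PUFAs listed above."""
    n3 = sum(pct for name, pct in profile.items() if name in OMEGA3)
    n6 = sum(pct for name, pct in profile.items() if name in OMEGA6)
    return n3, n6


# Hypothetical amounts (arbitrary units) for a DHA-rich lipid fraction.
sample = {"DHA": 40.0, "EPA": 5.0, "ARA": 1.0, "DPA n-6": 4.0, "other": 50.0}
profile = pufa_profile(sample)
n3, n6 = class_totals(profile)
print(profile)
print(f"omega-3: {n3:.1f}%  omega-6: {n6:.1f}%")
```

What counts as a "high" or "low" concentration is not defined numerically in this passage, so the sketch reports percentages only and applies no threshold.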

The present invention relates to methods for replacing an inactivated or deleted PUFA synthase activity, introducing a new PUFA synthase activity, or enhancing an existing PUFA synthase activity in an organism having PUFA synthase activity, comprising expressing any of the isolated and recombinant nucleic acid molecules described herein, and combinations thereof, in the organism under conditions effective to express the PUFA synthase activity. In some embodiments, the nucleic acid molecule comprises one or more polynucleotide sequences of a PFA1, PFA2, or PFA3 PUFA synthase described herein, encoding one or more PUFA synthase domains. In some embodiments, the PUFA profile of the organism is altered following introduction of one or more nucleic acid molecules of the present invention. In some embodiments, the altered PUFA profile comprises an increase in ω-3 fatty acids and a decrease in ω-6 fatty acids. In some embodiments, the altered PUFA profile comprises an increase in ω-6 fatty acids and a decrease in ω-3 fatty acids. In some embodiments, both ω-3 and ω-6 fatty acids are increased. In some embodiments, the amount of DHA is increased, while the amount of one or more of EPA, ARA, DPA n-6, or a combination thereof is maintained or decreased.
In some embodiments, the amounts of EPA and DHA are increased, while the amounts of ARA, DPA n-6, or a combination thereof are maintained or decreased. In some embodiments, the amount of EPA is increased, while the amounts of one or more of EPA, ARA, DPA n-6, or a combination thereof are maintained or decreased. In some embodiments, the nucleic acid molecule comprises a polynucleotide sequence of PFA3 or one or more domains thereof. In some embodiments, the nucleic acid molecule comprises a polynucleotide sequence of PFA3 or one or more domains thereof, and the amount of omega-3 fatty acids in the organism is increased and the amount of omega-6 fatty acids is decreased. In some embodiments, the nucleic acid molecule comprises a polynucleotide sequence of PFA2 or one or more domains thereof, and the amount of DHA in the organism is increased and the amount of EPA is decreased.

The present invention relates to methods for increasing the production of DHA, EPA, or a combination thereof in an organism having PUFA synthase activity, comprising expressing any of the isolated nucleic acid molecules and recombinant nucleic acid molecules described herein, and combinations thereof, in the organism under conditions effective to produce DHA, EPA, or a combination thereof, wherein the PUFA synthase activity replaces an inactive or deleted activity, introduces a new activity, or enhances an existing activity in the organism, and wherein the production of DHA, EPA, or a combination thereof in the organism is increased.

Having generally described this invention, a further understanding can be obtained by reference to the examples provided herein. These examples are for purposes of illustration only and are not intended to be limiting.

Example 1

Degenerate primers for the KS and DH PUFA synthase domains were designed in order to isolate the corresponding sequences from the isolated microorganism deposited under ATCC Accession No. PTA-9695, also known as Schizochytrium sp. ATCC PTA-9695.

Degenerate primers for the KS region of Schizochytrium sp. ATCC PTA-9695 PFA1 (i.e., the region containing the KS domain) were designed based on the published PFA1 (previously termed orfA or ORF 1) sequences of Shewanella japonica, Schizochytrium sp. ATCC 20888, Thraustochytrium aureum (ATCC 34304), and Thraustochytrium sp. 23B ATCC 20892:

prDS173 (forward): GATCTACTGCAAGCGCGGNGGNTTYAT (SEQ ID NO: 62), and

prDS174 (reverse): GGCGCAGGCGGCRTCNACNAC (SEQ ID NO: 63).

Degenerate primers for the DH region of Schizochytrium sp. ATCC PTA-9695 PFA3 (previously termed orfC or ORF 3) were designed based on the published sequences of Moritella marina, Schizochytrium sp. ATCC 20888, Photobacter profundum, and Thraustochytrium sp. 23B ATCC 20892:

JGM190 (forward): CAYTGGTAYTTYCCNTGYCAYTT (SEQ ID NO: 64); and

BLR242 (reverse): CCNGGCATNACNGGRTC (SEQ ID NO: 65).
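The primers above use IUPAC ambiguity codes (N, R, Y), so each written primer stands for a pool of concrete oligonucleotides. The following is a minimal sketch, not part of the original disclosure, of how the pool size (degeneracy) of each primer could be computed; the function names `degeneracy` and `expand` are illustrative assumptions.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count

def expand(primer: str):
    """Yield every concrete sequence in the degenerate pool (hypothetical helper)."""
    choices = [IUPAC[base] for base in primer.upper()]
    for combo in product(*choices):
        yield "".join(combo)

# Primer sequences copied from the text (SEQ ID NOs: 62-65).
PRIMERS = {
    "prDS173": "GATCTACTGCAAGCGCGGNGGNTTYAT",
    "prDS174": "GGCGCAGGCGGCRTCNACNAC",
    "JGM190": "CAYTGGTAYTTYCCNTGYCAYTT",
    "BLR242": "CCNGGCATNACNGGRTC",
}

if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: length {len(seq)}, degeneracy {degeneracy(seq)}")
```

Under these definitions, prDS173 and prDS174 each represent 32 sequences, and JGM190 and BLR242 each represent 128.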

The PCR conditions with chromosomal DNA template were as follows: 0.2 μM dNTPs, 0.1 μM of each primer, 8% DMSO, 200 ng of chromosomal DNA, 2.5 U of Herculase II Fusion Polymerase (Stratagene), and 1× Herculase buffer (Stratagene), in a total volume of 50 μL. The PCR protocol included the following steps: (1) 98°C for 3 minutes; (2) 98°C for 30 seconds; (3) 50°C for 30 seconds; (4) 72°C for 2 minutes; (5) repeat steps 2-4 for 40 cycles; (6) 72°C for 5 minutes; and (7) hold at 6°C.
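As a worked example of the cycling protocol above, the sketch below encodes the steps as data and sums the programmed hold times. It assumes that "40 cycles" means 40 passes through steps 2-4 in total, and it ignores ramp times and the final 6°C hold; the `Step` class and `programmed_seconds` function are illustrative names only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    label: str
    temp_c: int
    seconds: int

# Values taken from the protocol in the text.
INITIAL = Step("initial denaturation", 98, 3 * 60)
CYCLE = [
    Step("denaturation", 98, 30),
    Step("annealing", 50, 30),
    Step("extension", 72, 2 * 60),
]
FINAL = Step("final extension", 72, 5 * 60)
N_CYCLES = 40  # assumption: 40 total passes through the cycle

def programmed_seconds(initial, cycle, n_cycles, final) -> int:
    """Sum of programmed hold times, excluding ramping and the final hold."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + n_cycles * per_cycle + final.seconds

if __name__ == "__main__":
    total = programmed_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
    print(f"{total} s = {total / 60:.0f} min of programmed hold time")
```

Each cycle holds for 180 seconds, so the programmed time comes to 7,680 seconds (128 minutes) before ramping is taken into account.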

For both primer pairs, PCR with chromosomal template from Schizochytrium sp. ATCC PTA-9695 yielded distinct DNA products of the expected sizes. Each PCR product was cloned into the vector pJET1.2/blunt (Fermentas) according to the manufacturer's instructions, and the insert sequences were determined using the supplied standard primers.

The DNA sequences obtained from the PCR products were compared with known sequences from NCBI GenBank using a standard BLASTx search (BLASTx parameters: low-complexity filter on; matrix: BLOSUM62; gap costs: existence 11, extension 1. Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402).

At the amino acid level, the sequences with the highest level of homology to the amino acid sequence deduced from the cloned DNA containing the KS fragment from Schizochytrium sp. ATCC PTA-9695 were: Schizochytrium sp. ATCC 20888 "polyunsaturated fatty acid synthase subunit A" (identity = 87%; positives = 92%); Shewanella oneidensis MR-1 "multi-domain β-ketoacyl synthase" (identity = 49%; positives = 64%); and Shewanella sp. MR-4 "β-ketoacyl synthase" (identity = 49%; positives = 64%).

At the amino acid level, the sequences with the highest level of homology to the amino acid sequence deduced from the cloned DNA containing the DH fragment from Schizochytrium sp. ATCC PTA-9695 were: Schizochytrium sp. ATCC 20888 "polyunsaturated fatty acid synthase subunit C" (identity = 61%; positives = 71%); Shewanella pealeana ATCC 700345 "β-hydroxyacyl-(acyl-carrier-protein) dehydratase FabA/FabZ" (identity = 35%; positives = 50%); and Shewanella sediminis HAW-EB3 "ω-3 polyunsaturated fatty acid synthase PfaC" (identity = 34%; positives = 50%).

Example 2

PUFA synthase genes were identified from Schizochytrium sp. ATCC PTA-9695.

Genomic DNA was prepared from the microorganism by standard procedures. See, e.g., Sambrook J. and Russell D. 2001. Molecular cloning: A laboratory manual, 3rd edition. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York. Briefly: (1) 500 μL of cells were pelleted from a mid-log culture by centrifugation. The cells were centrifuged again, and all residual liquid was removed from the cell pellet with a small-bore pipette tip;

(2) the pellet was resuspended in 200 μL of lysis buffer (20 mM Tris pH 8.0, 125 μg/mL proteinase K, 50 mM NaCl, 10 mM EDTA pH 8.0, 0.5% SDS);

(3) the cells were lysed at 50°C for 1 hour;

(4) the lysis mixture was pipetted into a 2 mL phase lock gel tube (PLG, Eppendorf);

(5) an equal volume of P:C:I (phenol:chloroform:isoamyl alcohol) was added and mixed for 1.5 hours;

(6) the tube was centrifuged at 12,000 × g for 5 minutes;

(7) the aqueous phase was removed from above the gel in the PLG tube, an equal volume of chloroform was added to the aqueous phase, and the mixture was mixed for 30 minutes;

(8) the tube was centrifuged at 14,000 × g for approximately 5 minutes;

(9) the upper (aqueous) layer was pipetted away from the chloroform and placed in a new tube;

(10) 0.1 volume of 3 M NaOAc was added and mixed (inverted several times);

(11) 2 volumes of 100% EtOH were added and mixed (inverted several times); the genomic DNA precipitate formed at this stage;

(12) the tube was centrifuged in a microcentrifuge at 14k for approximately 15 minutes at 4°C;

(13) the liquid was gently poured off, leaving the genomic DNA at the bottom of the tube;

(14) the pellet was washed with 0.5 mL of 70% EtOH;

(15) the tube was centrifuged in a microcentrifuge at 14k for approximately 5 minutes at 4°C;

(16) the EtOH was gently poured off and the genomic DNA pellet was dried;

and (17) a suitable volume of H₂O and RNase was added directly to the genomic DNA pellet.

The isolated genomic DNA was used to generate recombinant libraries consisting of large fragments (approximately 40 kb) in the cosmid pWEB-TNC™ (Epicentre) according to the manufacturer's instructions. The cosmid libraries were screened by standard colony hybridization procedures using ³²P radiolabeled probes (Sambrook J. and Russell D. 2001. Molecular cloning: A laboratory manual, 3rd edition. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York). The probes contained DNA homologous to the published PUFA synthase sequences from other organisms, as described in Example 1. These probes were generated by restriction digestion of the cloned fragments from the respective pJET1.2/blunt clones described above and were labeled by standard methods. In all cases, strong hybridization of an individual probe to certain cosmids indicated clones containing DNA homologous to PUFA synthase genes.

Cosmid clone pDS115 showed strong hybridization to the probe for the KS region and was selected for DNA sequencing of the Schizochytrium sp. ATCC PTA-9695 PFA1 gene. Cosmid clone pDS115, containing the Schizochytrium sp. ATCC PTA-9695 PFA1 and PFA2 genes, was deposited under the Budapest Treaty at the American Type Culture Collection, Patent Depository, 10801 University Boulevard, Manassas, VA 20110-2209, on January 27, 2009, and given ATCC Accession No. PTA-9737. Sequencing primers for the KS region DNA sequence determined in Example 1 were designed using standard methods. To determine the DNA sequence of Schizochytrium sp. ATCC PTA-9695 PFA1, successive rounds of DNA sequencing were carried out, with subsequent sequencing primers designed by standard methods, in order to "walk" the cosmid clone.

In previously published thraustochytrid PUFA synthase systems, the PUFA synthase genes PFA1 and PFA2 are clustered together and arranged so as to be divergently transcribed. This is also the case for PFA1 and PFA2 from Schizochytrium sp. ATCC PTA-9695. By "walking" the DNA sequence from cosmid clone pDS115, the conceptual start of PFA2 was found to be 493 nucleotides from the start of PFA1 and divergently transcribed. Each nucleotide base pair of the Schizochytrium sp. ATCC PTA-9695 PFA1 and PFA2 PUFA synthase genes was covered by at least two separate high-quality DNA sequencing reactions with a minimum aggregated Phred score of 40 (confidence level of 99.99%).
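The link between a Phred score of 40 and a 99.99% confidence level follows from the standard Phred definition, Q = -10·log10(P), where P is the estimated probability that a base call is wrong. A minimal sketch of that conversion is shown here; the function names are illustrative assumptions.

```python
def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality score Q to the base-call error probability P."""
    return 10 ** (-q / 10)

def phred_to_accuracy_percent(q: float) -> float:
    """Base-call accuracy, in percent, implied by a Phred quality score."""
    return 100 * (1 - phred_to_error_probability(q))

if __name__ == "__main__":
    for q in (20, 30, 40):
        p = phred_to_error_probability(q)
        print(f"Q{q}: error probability {p:.4g}, "
              f"accuracy {phred_to_accuracy_percent(q):.2f}%")
```

A score of 40 corresponds to an error probability of 1 in 10,000, which matches the 99.99% figure quoted in the text.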

Cosmid clone pBS4 showed strong hybridization to the probe for the DH region and was selected for DNA sequencing of the Schizochytrium sp. ATCC PTA-9695 PFA3 gene. Cosmid clone pBS4, containing the Schizochytrium sp. ATCC PTA-9695 PFA3 gene, was deposited under the Budapest Treaty at the American Type Culture Collection, Patent Depository, 10801 University Boulevard, Manassas, VA 20110-2209, on January 27, 2009, and given ATCC Accession No. PTA-9736. Sequencing primers for the DH region DNA sequence determined in Example 1 were designed using standard methods. To determine the DNA sequence of Schizochytrium sp. ATCC PTA-9695 PFA3, successive rounds of DNA sequencing were carried out, with subsequent sequencing primers designed by standard methods, in order to "walk" the cosmid clone. Each nucleotide base pair of the Schizochytrium sp. ATCC PTA-9695 PFA3 PUFA synthase gene was covered by at least two separate high-quality DNA sequencing reactions with a minimum aggregated Phred score of 40 (confidence level of 99.99%).

Table 1 shows the identities of the Schizochytrium sp. ATCC PTA-9695 PFA1 (SEQ ID NO: 1), PFA2 (SEQ ID NO: 3), and PFA3 (SEQ ID NO: 5) polynucleotide sequences compared with previously published sequences. Identities were determined using the scoring matrix "swgapdnamt" within the "AlignX" program of the VectorNTI package, a standard for DNA alignment.

Table 1: Percent identity of the PFA1, PFA2, and PFA3 polynucleotide sequences

Table 2 shows the identities of the Schizochytrium sp. ATCC PTA-9695 Pfa1p (SEQ ID NO: 2), Pfa2p (SEQ ID NO: 4), and Pfa3p (SEQ ID NO: 6) amino acid sequences compared with previously published PUFA synthase amino acid sequences. Identities were determined using the scoring matrix "blosum62mt2" within the "AlignX" program of the VectorNTI package, a standard for protein alignment.

Table 2: Percent identity of the Pfa1p, Pfa2p, and Pfa3p amino acid sequences

Example 3

Domain analysis was performed to annotate the sequence coordinates of the PUFA synthase domains and active sites of Schizochytrium sp. ATCC PTA-9695 PFA1, PFA2, and PFA3. Domains were identified based on homology to known PUFA synthase, fatty acid synthase, and polyketide synthase domains.

Table 3 shows the domains and active sites associated with Schizochytrium sp. ATCC PTA-9695 PFA1.

Table 3: Domain analysis of Schizochytrium sp. ATCC PTA-9695 PFA1

The first domain in Schizochytrium sp. ATCC PTA-9695 Pfa1 is a KS domain. The nucleotide sequence containing the sequence encoding the Schizochytrium sp. ATCC PTA-9695 Pfa1 KS domain is represented herein as SEQ ID NO: 7, corresponding to positions 7-1401 of SEQ ID NO: 1. The amino acid sequence containing the Schizochytrium sp. ATCC PTA-9695 Pfa1 KS domain is represented herein as SEQ ID NO: 8, corresponding to positions 3-467 of SEQ ID NO: 2. The KS domain contains the active site motif DXAC* (SEQ ID NO: 43), in which the acyl binding site (*) corresponds to C203 of SEQ ID NO: 2. In addition, a characteristic motif is present at the end of the KS domain: GFGG (SEQ ID NO: 44), corresponding to positions 455-458 of SEQ ID NO: 2 and positions 453-456 of SEQ ID NO: 8.
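The paired coordinates quoted for the KS domain can be cross-checked with simple arithmetic. The sketch below assumes that the open reading frame begins at position 1 of SEQ ID NO: 1 and is read in frame, which is consistent with the quoted values but is not stated explicitly in the text; the function names are illustrative.

```python
def nt_to_aa(nt_pos: int) -> int:
    """1-based amino acid index of the codon containing a 1-based nucleotide."""
    return (nt_pos - 1) // 3 + 1

def aa_to_nt_range(aa_pos: int) -> tuple[int, int]:
    """1-based nucleotide range (inclusive) of the codon for a 1-based residue."""
    return (3 * aa_pos - 2, 3 * aa_pos)

def to_domain_coords(full_pos: int, domain_start: int) -> int:
    """Re-index a full-length protein position relative to a domain sequence."""
    return full_pos - domain_start + 1

if __name__ == "__main__":
    # KS domain: nucleotides 7-1401 of SEQ ID NO: 1, residues 3-467 of SEQ ID NO: 2.
    print(nt_to_aa(7), nt_to_aa(1401))
    # GFGG motif: residues 455-458 of SEQ ID NO: 2, re-indexed to SEQ ID NO: 8.
    print(to_domain_coords(455, 3), to_domain_coords(458, 3))
```

Under this assumption, nucleotides 7-1401 map to residues 3-467, and residues 455-458 of the full-length protein map to positions 453-456 of the domain sequence, in agreement with the text.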

The second domain in Schizochytrium sp. ATCC PTA-9695 Pfa1 is a MAT domain. The nucleotide sequence containing the sequence encoding the Schizochytrium sp. ATCC PTA-9695 Pfa1 MAT domain is represented herein as SEQ ID NO: 9, corresponding to positions 1798-2700 of SEQ ID NO: 1. The amino acid sequence containing the Schizochytrium sp. ATCC PTA-9695 Pfa1 MAT domain is represented herein as SEQ ID NO: 10, corresponding to positions 600-900 of SEQ ID NO: 2. The MAT domain contains the active site motif GHS*XG (SEQ ID NO: 46), in which the acyl binding site (*) corresponds to S699 of SEQ ID NO: 2.

The third through eighth domains of Schizochytrium sp. ATCC PTA-9695 Pfa1 are six tandem ACP domains, referred to herein as ACP1, ACP2, ACP3, ACP4, ACP5, and ACP6. The nucleotide sequence containing the first ACP domain, ACP1, is represented herein as SEQ ID NO: 13 and is contained within the nucleotide sequence spanning from about position 3325 to about position 3600 of SEQ ID NO: 1. The amino acid sequence containing ACP1 is represented herein as SEQ ID NO: 14 and is contained within the amino acid sequence spanning from about position 1109 to about position 1200 of SEQ ID NO: 2. The nucleotide sequence containing ACP2 is represented herein as SEQ ID NO: 15 and is contained within the nucleotide sequence spanning from about position 3667 to about position 3942 of SEQ ID NO: 1. The amino acid sequence containing ACP2 is represented herein as SEQ ID NO: 16 and is contained within the amino acid sequence spanning from about position 1223 to about position 1314 of SEQ ID NO: 2. The nucleotide sequence containing ACP3 is represented herein as SEQ ID NO: 17 and is contained within the nucleotide sequence spanning from about position 4015 to about position 4290 of SEQ ID NO: 1. The amino acid sequence containing ACP3 is represented herein as SEQ ID NO: 18 and is contained within the amino acid sequence spanning from about position 1339 to about position 1430 of SEQ ID NO: 2. The nucleotide sequence containing ACP4 is represented herein as SEQ ID NO: 19 and is contained within the nucleotide sequence spanning from about position 4363 to about position 4638 of SEQ ID NO: 1. The amino acid sequence containing ACP4 is represented herein as SEQ ID NO: 20 and is contained within the amino acid sequence spanning from about position 1455 to about position 1546 of SEQ ID NO: 2. The nucleotide sequence containing ACP5 is represented herein as SEQ ID NO: 21 and is contained within the nucleotide sequence spanning from about position 4711 to about position 4986 of SEQ ID NO: 1. The amino acid sequence containing ACP5 is represented herein as SEQ ID NO: 22 and is contained within the amino acid sequence spanning from about position 1571 to about position 1662 of SEQ ID NO: 2. The nucleotide sequence containing ACP6 is represented herein as SEQ ID NO: 23 and is contained within the nucleotide sequence spanning from about position 5053 to about position 5328 of SEQ ID NO: 1. The amino acid sequence containing ACP6 is represented herein as SEQ ID NO: 24 and is contained within the amino acid sequence spanning from about position 1685 to about position 1776 of SEQ ID NO: 2. Together, all six ACP domains span a region of Schizochytrium sp. ATCC PTA-9695 Pfa1 from about position 3298 to about position 5400 of SEQ ID NO: 1, corresponding to about amino acid positions 1100-1800 of SEQ ID NO: 2. The nucleotide sequence of the entire ACP region containing all six domains is represented herein as SEQ ID NO: 11, and the amino acid sequence of the entire ACP region containing all six domains is represented herein as SEQ ID NO: 12. The repeat interval of the six ACP domains within SEQ ID NO: 11 is approximately every 342 nucleotides (the actual number of amino acids measured between adjacent active site serines ranges from 114 to 116). Each of the six ACP domains contains the pantetheine binding motif LGIDS* (SEQ ID NO: 47), in which S* is the pantetheine binding site serine. The pantetheine binding site serine is located near the center of each ACP domain sequence. The positions of the active site serines (i.e., the pantetheine binding sites) of the six ACP domains, with respect to the amino acid sequence of SEQ ID NO: 2, are: ACP1 = S1152, ACP2 = S1266, ACP3 = S1382, ACP4 = S1498, ACP5 = S1614, and ACP6 = S1728.
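The spacing of the ACP repeats can be recomputed from the six serine positions and the approximate domain ranges listed above. The following sketch uses only numbers quoted in the text; the dictionary and function names are illustrative assumptions.

```python
# Active-site serine positions in SEQ ID NO: 2, as listed in the text.
ACP_SERINES = {
    "ACP1": 1152, "ACP2": 1266, "ACP3": 1382,
    "ACP4": 1498, "ACP5": 1614, "ACP6": 1728,
}

# Approximate domain ranges in SEQ ID NO: 2, as listed in the text.
ACP_RANGES = {
    "ACP1": (1109, 1200), "ACP2": (1223, 1314), "ACP3": (1339, 1430),
    "ACP4": (1455, 1546), "ACP5": (1571, 1662), "ACP6": (1685, 1776),
}

def adjacent_spacings(positions) -> list[int]:
    """Distances, in residues, between consecutive sorted positions."""
    ordered = sorted(positions)
    return [b - a for a, b in zip(ordered, ordered[1:])]

def position_within_domain(name: str) -> tuple[int, int]:
    """Return (1-based index of the serine within its domain, domain length)."""
    start, end = ACP_RANGES[name]
    return ACP_SERINES[name] - start + 1, end - start + 1

if __name__ == "__main__":
    spacings = adjacent_spacings(ACP_SERINES.values())
    print("residues between adjacent serines:", spacings)
    print("equivalent nucleotides:", [3 * s for s in spacings])
    for name in ACP_SERINES:
        print(name, position_within_domain(name))
```

The serine-to-serine distances come out between 114 and 116 residues (342 to 348 nucleotides), and each serine falls at residue 44 of a 92-residue domain, consistent with the statement that it lies near the center of each ACP domain.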

The ninth domain in Schizochytrium sp. ATCC PTA-9695 Pfa1 is a KR domain. The nucleotide sequence containing the sequence encoding the Schizochytrium sp. ATCC PTA-9695 Pfa1 KR domain is represented herein as SEQ ID NO: 25, corresponding to positions 5623-7800 of SEQ ID NO: 1. The amino acid sequence containing the Schizochytrium sp. ATCC PTA-9695 Pfa1 KR domain is represented herein as SEQ ID NO: 26, corresponding to positions 1875-2600 of SEQ ID NO: 2. Within the KR domain is a core region (contained within the nucleotide sequence of SEQ ID NO: 48 and the amino acid sequence of SEQ ID NO: 49) with homology to short-chain aldehyde dehydrogenases (KR is a member of this family). This core region spans from about position 5998 to about position 6900 of SEQ ID NO: 1, corresponding to amino acid positions 2000-2300 of SEQ ID NO: 2.

The tenth domain in Schizochytrium sp. ATCC PTA-9695 Pfa1 is a DH domain. The nucleotide sequence containing the sequence encoding the Schizochytrium sp. ATCC PTA-9695 Pfa1 DH domain is represented herein as SEQ ID NO: 27, corresponding to positions 7027-7065 of SEQ ID NO: 1. The amino acid sequence containing the Schizochytrium sp. ATCC PTA-9695 Pfa1 DH domain is represented herein as SEQ ID NO: 28, corresponding to positions 2343-2355 of SEQ ID NO: 2. The DH domain contains a conserved active site motif (see Donadio, S. and Katz, L., Gene 111(1):51-60 (1992)): LxxHxxxGxxxxP (SEQ ID NO: 50).

Table 4 shows the domains and active sites associated with Schizochytrium sp. ATCC PTA-9695 PFA2.

Table 4: Domain analysis of Schizochytrium sp. ATCC PTA-9695 PFA2

The first domain in Schizochytrium sp. ATCC PTA-9695 Pfa2 is a KS domain. The nucleotide sequence containing the sequence encoding the Schizochytrium sp. ATCC PTA-9695 Pfa2 KS domain is represented herein as SEQ ID NO: 29, corresponding to positions 10-1350 of SEQ ID NO: 3. The amino acid sequence containing the Schizochytrium sp. ATCC PTA-9695 Pfa2 KS domain is represented herein as SEQ ID NO: 30, corresponding to positions 4-450 of SEQ ID NO: 4. The KS domain contains the active site motif DXAC* (SEQ ID NO: 43), in which the acyl binding site (*) corresponds to C191 of SEQ ID NO: 4. In addition, a characteristic motif is present at the end of the KS domain: GFGG (SEQ ID NO: 44), corresponding to positions 438-441 of SEQ ID NO: 4 and positions 435-438 of SEQ ID NO: 30.

The second domain in Schizochytrium sp. ATCC PTA-9695 Pfa2 is a CLF domain. The nucleotide sequence containing the sequence encoding the Schizochytrium sp. ATCC PTA-9695 Pfa2 CLF domain is represented herein as SEQ ID NO: 31, corresponding to positions 1408-2700 of SEQ ID NO: 3. The amino acid sequence containing the Schizochytrium sp. ATCC PTA-9695 Pfa2 CLF domain is represented herein as SEQ ID NO: 32, corresponding to positions 470-900 of SEQ ID NO: 4.

The third domain in Schizochytrium sp. ATCC PTA-9695 Pfa2 is an AT domain. The nucleotide sequence containing the sequence encoding the Schizochytrium sp. ATCC PTA-9695 Pfa2 AT domain is represented herein as SEQ ID NO: 33, corresponding to positions 2998-4200 of SEQ ID NO: 3. The amino acid sequence containing the Schizochytrium sp. ATCC PTA-9695 Pfa2 AT domain is represented herein as SEQ ID NO: 34, corresponding to positions 1000-1400 of SEQ ID NO: 4. The AT domain contains the active site motif GxS*xG (SEQ ID NO: 52), which is characteristic of acyltransferase (AT) proteins; the active site serine residue corresponds to S1141 of SEQ ID NO: 4.

The fourth domain in Schizochytrium sp. ATCC PTA-9695 Pfa2 is an ER domain. The nucleotide sequence containing the sequence encoding the Schizochytrium sp. ATCC PTA-9695 Pfa2 ER domain is represented herein as SEQ ID NO: 35, corresponding to positions 4498-5700 of SEQ ID NO: 3. The amino acid sequence containing the Pfa2 ER domain is represented herein as SEQ ID NO: 36, corresponding to positions 1500-1900 of SEQ ID NO: 4.

Table 5 shows the domains and active sites associated with Schizochytrium sp. ATCC PTA-9695 PFA3.

Table 5: Domain analysis of Schizochytrium sp. ATCC PTA-9695 PFA3

The first and second domains in Schizochytrium sp. ATCC PTA-9695 Pfa3 are DH domains, referred to herein as DH1 and DH2, respectively. The nucleotide sequence containing the sequence encoding the Schizochytrium sp. ATCC PTA-9695 Pfa3 DH1 domain is represented herein as SEQ ID NO: 37, corresponding to positions 1-1350 of SEQ ID NO: 5. The amino acid sequence containing the Schizochytrium sp. ATCC PTA-9695 Pfa3 DH1 domain is represented herein as SEQ ID NO: 38, corresponding to positions 1-450 of SEQ ID NO: 6. The nucleotide sequence containing the sequence encoding the Schizochytrium sp. ATCC PTA-9695 Pfa3 DH2 domain is represented herein as SEQ ID NO: 39, corresponding to positions 1501-2700 of SEQ ID NO: 5. The amino acid sequence containing the Schizochytrium sp. ATCC PTA-9695 Pfa3 DH2 domain is represented herein as SEQ ID NO: 40, corresponding to positions 501-900 of SEQ ID NO: 6. The DH domains contain the active site motif FxxH*F (SEQ ID NO: 53). The nucleotide sequence containing the active site motif in DH1 corresponds to positions 931-933 of SEQ ID NO: 5, while the nucleotide sequence containing the active site motif in DH2 corresponds to positions 2401-2403 of SEQ ID NO: 5. The active site H* in the motif FxxH*F is based on data from Leesong et al., Structure 4:253-64 (1996) and Kimber et al., J Biol Chem. 279:52593-602 (2004); the active site H* in DH1 corresponds to H310 of SEQ ID NO: 6, and the active site H* in DH2 corresponds to H801 of SEQ ID NO: 6.

The third domain in Schizochytrium sp. ATCC PTA-9695 Pfa3 is an ER domain. The nucleotide sequence containing the sequence encoding the Schizochytrium sp. ATCC PTA-9695 Pfa3 ER domain is represented herein as SEQ ID NO: 41, corresponding to positions 2848-4200 of SEQ ID NO: 5. The amino acid sequence containing the Schizochytrium sp. ATCC PTA-9695 Pfa3 ER domain is represented herein as SEQ ID NO: 42, corresponding to positions 950-1400 of SEQ ID NO: 6.
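The active site motifs quoted in this example (DXAC, GFGG, GHSXG, LGIDS, LxxHxxxGxxxxP, GxSxG, and FxxHF) can be located in a protein sequence with simple pattern matching, where "x" or "X" stands for any residue. The sketch below is illustrative only: the regular-expression translations, the `find_motif` helper, and the short `TOY` sequence are assumptions made for demonstration and are not taken from SEQ ID NOs: 2, 4, or 6.

```python
import re

# Motifs from the text, written as regular expressions ('.' = any residue).
MOTIFS = {
    "KS active site DXAC": "D.AC",
    "KS C-terminal GFGG": "GFGG",
    "MAT active site GHSXG": "GHS.G",
    "ACP pantetheine site LGIDS": "LGIDS",
    "DH motif LxxHxxxGxxxxP": "L..H...G....P",
    "AT active site GxSxG": "G.S.G",
    "DH active site FxxHF": "F..HF",
}

def find_motif(protein: str, pattern: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched text) for every occurrence of a motif.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(protein)]

# Hypothetical toy sequence, invented purely to exercise the function.
TOY = "MADMACTTLGIDSLQFAKHFGFGG"

if __name__ == "__main__":
    for label, pattern in MOTIFS.items():
        print(f"{label}: {find_motif(TOY, pattern)}")
```

Applied to the real sequences, the reported start position plus the offset of the starred residue within the motif would give coordinates comparable to those listed in the domain descriptions.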

Example 4

Degenerate primers for the KS, ER, and DH PUFA synthase domains were designed in order to isolate the corresponding sequences from the isolated microorganism deposited under ATCC Accession No. PTA-10212, also known as Thraustochytrium sp. ATCC PTA-10212.

Degenerate primers for the KS region of Thraustochytrium sp. ATCC PTA-10212 PFA1 (i.e., the region containing the KS domain) were designed based on the published PFA1 (previously termed orfA or ORF 1) sequences of Schizochytrium sp. ATCC 20888, Thraustochytrium aureum (ATCC 34304), and Thraustochytrium sp. 23B ATCC 20892:

prDS233 (forward): TGATATGGGAGGAATGAATTGTGTNGTNGAYGC (SEQ ID NO:123)

prDS235 (reverse): TTCCATAACAAAATGATAATTAGCTCCNCCRAANCC (SEQ ID NO:124).

Degenerate primers for the ER region of Thraustochytrium sp. ATCC PTA-10212 PFA2 (i.e., the region containing the ER domain) were designed based on the published PFA2 (previously termed orfB or ORF 2) sequences of Shewanella japonica, Schizochytrium sp. ATCC 20888, Thraustochytrium aureum (ATCC 34304), and Thraustochytrium sp. 23B ATCC 20892:

prDS183 (forward): GGCGGCCACACCGAYAAYMGNCC (SEQ ID NO:125)

prDS184 (reverse): CGGGGCCGCACCANAYYTGRTA (SEQ ID NO:126).

Degenerate primers for the ER region of Thraustochytrium sp. ATCC PTA-10212 PFA3 (i.e., the region containing the ER domain) were designed based on the published PFA3 (previously termed orfC or ORF 3) sequences of Shewanella japonica, Schizochytrium sp. ATCC 20888, Thraustochytrium aureum (ATCC 34304), and Thraustochytrium sp. 23B ATCC 20892:

prDS181 (forward): TCCTTCGGNGCNGSNGG (SEQ ID NO:127)

The degenerate primers JGM190 (forward, SEQ ID NO:64) and BLR242 (reverse, SEQ ID NO:65), described above, were used to amplify the DH region of PFA3 from Thraustochytrium sp. ATCC PTA-10212.
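The primers listed above use IUPAC ambiguity codes (N, Y, R, M, S), so each primer name stands for a pool of distinct oligonucleotides. As an illustration only, and not as part of the disclosed procedure, the following sketch counts the size of that pool for each primer; the table of code sizes is the standard IUPAC nucleotide code, and the function name is hypothetical.

```python
from math import prod

# Number of bases each IUPAC nucleotide code can stand for.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def primer_degeneracy(primer: str) -> int:
    """Return how many distinct sequences a degenerate primer represents."""
    return prod(IUPAC_DEGENERACY[base] for base in primer.upper())


# Primer sequences as given in the text (SEQ ID NOs: 123-127).
primers = {
    "prDS233": "TGATATGGGAGGAATGAATTGTGTNGTNGAYGC",
    "prDS235": "TTCCATAACAAAATGATAATTAGCTCCNCCRAANCC",
    "prDS183": "GGCGGCCACACCGAYAAYMGNCC",
    "prDS184": "CGGGGCCGCACCANAYYTGRTA",
    "prDS181": "TCCTTCGGNGCNGSNGG",
}

for name, seq in primers.items():
    print(name, primer_degeneracy(seq))
```

Higher degeneracy means each individual oligonucleotide is present at a lower effective concentration in the reaction, which is one reason degenerate positions are usually kept to a minimum.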

The PCR conditions with chromosomal DNA template were as follows: 0.2 μM dNTPs, 0.1 μM of each primer, 6% DMSO, 200 ng chromosomal DNA, 2.5 U Herculase II Fusion polymerase (Stratagene), and 1X Herculase buffer (Stratagene) in a total volume of 50 μL. The PCR protocol included the following steps: (1) 98°C for 3 minutes; (2) 98°C for 30 seconds; (3) 54°C for 45 seconds; (4) 72°C for 2 minutes; (5) repeat steps 2-4 for 40 cycles; (6) 72°C for 5 minutes; and (7) hold at 6°C.
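The cycling protocol above can be written down as data to estimate the programmed run time. This is a minimal sketch under two assumptions that the text does not state: "40 cycles" is read as 40 total passes through steps 2-4, and temperature ramp times and the final hold are ignored. The class and constant names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    seconds: int   # programmed hold time


INITIAL_DENATURATION = Step(98, 180)                    # step 1
CYCLE = (Step(98, 30), Step(54, 45), Step(72, 120))     # steps 2-4
CYCLES = 40                                             # step 5 (assumed total)
FINAL_EXTENSION = Step(72, 300)                         # step 6


def programmed_seconds() -> int:
    """Sum the programmed hold times, excluding ramps and the 6 degree hold."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return (INITIAL_DENATURATION.seconds
            + CYCLES * per_cycle
            + FINAL_EXTENSION.seconds)


print(f"{programmed_seconds() / 60:.0f} minutes of programmed hold time")
```

Actual instrument time will be longer because of ramping between temperatures.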

For all primer pairs, PCR with chromosomal template from Thraustochytrium sp. ATCC PTA-10212 yielded distinct DNA products of the expected sizes. The respective PCR products were cloned into the vector pJET1.2/blunt (Fermentas) according to the manufacturer's instructions, and the insert sequences were determined using the supplied standard primers.

The DNA sequences obtained from the PCR products were compared with known sequences available from NCBI GenBank as described in Example 1.

At the amino acid level, the sequences with the highest level of homology to the amino acid sequence deduced from the cloned DNA containing the KS fragment of PFA1 from Thraustochytrium sp. ATCC PTA-10212 were: Schizochytrium sp. ATCC 20888 "polyunsaturated fatty acid synthase subunit A" (identity = 80%; positives = 90%); Shewanella benthica KT99 "omega-3 polyunsaturated fatty acid synthase PfaA" (identity = 51%; positives = 67%); Shewanella loihica PV-4 "beta-ketoacyl synthase" (identity = 50%; positives = 67%); and Shewanella woodyi ATCC 51908 "polyketide-type polyunsaturated fatty acid synthase PfaA" (identity = 51%; positives = 66%).

At the amino acid level, the sequences with the highest level of homology to the amino acid sequence deduced from the cloned DNA containing the ER fragment of PFA2 from Thraustochytrium sp. ATCC PTA-10212 were: Schizochytrium sp. ATCC 20888 "polyunsaturated fatty acid synthase subunit B" (identity = 70%; positives = 85%); Schizochytrium sp. ATCC 20888 "polyunsaturated fatty acid synthase subunit C" (identity = 66%; positives = 83%); Nodularia spumigena CCY9414 "2-nitropropane dioxygenase" (identity = 57%; positives = 74%); and Moritella sp. PE36 "polyunsaturated fatty acid synthase PfaD" (identity = 57%; positives = 71%).

At the amino acid level, the sequences with the highest level of homology to the amino acid sequence deduced from the cloned DNA containing the ER fragment of PFA3 from Thraustochytrium sp. ATCC PTA-10212 were: Schizochytrium sp. ATCC 20888 "polyunsaturated fatty acid synthase subunit C" (identity = 80%; positives = 90%); Schizochytrium sp. ATCC 20888 "polyunsaturated fatty acid synthase subunit B" (identity = 78%; positives = 89%); Moritella sp. PE36 "polyunsaturated fatty acid synthase PfaD" (identity = 56%; positives = 71%); and Shewanella amazonensis SB2B "omega-3 polyunsaturated fatty acid synthase" (identity = 55%; positives = 73%).

At the amino acid level, the sequences with the highest level of homology to the amino acid sequence deduced from the cloned DNA containing the DH fragment of PFA3 from Thraustochytrium sp. ATCC PTA-10212 were: Schizochytrium sp. ATCC 20888 "polyunsaturated fatty acid synthase subunit C" (identity = 63%; positives = 76%); Shewanella pealeana ATCC 700345 "beta-hydroxyacyl-(acyl-carrier-protein) dehydratase FabA/FabZ" (identity = 35%; positives = 53%); Shewanella piezotolerans WP3 "multi-domain beta-ketoacyl synthase" (identity = 36%; positives = 52%); and Shewanella benthica KT99 "omega-3 polyunsaturated fatty acid synthase PfaC" (identity = 35%; positives = 51%).

Example 5

PUFA synthase genes were identified from Thraustochytrium sp. ATCC PTA-10212.

From a -80°C cryovial, 1 mL of cells was thawed at room temperature and added to 50 mL of liquid HSFM medium (below) in a 250 mL non-baffled flask. The flask was incubated at 23°C for 3 days. Cells were collected and used for standard bacterial artificial chromosome (BAC) library construction (Lucigen Corporation, Middleton, WI USA).

Table 6: HSFM medium

Nitrogen feed:

Typical culture conditions include:

Recombinant BAC libraries consisting of large fragments (approximately 120 kb) were handled according to the manufacturer's instructions in the BAC vector pSMART (Lucigen Corporation). The BAC libraries were screened by standard colony hybridization procedures using ³²P-radiolabeled probes (Sambrook J. and Russell D. 2001. Molecular cloning: A laboratory manual, 3rd edition. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York). The probes contained DNA homologous to published PUFA synthase sequences from other organisms, as described in Example 4. The probes were generated by restriction digestion of the cloned fragments from the respective pJET1.2/blunt clones described above and were labeled by standard methods. In all cases, strong hybridization of an individual probe to certain BACs indicated clones containing DNA homologous to PUFA synthase genes.

BAC clone pLR130 (also known as LuMaBAC 2M23) showed strong hybridization to probes from both the KS region and the ER region, indicating that it contained the PFA1 and PFA2 genes, and was selected for DNA sequencing of the Thraustochytrium sp. ATCC PTA-10212 PFA1 and PFA2 genes. The BAC was sequenced by standard procedures (Eurofins MWG Operon, Huntsville, AL). BAC clone pLR130, containing the PFA1 and PFA2 genes, was deposited under the Budapest Treaty at the American Type Culture Collection, Patent Depository, 10801 University Boulevard, Manassas, VA 20110-2209, on December 1, 2009, and given ATCC Accession No. PTA-10511.

In previously published thraustochytrid PUFA synthase systems, the PUFA synthase genes PFA1 and PFA2 are clustered together and arranged so as to be divergently transcribed. This is also the case for PFA1 and PFA2 from Thraustochytrium sp. ATCC PTA-10212. The conceptual start of PFA2 was found to lie 693 nucleotides from the start of PFA1, and the two genes are divergently transcribed.

BAC clone pDS127 (also known as LuMaBAC 9K17) showed strong hybridization to probes from both the DH region and the ER region of PFA3 and was selected for DNA sequencing of the PFA3 gene. BAC clone pDS130, containing the PFA3 gene, was deposited under the Budapest Treaty at the American Type Culture Collection, Patent Depository, 10801 University Boulevard, Manassas, VA 20110-2209, on December 1, 2009, and given ATCC Accession No. PTA-10510. Sequencing primers were designed by standard methods to the DH region and ER region DNA sequences determined in Example 4. To determine the DNA sequence of the Thraustochytrium sp. ATCC PTA-10212 PFA3 gene, successive rounds of DNA sequencing, each involving the design of subsequent sequencing primers by standard methods, were carried out in order to "walk" the BAC clone. Each nucleotide base pair of the PFA3 gene was covered by at least two separate high-quality DNA sequencing reactions with a minimum aggregated Phred score of 40 (confidence level of 99.99%).
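The correspondence between a Phred score of 40 and a 99.99% confidence level follows from the standard definition Q = -10 log10(P), where P is the estimated probability that a base call is wrong. The short sketch below illustrates that conversion; the function names are hypothetical.

```python
def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality score to the probability of an incorrect call."""
    return 10 ** (-q / 10)


def phred_to_accuracy_percent(q: float) -> float:
    """Convert a Phred quality score to base-call accuracy in percent."""
    return 100 * (1 - phred_to_error_probability(q))


# Q40 corresponds to roughly 1 error in 10,000 base calls.
print(phred_to_error_probability(40))
print(f"{phred_to_accuracy_percent(40):.2f}%")
```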

Table 7 shows identities for the Thraustochytrium sp. ATCC PTA-10212 PFA1 (SEQ ID NO:68), PFA2 (SEQ ID NO:70), and PFA3 (SEQ ID NO:72) polynucleotide sequences as compared to previously published sequences and to the sequences from Schizochytrium sp. ATCC PTA-9695. Identities were determined using the scoring matrix "swgapdnamt" within the "AlignX" program of VectorNTI, a standard program for DNA alignment.

Table 7: Percent identity of PFA1, PFA2, and PFA3 polynucleotide sequences

Table 8 shows identities for the Thraustochytrium sp. ATCC PTA-10212 Pfa1p (SEQ ID NO:69), Pfa2p (SEQ ID NO:71), and Pfa3p (SEQ ID NO:73) amino acid sequences as compared to previously published PUFA synthase amino acid sequences and to those from Schizochytrium sp. ATCC PTA-9695. Identities were determined using the scoring matrix "blosum62mt2" within the "AlignX" program of VectorNTI, a standard program for protein alignment.
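For readers unfamiliar with how a percent identity figure is derived from an alignment, the following sketch shows one common convention: identical columns divided by the number of aligned columns, counting gap-versus-residue columns as mismatches. This is an assumption for illustration; the text does not state which denominator AlignX uses, and the sketch does not perform the alignment itself or use the "swgapdnamt" or "blosum62mt2" matrices.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns holding the same residue in both rows.

    Both inputs must already be aligned (equal length, gaps as `gap`).
    Columns where both rows are gaps are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Toy alignment (not real PFA sequence data): 4 identical columns out of 6.
print(round(percent_identity("ACGT-A", "ACCTTA"), 1))
```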

Table 8: Percent identity of Pfa1p, Pfa2p, and Pfa3p amino acid sequences

Example 6

Domain analysis was performed to annotate the sequence coordinates of the PUFA synthase domains and active sites of Thraustochytrium sp. ATCC PTA-10212 PFA1, PFA2, and PFA3, respectively. Domains were identified based on homology to known PUFA synthase, fatty acid synthase, and polyketide synthase domains.

Table 9 shows the domains and active sites associated with Thraustochytrium sp. ATCC PTA-10212 PFA1.

Table 9: Thraustochytrium sp. ATCC PTA-10212 PFA1 domain analysis

The first domain of Thraustochytrium sp. ATCC PTA-10212 Pfa1 is the KS domain. The nucleotide sequence comprising the sequence encoding the Pfa1 KS domain is represented herein as SEQ ID NO:74, corresponding to positions 13-1362 of SEQ ID NO:68. The amino acid sequence comprising the Pfa1 KS domain is represented herein as SEQ ID NO:75, corresponding to positions 5-454 of SEQ ID NO:69. The KS domain contains the active site motif DXAC* (SEQ ID NO:43), in which the acyl binding site (*) corresponds to C204 of SEQ ID NO:69. In addition, a characteristic motif, GFGG (SEQ ID NO:44), is present at the end of the KS domain, corresponding to positions 451-454 of SEQ ID NO:69 and positions 447-450 of SEQ ID NO:75.
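The two coordinate pairs quoted for the GFGG motif differ only by the offset of the KS domain within the full-length protein. As an illustration under the assumption of 1-based, inclusive numbering, the sketch below converts between full-length (SEQ ID NO:69) and domain-relative (SEQ ID NO:75) positions; the function names are hypothetical.

```python
def to_domain_coords(full_pos: int, domain_start: int) -> int:
    """Convert a 1-based position in the full protein to a 1-based
    position within a domain that begins at domain_start."""
    return full_pos - domain_start + 1


def to_full_coords(domain_pos: int, domain_start: int) -> int:
    """Inverse of to_domain_coords."""
    return domain_pos + domain_start - 1


KS_START = 5  # the KS domain begins at position 5 of SEQ ID NO:69

# GFGG motif: positions 451-454 of the full protein.
print(to_domain_coords(451, KS_START), to_domain_coords(454, KS_START))
```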

The second domain of Thraustochytrium sp. ATCC PTA-10212 Pfa1 is the MAT domain. The nucleotide sequence comprising the sequence encoding the Pfa1 MAT domain is represented herein as SEQ ID NO:76, corresponding to positions 1783-2703 of SEQ ID NO:68. The amino acid sequence comprising the Pfa1 MAT domain is represented herein as SEQ ID NO:77, corresponding to positions 595-901 of SEQ ID NO:69. The MAT domain contains the active site motif GHS*XG (SEQ ID NO:46), in which the acyl binding site (*) corresponds to S695 of SEQ ID NO:69.

Domains 3-12 of Thraustochytrium sp. ATCC PTA-10212 Pfa1p are ten tandem ACP domains, referred to herein as ACP1, ACP2, ACP3, ACP4, ACP5, ACP6, ACP7, ACP8, ACP9, and ACP10. The nucleotide sequence containing the first ACP domain, ACP1, is represented herein as SEQ ID NO:80 and is contained within the nucleotide sequence spanning about positions 3280-3534 of SEQ ID NO:68. The amino acid sequence containing ACP1 is represented herein as SEQ ID NO:81 and is contained within the amino acid sequence spanning about positions 1094-1178 of SEQ ID NO:69. The nucleotide sequence containing ACP2 is represented herein as SEQ ID NO:82 and is contained within the nucleotide sequence spanning about positions 3607-3861 of SEQ ID NO:68. The amino acid sequence containing ACP2 is represented herein as SEQ ID NO:83 and is contained within the amino acid sequence spanning about positions 1203-1287 of SEQ ID NO:69. The nucleotide sequence containing ACP3 is represented herein as SEQ ID NO:84 and is contained within the nucleotide sequence spanning about positions 3934-4185 of SEQ ID NO:68. The amino acid sequence containing ACP3 is represented herein as SEQ ID NO:85 and is contained within the amino acid sequence spanning about positions 1312-1396 of SEQ ID NO:69. The nucleotide sequence containing ACP4 is represented herein as SEQ ID NO:86 and is contained within the nucleotide sequence spanning about positions 4261-4515 of SEQ ID NO:68. The amino acid sequence containing ACP4 is represented herein as SEQ ID NO:87 and is contained within the amino acid sequence spanning about positions 1421-1505 of SEQ ID NO:69. The nucleotide sequence containing ACP5 is represented herein as SEQ ID NO:88 and is contained within the nucleotide sequence spanning about positions 4589-4842 of SEQ ID NO:68. The amino acid sequence containing ACP5 is represented herein as SEQ ID NO:89 and is contained within the amino acid sequence spanning about positions 1530-1614 of SEQ ID NO:69. The nucleotide sequence containing ACP6 is represented herein as SEQ ID NO:90 and is contained within the nucleotide sequence spanning about positions 4915-5169 of SEQ ID NO:68. The amino acid sequence containing ACP6 is represented herein as SEQ ID NO:91 and is contained within the amino acid sequence spanning about positions 1639-1723 of SEQ ID NO:69. The nucleotide sequence containing ACP7 is represented herein as SEQ ID NO:92 and is contained within the nucleotide sequence spanning about positions 5242-5496 of SEQ ID NO:68. The amino acid sequence containing ACP7 is represented herein as SEQ ID NO:93 and is contained within the amino acid sequence spanning about positions 1748-1832 of SEQ ID NO:69. The nucleotide sequence containing ACP8 is represented herein as SEQ ID NO:94 and is contained within the nucleotide sequence spanning about positions 5569-5832 of SEQ ID NO:68. The amino acid sequence containing ACP8 is represented herein as SEQ ID NO:95 and is contained within the amino acid sequence spanning about positions 1857-1941 of SEQ ID NO:69. The nucleotide sequence containing ACP9 is represented herein as SEQ ID NO:96 and is contained within the nucleotide sequence spanning about positions 5896-6150 of SEQ ID NO:68. The amino acid sequence containing ACP9 is represented herein as SEQ ID NO:97 and is contained within the amino acid sequence spanning about positions 1966-2050 of SEQ ID NO:69. The nucleotide sequence containing ACP10 is represented herein as SEQ ID NO:98 and is contained within the nucleotide sequence spanning about positions 6199-6453 of SEQ ID NO:68. The amino acid sequence containing ACP10 is represented herein as SEQ ID NO:99 and is contained within the amino acid sequence spanning about positions 2067-2151 of SEQ ID NO:69.

All ten ACP domains together span a region of Thraustochytrium sp. ATCC PTA-10212 Pfa1 from about position 3208 to about position 6510 of SEQ ID NO:68, corresponding to amino acid positions 1070-2170 of SEQ ID NO:69. The nucleotide sequence of the entire ACP region containing all ten domains is represented herein as SEQ ID NO:78, and the amino acid sequence of the entire ACP region containing all ten domains is represented herein as SEQ ID NO:79. The repeat interval of the ten ACP domains within SEQ ID NO:78 is approximately every 327 nucleotides (the actual number of amino acids between adjacent active site serines ranges from 101 to 109). Each of the ten ACP domains contains the pantetheine binding motif LGIDS* (SEQ ID NO:47), in which S* is the pantetheine binding site serine (S). The pantetheine binding site serine is located near the center of each ACP domain sequence. The positions of the active site serine residues (i.e., the pantetheine binding sites) of the ten ACP domains, with respect to the amino acid sequence of SEQ ID NO:69, are: ACP1 = S1135, ACP2 = S1244, ACP3 = S1353, ACP4 = S1462, ACP5 = S1571, ACP6 = S1680, ACP7 = S1789, ACP8 = S1898, ACP9 = S2007, and ACP10 = S2108.
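The stated repeat interval and serine spacing can be checked directly from the ten pantetheine binding site positions listed above. The sketch below is only an arithmetic illustration using those published positions; the variable names are hypothetical.

```python
# Active site serine positions of ACP1-ACP10 in SEQ ID NO:69, as listed above.
ACP_SERINES = [1135, 1244, 1353, 1462, 1571, 1680, 1789, 1898, 2007, 2108]

# Distance in residues between each pair of adjacent active site serines.
spacings = [b - a for a, b in zip(ACP_SERINES, ACP_SERINES[1:])]

print("spacings (aa):", spacings)
print("range (aa):", min(spacings), "to", max(spacings))
# A 109-residue repeat corresponds to 109 codons.
print("nucleotides per 109-residue repeat:", 109 * 3)
```

Eight of the nine intervals are 109 residues, which is where the interval of about 327 nucleotides comes from; only the ACP9 to ACP10 interval is shorter, at 101 residues.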

The 13th domain of Thraustochytrium sp. ATCC PTA-10212 Pfa1 is the KR domain. The nucleotide sequence comprising the sequence encoding the Pfa1 KR domain is represented herein as SEQ ID NO:100, corresponding to positions 6808-8958 of SEQ ID NO:68. The amino acid sequence comprising the Pfa1 KR domain is represented herein as SEQ ID NO:101, corresponding to positions 2270-2986 of SEQ ID NO:69. Within the KR domain is a core region (contained within the nucleotide sequence of SEQ ID NO:116 and the amino acid sequence of SEQ ID NO:117) with homology to short-chain aldehyde dehydrogenases (KR is a member of this family). This core region spans from about position 5998 to about position 6900 of SEQ ID NO:68, corresponding to amino acid positions 2000-2300 of SEQ ID NO:69.

The 14th domain of Thraustochytrium sp. ATCC PTA-10212 Pfa1 is the DH domain. The nucleotide sequence comprising the sequence encoding the Pfa1 DH domain is represented herein as SEQ ID NO:118, corresponding to positions 7027-7065 of SEQ ID NO:68. The amino acid sequence comprising the Pfa1 DH domain is represented herein as SEQ ID NO:119, corresponding to positions 2343-2355 of SEQ ID NO:69. The DH domain contains a conserved active site motif (see Donadio, S. and Katz, L., Gene 111(1):51-60 (1992)): LxxHxxxGxxxxP (SEQ ID NO:50).
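Motifs written in the style used throughout this example, such as LxxHxxxGxxxxP, DXAC*, or FxxH*F, can be located in a protein sequence with a regular expression. The sketch below is an illustration, not the method used for the domain analysis described here: it assumes that "x" or "X" means any residue and that "*" only marks the preceding residue as the active site. The test sequences are invented toy strings, not PFA sequence data, and the function names are hypothetical.

```python
import re


def motif_to_regex(motif: str) -> str:
    """Translate a motif such as 'LxxHxxxGxxxxP' into a regex pattern.

    'x' or 'X' matches any residue; '*' (active site marker) is dropped.
    """
    core = motif.replace("*", "")
    return "".join("." if ch in "xX" else re.escape(ch) for ch in core)


def find_motif(sequence: str, motif: str) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) spans of non-overlapping matches."""
    pattern = re.compile(motif_to_regex(motif))
    return [(m.start() + 1, m.end()) for m in pattern.finditer(sequence)]


# Toy sequences for illustration only.
print(find_motif("MKLAAHQWEGTTSVPRR", "LxxHxxxGxxxxP"))
print(find_motif("AAFGDHFAA", "FxxH*F"))
```

The LxxHxxxGxxxxP motif is 13 residues long, which matches the 13-residue span (positions 2343-2355) and the 39-nucleotide span (positions 7027-7065) given for the Pfa1 DH domain.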

Table 10 shows the domains and active sites associated with Thraustochytrium sp. ATCC PTA-10212 PFA2.

Table 10: Thraustochytrium sp. ATCC PTA-10212 PFA2 domain analysis

The first domain of Thraustochytrium sp. ATCC PTA-10212 Pfa2 is the KS domain. The nucleotide sequence comprising the sequence encoding the Pfa2 KS domain is represented herein as SEQ ID NO:102, corresponding to positions 10-1320 of SEQ ID NO:70. The amino acid sequence comprising the Pfa2 KS domain is represented herein as SEQ ID NO:103, corresponding to positions 4-440 of SEQ ID NO:71. The KS domain contains the active site motif DXAC* (SEQ ID NO:43), in which the acyl binding site (*) corresponds to C191 of SEQ ID NO:71. In addition, a characteristic motif, GFGG (SEQ ID NO:44), is present at the end of the KS domain, corresponding to positions 423-426 of SEQ ID NO:71 and positions 1267-1278 of SEQ ID NO:70.

The second domain of Thraustochytrium sp. ATCC PTA-10212 Pfa2 is the CLF domain. The nucleotide sequence comprising the sequence encoding the Pfa2 CLF domain is represented herein as SEQ ID NO:104, corresponding to positions 1378-2700 of SEQ ID NO:70. The amino acid sequence comprising the Pfa2 CLF domain is represented herein as SEQ ID NO:105, corresponding to positions 460-900 of SEQ ID NO:71.

The third domain of Thraustochytrium sp. ATCC PTA-10212 Pfa2 is the AT domain. The nucleotide sequence comprising the sequence encoding the Pfa2 AT domain is represented herein as SEQ ID NO:106, corresponding to positions 2848-4200 of SEQ ID NO:70. The amino acid sequence comprising the Pfa2 AT domain is represented herein as SEQ ID NO:107, corresponding to positions 950-1400 of SEQ ID NO:71. The AT domain contains the active site motif GxS*xG (SEQ ID NO:50), which is characteristic of acyltransferase (AT) proteins; the active site serine residue corresponds to S1121 of SEQ ID NO:71.

The fourth domain of Thraustochytrium sp. ATCC PTA-10212 Pfa2 is the ER domain. The nucleotide sequence comprising the sequence encoding the Pfa2 ER domain is represented herein as SEQ ID NO:108, corresponding to positions 4498-5700 of SEQ ID NO:70. The amino acid sequence comprising the Pfa2 ER domain is represented herein as SEQ ID NO:109, corresponding to positions 1500-1900 of SEQ ID NO:71.

Table 11 shows the domains and active sites associated with Thraustochytrium sp. ATCC PTA-10212 PFA3.

Table 11: Thraustochytrium sp. ATCC PTA-10212 PFA3 domain analysis

The first and second domains of Thraustochytrium sp. ATCC PTA-10212 Pfa3 are referred to herein as DH1 and DH2, respectively. The nucleotide sequence comprising the sequence encoding the Pfa3 DH1 domain is represented herein as SEQ ID NO:110, corresponding to positions 1-1350 of SEQ ID NO:72. The amino acid sequence comprising the Pfa3 DH1 domain is represented herein as SEQ ID NO:111, corresponding to positions 1-450 of SEQ ID NO:73. The nucleotide sequence comprising the sequence encoding the Pfa3 DH2 domain is represented herein as SEQ ID NO:112, corresponding to positions 1501-2700 of SEQ ID NO:72. The amino acid sequence comprising the Pfa3 DH2 domain is represented herein as SEQ ID NO:113, corresponding to positions 501-900 of SEQ ID NO:73. The DH domains comprise the active site motif FxxH*F (SEQ ID NO:53). The nucleotide sequence containing the active site motif in DH1 corresponds to positions 934-936 of SEQ ID NO:72, while the nucleotide sequence containing the active site motif in DH2 corresponds to positions 2401-2403 of SEQ ID NO:72. The assignment of H* as the active site residue in the motif FxxH*F is based on data from Leesong et al., Structure 4:253-64 (1996) and Kimber et al., J Biol Chem. 279:52593-602 (2004). The active site H* in DH1 corresponds to H312 of SEQ ID NO:73, and the active site H* in DH2 corresponds to H801 of SEQ ID NO:73.

The third domain of Thraustochytrium sp. ATCC PTA-10212 Pfa3 is the ER domain. The nucleotide sequence comprising the sequence encoding the Pfa3 ER domain is represented herein as SEQ ID NO:114, corresponding to positions 2848-4200 of SEQ ID NO:72. The amino acid sequence comprising the Pfa3 ER domain is represented herein as SEQ ID NO:115, corresponding to positions 950-1400 of SEQ ID NO:73.

Example 7

Inactivation of native PUFA synthase genes in Schizochytrium sp. ATCC 20888 to generate PUFA auxotrophs, and replacement of such inactivated genes with exogenously introduced homologous genes to restore PUFA synthesis, have been previously demonstrated and described. See, e.g., U.S. Patent No. 7,217,856, incorporated by reference herein in its entirety. The three PUFA synthase genes from Schizochytrium sp. ATCC 20888 were previously termed orfA, orfB, and orfC, corresponding to the PFA1, PFA2, and PFA3 nomenclature used herein, respectively. Id.

The native orfA gene of Schizochytrium sp. ATCC 20888 was replaced by homologous recombination following transformation with a vector containing the Zeocin™ resistance marker surrounded by sequences from the orfA flanking region. A mutant strain lacking a functional orfA gene was generated. The mutant strain was auxotrophic and required PUFA supplementation for growth.

Schizochytrium sp. ATCC PTA-9695 PFA1 (SEQ ID NO: 1) was cloned into expression vector pREZ37 to generate pREZ345. The expression vector contained approximately 2 kb of DNA from the flanking region of the native orfA gene locus of Schizochytrium sp. ATCC 20888. The Schizochytrium sp. ATCC 20888 mutant lacking a functional orfA was transformed with pREZ345 containing PFA1 by electroporation with enzyme pretreatment (see below). Based on the homologous regions flanking the Zeocin™ resistance marker in the mutant and flanking the PFA1 gene in pREZ345, double-crossover recombination occurred such that PFA1 was inserted into the native orfA locus. Recombination with Schizochytrium sp. ATCC PTA-9695 PFA1 (SEQ ID NO: 1) restored PUFA production in the Schizochytrium sp. ATCC 20888 mutant lacking orfA. In brief, cells were grown in M2B liquid medium (see below) at 30°C with shaking at 200 rpm for 3 days. Cells were harvested and the fatty acids were converted to methyl esters using standard techniques. Fatty acid profiles were determined as fatty acid methyl esters (FAME) using gas chromatography with flame ionization detection (GC-FID). The native Schizochytrium sp. ATCC 20888 strain containing a functional orfA gene produced DHA and DPA n-6 in a ratio of 2.3:1. The recombinant strain in which the inactivated orfA gene was replaced with Schizochytrium sp. ATCC PTA-9695 PFA1 (SEQ ID NO: 1) also produced DHA and DPA n-6, in a ratio of 2.4:1. The EPA content of the recombinant strain was 2.7% of FAME, the DPA n-3 content was 0.7%, the DPA n-6 content was 8.8%, and the DHA content was 21.2%.
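
The DHA:DPA n-6 ratio quoted for the recombinant strain can be recomputed from the FAME percentages reported in the same paragraph. The following Python sketch shows that arithmetic; the helper name and the rounding to one decimal place are assumptions made for illustration only.

```python
def ratio_to_one(numerator_pct, denominator_pct):
    """Express numerator:denominator as x:1, with x rounded to one decimal place."""
    return round(numerator_pct / denominator_pct, 1)


# FAME percentages reported in the text for the recombinant strain of Example 7.
fame = {"EPA": 2.7, "DPA n-3": 0.7, "DPA n-6": 8.8, "DHA": 21.2}

dha_to_dpa_n6 = ratio_to_one(fame["DHA"], fame["DPA n-6"])
print(f"DHA:DPA n-6 = {dha_to_dpa_n6}:1")  # DHA:DPA n-6 = 2.4:1
```

The same helper can be applied to the FAME values reported in the later examples to compare PUFA profiles between strains.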

M2B medium - 10 g/L glucose, 0.8 g/L (NH₄)₂SO₄, 5 g/L Na₂SO₄, 2 g/L MgSO₄·7H₂O, 0.5 g/L KH₂PO₄, 0.5 g/L KCl, 0.1 g/L CaCl₂·2H₂O, 0.1 M MES (pH 6.0), 0.1% PB26 metals, and 0.1% PB26 vitamins (v/v). PB26 vitamins consisted of 50 mg/mL vitamin B12, 100 μg/mL thiamine, and 100 μg/mL Ca-pantothenate. PB26 metals were adjusted to pH 4.5 and consisted of 3 g/L FeSO₄·7H₂O, 1 g/L MnCl₂·4H₂O, 800 mg/mL ZnSO₄·7H₂O, 20 mg/mL CoCl₂·6H₂O, 10 mg/mL Na₂MoO₄·2H₂O, 600 mg/mL CuSO₄·5H₂O, and 800 mg/mL NiSO₄·6H₂O. PB26 stock solutions were filter sterilized separately and added to the broth after autoclaving. Glucose, KH₂PO₄, and CaCl₂·2H₂O were each autoclaved separately from the remainder of the broth ingredients before mixing, to prevent salt precipitation and carbohydrate caramelizing. All medium ingredients were purchased from Sigma Chemical (St. Louis, MO).
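
The M2B recipe above is stated per litre, so the amounts for any batch volume follow by simple scaling. The Python sketch below shows one way to do this. The dictionary keys, the function name, and the choice to report MES in moles (rather than grams, which would depend on the salt form used) are illustrative assumptions.

```python
# Per-litre amounts of the dry M2B components, as listed in the text (g/L).
M2B_G_PER_L = {
    "glucose": 10.0,
    "(NH4)2SO4": 0.8,
    "Na2SO4": 5.0,
    "MgSO4.7H2O": 2.0,
    "KH2PO4": 0.5,
    "KCl": 0.5,
    "CaCl2.2H2O": 0.1,
}
MES_MOLAR = 0.1         # mol/L MES buffer, pH 6.0
PB26_PERCENT_V_V = 0.1  # % (v/v), applies to PB26 metals and to PB26 vitamins


def m2b_batch(volume_l):
    """Scale the M2B recipe to volume_l litres.

    Returns (grams per dry component, moles of MES, mL of each PB26 stock).
    """
    grams = {name: round(g * volume_l, 3) for name, g in M2B_G_PER_L.items()}
    mes_mol = round(MES_MOLAR * volume_l, 4)
    pb26_ml = round(PB26_PERCENT_V_V / 100 * volume_l * 1000, 3)
    return grams, mes_mol, pb26_ml


grams, mes_mol, pb26_ml = m2b_batch(0.5)
print(grams["glucose"], mes_mol, pb26_ml)  # 5.0 0.05 0.5
```

The sketch covers quantities only. The sterilization order described in the text (separate autoclaving of glucose, KH₂PO₄, and CaCl₂·2H₂O, and addition of filter-sterilized PB26 stocks after autoclaving) still applies.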

Electroporation with enzyme pretreatment - Cells were grown in 50 mL of M50-20 medium (see U.S. Publication No. 2008/0022422) on a shaker at 30°C for 2 days. The cells were diluted 1:100 into M2B medium and grown overnight (16-24 h) to reach mid-log phase (OD600 of 1.5-2.5). The cells were centrifuged in a 50 mL conical tube for 5 min at about 3000 x g. The supernatant was removed and the cells were resuspended in a suitable volume of 1 M mannitol, pH 5.5, to reach a final concentration of 2 OD600 units. Five mL of cells were aliquoted into a 25 mL shaker flask and amended with 10 mM CaCl₂ (1.0 M stock, filter sterilized) and 0.25 mg/mL Protease XIV (10 mg/mL stock, filter sterilized; Sigma-Aldrich, St. Louis, MO). The flask was incubated on a shaker at 30°C and about 100 rpm for 4 h. The cells were monitored under the microscope to determine the degree of protoplasting, with single cells desired. The cells were centrifuged for 5 min at about 2500 x g in round-bottom tubes (i.e., 14 mL Falcon™ tubes, BD Biosciences, San Jose, CA). The supernatant was removed and the cells were gently resuspended in 5 mL of ice-cold 10% glycerol. The cells were centrifuged again for 5 min at about 2500 x g in round-bottom tubes. The supernatant was removed and the cells were gently resuspended in 500 μL of ice-cold 10% glycerol using wide-bore pipette tips. Ninety μL of cells were aliquoted into a pre-chilled electroporation cuvette (Gene Pulser cuvette, 0.1 cm gap or 0.2 cm gap; Bio-Rad, Hercules, CA). One to five μg of DNA (in a volume of 10 μL or less) was added to the cuvette, mixed gently with a pipette tip, and placed on ice for 5 min. The cells were electroporated at 200 ohms (resistance), 25 μF (capacitance), and either 250 V (0.1 cm gap) or 500 V (0.2 cm gap). Immediately afterwards, 0.5 mL of M50-20 medium was added to the cuvette. The cells were then transferred to 4.5 mL of M50-20 medium in a 25 mL shaker flask and incubated for 2-3 h at 30°C and about 100 rpm on a shaker. The cells were centrifuged for 5 min at about 2500 x g in round-bottom tubes. The supernatant was removed and the cell pellet was resuspended in 0.5 mL of M50-20 medium. The cells were plated onto an appropriate number (2 to 5) of M2B plates with appropriate selection and incubated at 30°C.
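
The two voltage settings given for the 0.1 cm and 0.2 cm cuvettes correspond to the same nominal field strength, and the resistance and capacitance settings imply a nominal pulse time constant. The Python sketch below works through that arithmetic. It assumes an ideal RC discharge in which the 200 ohm setting dominates the circuit; the actual time constant also depends on sample conductivity, so the computed value is nominal only. The function names are illustrative.

```python
def field_strength_kv_per_cm(volts, gap_cm):
    """Nominal field strength across the cuvette gap, in kV/cm."""
    return round(volts / gap_cm / 1000.0, 3)


def rc_time_constant_ms(resistance_ohm, capacitance_uf):
    """Nominal time constant of an ideal RC discharge, in milliseconds."""
    return round(resistance_ohm * capacitance_uf * 1e-6 * 1e3, 3)


# Settings quoted in the protocol.
print(field_strength_kv_per_cm(250, 0.1))  # 2.5
print(field_strength_kv_per_cm(500, 0.2))  # 2.5
print(rc_time_constant_ms(200, 25))        # 5.0
```

Both cuvette options therefore target about 2.5 kV/cm, which is consistent with the voltage being doubled when the gap is doubled.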

The Schizochytrium sp. ATCC 20888 mutant lacking a functional orfA was also transformed with pREZ345 containing PFA1, such that PFA1 was randomly integrated in the mutant and restored PUFA production.

Example 8

Thraustochytrium ATCC PTA-10212 PFA1 (SEQ ID NO: 68) was resynthesized (DNA2.0) and codon-optimized for expression in Schizochytrium (SEQ ID NO: 120), and was then cloned into an expression vector to generate pLR95. Codon optimization was performed using the Schizochytrium codon usage table (FIG. 22). The expression vector contained approximately 2 kb of DNA from the flanking region of the native orfA gene locus of Schizochytrium ATCC 20888.
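
Codon optimization against a usage table can be illustrated with a simple "most frequent codon" back-translation. The Python sketch below is a toy illustration only: the usage fractions are hypothetical placeholders, not the values of the Schizochytrium table in FIG. 22, and the text does not state which algorithm was used for the resynthesis, which may have applied different or additional criteria.

```python
# HYPOTHETICAL codon usage fractions for three amino acids only. These numbers
# are placeholders for illustration and are NOT taken from FIG. 22.
TOY_USAGE = {
    "M": {"ATG": 1.00},
    "K": {"AAG": 0.90, "AAA": 0.10},
    "L": {"CTC": 0.60, "CTG": 0.25, "CTT": 0.15},
}


def back_translate(protein, usage):
    """Encode each residue with its most frequently used codon in the usage table."""
    return "".join(max(usage[aa], key=usage[aa].get) for aa in protein)


print(back_translate("MKL", TOY_USAGE))  # ATGAAGCTC
```

The encoded amino acid sequence is unchanged by this procedure; only the choice among synonymous codons differs.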

The Schizochytrium ATCC 20888 mutant lacking a functional orfA from Example 7 was transformed with pLR95 containing the codon-optimized Thraustochytrium ATCC PTA-10212 PFA1 (SEQ ID NO: 120) by electroporation with enzyme pretreatment (see Example 7). Based on the homologous regions flanking the Zeocin™ resistance marker in the mutant and flanking the PFA1 gene in pLR95, double-crossover recombination occurred such that the codon-optimized Thraustochytrium ATCC PTA-10212 PFA1 was inserted into the native orfA locus. Recombination with the codon-optimized Thraustochytrium ATCC PTA-10212 PFA1 (SEQ ID NO: 120) restored PUFA production in the Schizochytrium ATCC 20888 mutant lacking orfA. Cell growth and FAME analysis were performed as described in Example 7. The native Schizochytrium ATCC 20888 strain containing a functional orfA gene produced DHA and DPA in a ratio of 25:1. The recombinant strain in which the inactivated orfA gene was replaced with the codon-optimized Thraustochytrium ATCC PTA-10212 PFA1 (SEQ ID NO: 120) produced DHA and DPA in a ratio of 5.4:1, further demonstrating that the PUFA profile of Schizochytrium can be altered by the nucleic acid molecules described herein. The EPA content of the recombinant strain was 4.4% of FAME, the DPA n-3 content was 2.3%, the DPA n-6 content was 4.9%, and the DHA content was 24.0%.

The Schizochytrium ATCC 20888 mutant lacking a functional orfA was also transformed with pLR95 containing PFA1, such that PFA1 was randomly integrated in the mutant and restored PUFA production.

Example 9

The native orfB gene of Schizochytrium ATCC 20888 was replaced by homologous recombination following transformation, by electroporation with enzyme pretreatment (see Example 7), with a vector containing the Zeocin™ resistance marker surrounded by sequences from the orfB flanking region. A mutant strain lacking a functional orfB gene was generated. The mutant strain was auxotrophic and required PUFA supplementation for growth.

Schizochytrium ATCC PTA-9695 PFA2 (SEQ ID NO: 3) was cloned into expression vector pDS04 to generate pREZ331. The expression vector contained approximately 2 kb of DNA from the flanking region of the native orfB gene locus of Schizochytrium ATCC 20888.

The Schizochytrium ATCC 20888 mutant lacking a functional orfB was transformed with pREZ331 containing PFA2. Following random integration in the mutant, Schizochytrium ATCC PTA-9695 PFA2 (SEQ ID NO: 3) restored PUFA production.

Cell growth and FAME analysis were performed as described in Example 7.

The native Schizochytrium ATCC 20888 strain containing a functional orfB gene produced DHA and DPA n-6 in a ratio of 2.3:1. The recombinant strain in which the inactivated orfB gene was replaced with Schizochytrium ATCC PTA-9695 PFA2 (SEQ ID NO: 3) produced DHA and DPA n-6 in a ratio of 3.5:1. The EPA content of the recombinant strain was 0.8% of FAME, the DPA n-3 content was 0.1%, the DPA n-6 content was 7.1%, and the DHA content was 25.1%.

The Schizochytrium ATCC 20888 mutant lacking a functional orfB was also transformed with pREZ331 containing PFA2, such that PFA2 was inserted into the native orfB locus and restored PUFA production.

Example 10

Thraustochytrium ATCC PTA-10212 PFA2 (SEQ ID NO: 70) was resynthesized (DNA2.0) and codon-optimized for expression in Schizochytrium (SEQ ID NO: 121), and was then cloned into an expression vector to generate pLR85. Codon optimization was performed using the Schizochytrium codon usage table (FIG. 22). The expression vector contained approximately 2 kb of DNA from the flanking region of the native orfB gene locus of Schizochytrium ATCC 20888.

Replacement of orf genes was also studied in a daughter strain of Schizochytrium ATCC 20888 having improved DHA productivity. The native orfB gene of the daughter strain was replaced by homologous recombination following transformation, by electroporation with enzyme pretreatment (see Example 7), with a vector containing the Zeocin™ resistance marker surrounded by sequences from the orfB flanking region. A mutant strain lacking a functional orfB gene was generated. The mutant strain was auxotrophic and required PUFA supplementation for growth. The mutant strain was transformed with pLR85 containing the codon-optimized Thraustochytrium ATCC PTA-10212 PFA2 (SEQ ID NO: 121) by electroporation with enzyme pretreatment (see Example 8). Based on the homologous regions flanking the Zeocin™ resistance marker in the mutant and flanking the PFA2 gene in pLR85, double-crossover recombination occurred such that the codon-optimized Thraustochytrium ATCC PTA-10212 PFA2 was inserted into the native orfB locus. Recombination with the codon-optimized Thraustochytrium sp. ATCC PTA-10212 PFA2 (SEQ ID NO: 121) restored PUFA production in the daughter strain mutant lacking orfB. Cell growth and FAME analysis were performed as described in Example 7. The EPA content of the recombinant strain was 1.0% of FAME, the DPA n-3 content was 0.3%, the DPA n-6 content was 7.0%, and the DHA content was 31.0%.

In an experiment performed, the Schizochytrium ATCC 20888 mutant lacking a functional orfB from Example 9 was transformed with pLR85 containing the codon-optimized Thraustochytrium sp. ATCC PTA-10212 PFA2 (SEQ ID NO: 121) by electroporation with enzyme pretreatment (see Example 8). Based on the homologous regions flanking the Zeocin™ resistance marker in the mutant and flanking the PFA2 gene in pLR85, double-crossover recombination occurred such that the codon-optimized Thraustochytrium sp. ATCC PTA-10212 PFA2 was inserted into the native orfB locus. Recombination with the codon-optimized Thraustochytrium sp. ATCC PTA-10212 PFA2 (SEQ ID NO: 121) restored PUFA production in the Schizochytrium ATCC 20888 mutant lacking orfB.
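
The double-crossover replacement described above can be pictured as swapping whatever lies between two shared flanking regions. The Python sketch below models this on toy strings. The flank sequences and the `<ZeoR>` and `<PFA2>` tokens are placeholders rather than real sequences, and the function is a conceptual illustration, not a model of recombination frequency or mechanism.

```python
def double_crossover(genome, donor, left_flank, right_flank):
    """Replace the segment between the flanks in genome with the donor's segment."""

    def between(seq):
        # Locate the region bounded by the left and right flanks.
        start = seq.index(left_flank) + len(left_flank)
        end = seq.index(right_flank, start)
        return start, end

    g_start, g_end = between(genome)
    d_start, d_end = between(donor)
    return genome[:g_start] + donor[d_start:d_end] + genome[g_end:]


LEFT, RIGHT = "AAAACCCC", "GGGGTTTT"  # placeholder flanking regions
mutant_locus = "TT" + LEFT + "<ZeoR>" + RIGHT + "CC"  # marker at the orfB locus
vector = LEFT + "<PFA2>" + RIGHT                       # donor cassette

print(double_crossover(mutant_locus, vector, LEFT, RIGHT))
# TTAAAACCCC<PFA2>GGGGTTTTCC
```

In this picture the resistance marker is lost when the PFA gene is inserted, so the targeted locus ends up carrying the introduced gene between the original flanks.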

The Schizochytrium ATCC 20888 and daughter strain mutants lacking a functional orfB were also transformed with pLR85 containing PFA2, such that PFA2 was randomly integrated in the mutants and restored PUFA production in each mutant.

Example 11

A plasmid containing a paromomycin resistance marker cassette functional in Schizochytrium was developed for Schizochytrium ATCC 20888 by replacing the bleomycin/Zeocin™ resistance gene (ble) coding region in pMON50000/pTUBZEO11-2 (U.S. Patent No. 7,001,772 B2) with the coding region of neomycin phosphotransferase II (npt), originally from bacterial transposon Tn5. In pMON50000, the ble resistance gene is driven by the Schizochytrium α-tubulin promoter and is followed by the SV40 transcription terminator. The ble region in pMON50000 contains an NcoI restriction site at the ATG start codon and a PmlI restriction site immediately following the TGA stop signal. PCR was used to amplify the npt coding region present in pCaMVnpt (Shimizu et al., Plant J. 26(4):375 (2001)) such that the product contained a BspHI restriction site (underlined below, primer CAX055) at the start ATG (bold) and a PmlI restriction site (underlined below, primer CAX056) immediately following the stop signal (bold, reverse complement):

CAX055 (forward): GTTGAACAAGATGGATTGCAC (SEQ ID NO: 66)

CAX056 (reverse): CCACGTGGAAGAACTCGTCAAGAA (SEQ ID NO: 67).

PCR was carried out with the TaqMaster polymerase kit (5 Prime), the product was cloned into pCR4-TOPO (Invitrogen), and the resulting plasmid was transformed into E. coli TOP10 (Invitrogen). DNA sequence analysis using vector primers identified multiple clones containing the desired 805 bp structure (i.e., the sequences matched those of the source template plus the engineered restriction sites). The modified npt coding region was isolated by digestion with BspHI and PmlI restriction enzymes, and the purified DNA fragment was ligated with a pMON50000 vector fragment generated by digestion with NcoI and PmlI enzymes. Restriction enzymes BspHI and NcoI leave compatible overlapping ends, and PmlI leaves blunt ends. The resulting plasmid, pTS-NPT, contains the npt neomycin/paromomycin resistance gene in the same context as the ble gene in the original pMON50000.
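
The statement that BspHI and NcoI leave compatible ends while PmlI leaves blunt ends can be checked from the enzymes' recognition sites and cut positions (BspHI T^CATGA, NcoI C^CATGG, PmlI CAC^GTG). The Python sketch below derives the overhang for each. The data structure and function names are illustrative, and the compatibility test is a simplification that compares overhang sequences only, which is sufficient here because both cohesive ends are 5' overhangs with the same self-complementary sequence.

```python
# Recognition site (5'->3', top strand) and the index after which the top strand
# is cut; for example, BspHI cuts T^CATGA, so top_cut is 1.
ENZYMES = {
    "BspHI": ("TCATGA", 1),
    "NcoI": ("CCATGG", 1),
    "PmlI": ("CACGTG", 3),
}


def overhang(enzyme):
    """Return the single-stranded overhang left at a palindromic site ('' if blunt)."""
    site, top_cut = ENZYMES[enzyme]
    bottom_cut = len(site) - top_cut  # palindromic site: bottom cut mirrors top cut
    lo, hi = sorted((top_cut, bottom_cut))
    return site[lo:hi]


def compatible(a, b):
    """Simplified check: ends ligate if both are blunt or share the same overhang."""
    return overhang(a) == overhang(b)


print(overhang("BspHI"), overhang("NcoI"), repr(overhang("PmlI")))  # CATG CATG ''
print(compatible("BspHI", "NcoI"))  # True
```

Both cohesive ends carry a CATG overhang, which is why the BspHI/PmlI fragment can be ligated directionally into the NcoI/PmlI-digested vector.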

Particle bombardment of Schizochytrium (U.S. Patent No. 7,001,772 B2) was used to evaluate the function of the novel paromomycin resistance cassette in pTS-NPT. Selection for paromomycin (APR) resistance was carried out on agar plates containing 50 μg/mL paromomycin sulfate (Sigma). Paromomycin-resistant Schizochytrium transformants were found at frequencies similar to those for Zeocin™ resistance from pMON50000. The "α-tubulin promoter/npt/SV40 terminator" cassette can be released from pTS-NPT with various restriction enzymes for subsequent use in other development efforts.

Example 12

The native orfC gene of Schizochytrium ATCC 20888 was replaced by homologous recombination following transformation with a vector containing the Zeocin™ resistance marker surrounded by sequences from the orfC flanking region. A mutant strain lacking a functional orfC gene was generated. The mutant strain was auxotrophic and required PUFA supplementation for growth.

Schizochytrium ATCC PTA-9695 PFA3 (SEQ ID NO: 5) was cloned into expression vector pREZ22 to generate pREZ324. The expression vector contained approximately 2 kb of DNA from the flanking region of the native orfC gene locus of Schizochytrium ATCC 20888.

The Schizochytrium ATCC 20888 mutant lacking a functional orfC was transformed with pREZ324 containing Schizochytrium ATCC PTA-9695 PFA3. Based on the homologous regions flanking the paromomycin resistance marker in the mutant and flanking the Schizochytrium ATCC PTA-9695 PFA3 gene in pREZ324, double-crossover recombination occurred such that Schizochytrium ATCC PTA-9695 PFA3 was inserted into the native orfC locus. Homologous recombination with Schizochytrium ATCC PTA-9695 PFA3 (SEQ ID NO: 5) restored PUFA production in the Schizochytrium ATCC 20888 mutant lacking orfC. Cell growth and FAME analysis were performed as described in Example 7. The native Schizochytrium ATCC 20888 strain containing a functional orfC gene produced DHA and DPA n-6 in a ratio of 2.3:1. The recombinant strain in which the inactivated orfC gene was replaced with Schizochytrium ATCC PTA-9695 PFA3 (SEQ ID NO: 5) produced DHA and DPA n-6 in a ratio of 14.9:1, further demonstrating that the PUFA profile of Schizochytrium can be altered by the nucleic acid molecules described herein. The EPA content of the recombinant strain was 1.2% of FAME, the DPA n-3 content was 0.2%, the DPA n-6 content was 2.9%, and the DHA content was 43.4%.

The Schizochytrium ATCC 20888 mutant lacking a functional orfC was also transformed with pREZ324 containing PFA3, such that PFA3 was randomly integrated in the mutant and restored PUFA production. The EPA content of the recombinant strain was 1.2% of FAME, the DPA n-3 content was 0.2%, the DPA n-6 content was 2.5%, and the DHA content was 39.1%.

The native orfC gene of the daughter strain discussed in Example 10 was replaced by homologous recombination following transformation with a vector containing the paromomycin resistance marker surrounded by sequences from the orfC flanking region. A mutant strain lacking a functional orfC gene was generated. The mutant strain was auxotrophic and required PUFA supplementation for growth. The mutant lacking a functional orfC was transformed with pREZ324. Double-crossover recombination occurred such that Schizochytrium ATCC PTA-9695 PFA3 was inserted into the native orfC locus of the mutant strain. Homologous recombination with Schizochytrium ATCC PTA-9695 PFA3 (SEQ ID NO: 5) restored PUFA production in the daughter strain mutant lacking orfC. Cell growth and FAME analysis were performed as described in Example 7. The EPA content of the recombinant strain was 1.2% of FAME, the DPA n-3 content was 0.3%, the DPA n-6 content was 2.8%, and the DHA content was 43.1%.

The daughter strain mutant lacking a functional orfB was also transformed with pREZ324 containing PFA3, such that PFA3 was randomly integrated in the mutant and restored PUFA production.

Example 13

Thraustochytrium ATCC PTA-10212 PFA3 (SEQ ID NO: 72) was resynthesized (DNA2.0) and codon-optimized for expression in Schizochytrium (SEQ ID NO: 122), and was then cloned into expression vector pREZ22 to generate pREZ337. Codon optimization was performed using the Schizochytrium codon usage table (FIG. 22). The expression vector contained approximately 2 kb of DNA from the flanking region of the native orfC gene locus of Schizochytrium ATCC 20888.

The daughter strain mutant lacking a functional orfC from Example 12 was transformed with pREZ337 containing the codon-optimized Thraustochytrium ATCC PTA-10212 PFA3 (SEQ ID NO: 122) by electroporation with enzyme pretreatment (see Example 8). Based on the homologous regions flanking the Zeocin™ resistance marker in the mutant and flanking the PFA3 gene in pREZ337, double-crossover recombination occurred such that the codon-optimized Thraustochytrium ATCC PTA-10212 PFA3 (SEQ ID NO: 122) was inserted into the native orfC locus. Recombination with the codon-optimized Thraustochytrium ATCC PTA-10212 PFA3 (SEQ ID NO: 122) restored PUFA production in the daughter strain mutant lacking orfC. Cell growth and FAME analysis were performed as described in Example 7. The EPA content of the recombinant strain was 1.3% of FAME, the DPA n-3 content was 0.4%, the DPA n-6 content was 2.7%, and the DHA content was 50.2%.

In an experiment performed, the Schizochytrium ATCC 20888 mutant lacking a functional orfC from Example 12 was transformed with pREZ337 containing the codon-optimized Thraustochytrium ATCC PTA-10212 PFA3 (SEQ ID NO: 122) by electroporation with enzyme pretreatment (see Example 8). Based on the homologous regions flanking the Zeocin™ resistance marker in the mutant and flanking the PFA3 gene in pREZ337, double-crossover recombination occurred such that the codon-optimized Thraustochytrium ATCC PTA-10212 PFA3 (SEQ ID NO: 122) was inserted into the native orfC locus. Recombination with the codon-optimized Thraustochytrium ATCC PTA-10212 PFA3 (SEQ ID NO: 122) restored PUFA production in the Schizochytrium ATCC 20888 mutant lacking orfC.

The Schizochytrium ATCC 20888 and daughter strain mutants lacking a functional orfC were also transformed with pREZ337 containing PFA3, such that PFA3 was randomly integrated in the mutants and restored PUFA production in each mutant.

Example 14

Any two or all three of the orfA, orfB, and orfC genes in Schizochytrium ATCC 20888 are replaced by homologous recombination following transformation with vectors containing either the Zeocin™ or the paromomycin resistance marker surrounded by sequences from the appropriate orf flanking regions. Mutant strains lacking functional genes for any two or all three of orfA, orfB, and orfC are generated. The mutant strains are auxotrophic and require PUFA supplementation for growth.

The Schizochytrium ATCC 20888 mutants lacking functional orf genes are transformed with one or more expression vectors containing the corresponding PFA genes (one or more of SEQ ID NOs: 1, 3, 5, 120, 121, or 122). Based on the homologous regions flanking the Zeocin™ or paromomycin resistance markers in the mutants and flanking the PFA genes in the respective vectors, double-crossover recombination can occur such that the PFA genes are inserted into the native orf loci. Random integration of these expression vectors can also occur, with transformants selected solely on the basis of restored PUFA production. Homologous recombination with the PFA genes restores PUFA production in the mutants, such that the native PUFA profile is restored or altered depending on the combination of PFA genes inserted into the mutants.
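
The design space described above (any two or all three orf genes replaced, each by one of the listed PFA sequences) can be enumerated directly. The Python sketch below does so under a simplifying assumption that is not stated in the text: exactly one replacement gene per inactivated locus, chosen from the native-sequence or codon-optimized gene corresponding to that locus. The names used are illustrative.

```python
from itertools import combinations, product

ORFS = ("orfA", "orfB", "orfC")

# SEQ ID NOs named in the text, grouped by the locus they correspond to:
# Schizochytrium ATCC PTA-9695 gene, then codon-optimized Thraustochytrium gene.
PFA_CHOICES = {
    "orfA": (1, 120),
    "orfB": (3, 121),
    "orfC": (5, 122),
}


def knockout_sets():
    """All ways to inactivate any two or all three of the orf genes."""
    return [combo for k in (2, 3) for combo in combinations(ORFS, k)]


def replacement_designs(loci):
    """All assignments of one candidate SEQ ID NO to each inactivated locus."""
    options = (PFA_CHOICES[locus] for locus in loci)
    return [dict(zip(loci, picks)) for picks in product(*options)]


print(len(knockout_sets()))            # 4
print(len(replacement_designs(ORFS)))  # 8
```

Because the text allows one or more vectors per mutant and both targeted and random integration, the real number of possible strains is larger than this count.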

In an experiment performed, the Schizochytrium ATCC 20888 strain from Example 12 lacking a functional orfC gene and containing randomly integrated Schizochytrium ATCC PTA-9695 PFA3 (SEQ ID NO: 5) was used for replacement of orfA and orfB. The native orfA and orfB genes of the strain were replaced by homologous recombination following transformation with a vector containing the Zeocin™ resistance marker surrounded by sequences from the orfA and orfB flanking regions. A strain was generated that lacked functional orfA, orfB, and orfC and contained randomly integrated Schizochytrium ATCC PTA-9695 PFA3. The strain was transformed with pREZ345 containing codon-optimized Schizochytrium ATCC PTA-9695 PFA1 (SEQ ID NO: 1) and pREZ331 containing codon-optimized Schizochytrium ATCC PTA-9695 PFA2 (SEQ ID NO: 3), such that PFA1 and PFA2 were randomly integrated. The resulting recombinant strain lacked functional orfA, orfB, and orfC and contained randomly integrated Schizochytrium ATCC PTA-9695 PFA1, PFA2, and PFA3. Cell growth and FAME analysis were performed as described in Example 7. The EPA content of the recombinant strain was 6.6% of FAME, the DPA n-3 content was 0.8%, the DPA n-6 content was 1.6%, and the DHA content was 20.9%.

In another experiment performed, the daughter strain from Example 12 lacking a functional orfC gene and containing Schizochytrium ATCC PTA-9695 PFA3 (SEQ ID NO: 5) inserted into the native orfC locus was used for replacement of the orfA and orfB genes. The native orfA and orfB genes of the strain were replaced by homologous recombination following transformation with a vector containing the paromomycin resistance marker surrounded by sequences from the orfA and orfB flanking regions. A strain was generated that lacked functional orfA, orfB, and orfC and contained Schizochytrium ATCC PTA-9695 PFA3 inserted into the native orfC locus. The strain was transformed with pREZ345 containing codon-optimized Schizochytrium ATCC PTA-9695 PFA1 (SEQ ID NO: 1) and pREZ331 containing codon-optimized Schizochytrium ATCC PTA-9695 PFA2 (SEQ ID NO: 3). Double-crossover recombination occurred such that Schizochytrium ATCC PTA-9695 PFA1 was inserted into the native orfA locus of the strain and Schizochytrium ATCC PTA-9695 PFA2 was inserted into the native orfB locus of the strain. The resulting recombinant strain lacked functional orfA, orfB, and orfC and contained Schizochytrium ATCC PTA-9695 PFA1, PFA2, and PFA3 inserted into the respective orfA, orfB, and orfC loci. Cell growth and FAME analysis were performed as described in Example 7. The EPA content of the recombinant strain was 7.3% of FAME, the DPA n-3 content was 0.4%, the DPA n-6 content was 1.5%, and the DHA content was 23.9%.

In another experiment, a daughter strain from Example 12 lacking a functional orfC gene and containing randomly integrated Schizochytrium ATCC PTA-9695 PFA3 (SEQ ID NO: 5) was used for replacement of the orfA and orfB genes. The native orfA and orfB genes of the strain were replaced by homologous recombination following transformation with a vector containing the Zeocin™ resistance marker surrounded by sequences from the orfA and orfB flanking regions. The resulting strain lacked functional orfA, orfB, and orfC and contained randomly integrated Schizochytrium ATCC PTA-9695 PFA3. This strain was transformed with pREZ345, containing codon-optimized Schizochytrium ATCC PTA-9695 PFA1 (SEQ ID NO: 1), and pREZ331, containing codon-optimized Schizochytrium ATCC PTA-9695 PFA2 (SEQ ID NO: 3), to randomly integrate PFA1 and PFA2. The resulting recombinant strain lacked functional orfA, orfB, and orfC and contained randomly integrated Schizochytrium ATCC PTA-9695 PFA1, PFA2, and PFA3. Cell culture and FAME analysis were performed as in Example 7. The recombinant strain had an EPA content of 6.2% FAME, a DPA n-3 content of 1.3%, a DPA n-6 content of 0.9%, and a DHA content of 16.6%.

In another experiment, a daughter strain from Example 13 lacking a functional orfC gene and containing Thraustochytrium ATCC PTA-10212 PFA3 (SEQ ID NO: 122) inserted into the native orfC locus was used for replacement of the orfA and orfB genes. The native orfA and orfB genes of the strain were replaced by homologous recombination following transformation with a vector containing a paromomycin resistance marker surrounded by sequences from the orfA and orfB flanking regions. The resulting strain lacked functional orfA, orfB, and orfC and contained Thraustochytrium ATCC PTA-10212 PFA3 inserted into the native orfC locus. This strain was transformed with pLR95, containing codon-optimized Thraustochytrium ATCC PTA-10212 PFA1 (SEQ ID NO: 120), and pLR85, containing codon-optimized Thraustochytrium ATCC PTA-10212 PFA2 (SEQ ID NO: 121). Double crossover recombination occurred, inserting Thraustochytrium ATCC PTA-10212 PFA1 into the native orfA locus of the strain and Thraustochytrium ATCC PTA-10212 PFA2 into the native orfB locus. The resulting recombinant strain lacked functional orfA, orfB, and orfC and contained Thraustochytrium ATCC PTA-10212 PFA1, PFA2, and PFA3 inserted into the orfA, orfB, and orfC loci, respectively. Cell culture and FAME analysis were performed as in Example 7. The recombinant strain had an EPA content of 5.2% FAME, a DPA n-3 content of 0.6%, a DPA n-6 content of 2.1%, and a DHA content of 47.1%.

In another experiment, a daughter strain from Example 13 lacking a functional orfC gene and containing randomly integrated Thraustochytrium ATCC PTA-10212 PFA3 (SEQ ID NO: 122) was used for replacement of the orfA and orfB genes. The native orfA and orfB genes of the strain were replaced by homologous recombination following transformation with a vector containing the Zeocin™ resistance marker surrounded by sequences from the orfA and orfB flanking regions. The resulting strain lacked functional orfA, orfB, and orfC and contained randomly integrated Thraustochytrium ATCC PTA-10212 PFA3. This strain was transformed with pLR95, containing codon-optimized Thraustochytrium ATCC PTA-10212 PFA1 (SEQ ID NO: 120), and pLR85, containing codon-optimized Thraustochytrium ATCC PTA-10212 PFA2 (SEQ ID NO: 121), to randomly integrate PFA1 and PFA2. The resulting recombinant strain lacked functional orfA, orfB, and orfC and contained randomly integrated Thraustochytrium ATCC PTA-10212 PFA1, PFA2, and PFA3. Cell culture and FAME analysis were performed as in Example 7. The recombinant strain had an EPA content of 1.8% FAME, a DPA n-3 content of 1.8%, a DPA n-6 content of 2.3%, and a DHA content of 34.1%.
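
The five FAME profiles reported above are easier to compare side by side. The following Python sketch tabulates them and derives two summary figures, the sum of the four listed PUFAs and the DHA:EPA ratio. The strain labels and the two summary figures are illustrative conveniences introduced here, not quantities defined in the specification; only the percentages come from the text.

```python
# FAME profiles (% of total FAME) as reported for the five recombinant strains.
# The dictionary keys are descriptive labels chosen for this sketch only.
profiles = {
    "PTA-9695 genes, random integration (1)": {"EPA": 6.6, "DPA n-3": 0.8, "DPA n-6": 1.6, "DHA": 20.9},
    "PTA-9695 genes, targeted loci":          {"EPA": 7.3, "DPA n-3": 0.4, "DPA n-6": 1.5, "DHA": 23.9},
    "PTA-9695 genes, random integration (2)": {"EPA": 6.2, "DPA n-3": 1.3, "DPA n-6": 0.9, "DHA": 16.6},
    "PTA-10212 genes, targeted loci":         {"EPA": 5.2, "DPA n-3": 0.6, "DPA n-6": 2.1, "DHA": 47.1},
    "PTA-10212 genes, random integration":    {"EPA": 1.8, "DPA n-3": 1.8, "DPA n-6": 2.3, "DHA": 34.1},
}


def summarize(profile):
    """Return (sum of listed PUFAs in % FAME, DHA:EPA ratio), rounded to 0.1."""
    total = round(sum(profile.values()), 1)
    ratio = round(profile["DHA"] / profile["EPA"], 1)
    return total, ratio


summary = {label: summarize(p) for label, p in profiles.items()}

for label, (total, ratio) in summary.items():
    print(f"{label}: listed PUFAs = {total}% FAME, DHA:EPA = {ratio}")
```

In this tabulation the strains in which the genes were inserted at the native loci show a higher DHA content than the corresponding random-integration strains; the sketch only restates the reported numbers and does not imply statistical significance.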

Example 15

The orfA, orfB, and orfC genes from Schizochytrium ATCC 20888 were cloned into a series of Duet vectors (Novagen). The Duet expression vectors are a set of compatible plasmids into which multiple target genes can be cloned and co-expressed in E. coli from the inducible T7 promoter. Duet plasmid pREZ91 contains Schizochytrium ATCC 20888 orfA in pETDuet-1; Duet plasmid pREZ96 contains Schizochytrium ATCC 20888 orfB in pCDFDuet-1; and Duet plasmid pREZ101 contains Schizochytrium ATCC 20888 orfC in pCOLADuet-1. The Duet plasmids pREZ91, pREZ96, and pREZ101, together with plasmid pJK737, which contains the required accessory gene HetI (described in U.S. Patent No. 7,217,856, incorporated herein by reference in its entirety), were transformed into E. coli strain BLR(DE3), which contains an inducible T7 RNA polymerase gene. DHA and DPA n-6 were produced after cell growth and addition of IPTG according to the manufacturer's instructions (Novagen). Briefly, 1 mM IPTG was added for induction when the cells reached an optical density of approximately 0.5 at 600 nm. The cells were grown for 12 hours at 30°C in Luria broth and then harvested. The fatty acids were converted to methyl esters using standard techniques. Fatty acid profiles were determined as fatty acid methyl esters (FAME) using gas chromatography with flame ionization detection (GC-FID).
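
The "% FAME" values reported in the examples express each fatty acid as a share of total FAME. The text does not state how the GC-FID signal was converted to percentages; a minimal sketch, assuming simple peak-area normalization without response-factor correction, is shown below. The function name and the peak areas are hypothetical and serve only to illustrate the arithmetic.

```python
def fame_percent(peak_areas):
    """Convert GC-FID peak areas to percent of total FAME by area normalization.

    Assumption: every FAME peak contributes in proportion to its integrated area
    (no internal standard or response-factor correction is applied).
    """
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


# Hypothetical integrated peak areas (arbitrary units), not data from the patent.
areas = {"C16:0": 5200.0, "C18:1": 4300.0, "DHA": 300.0, "DPA n-6": 200.0}

percentages = fame_percent(areas)
for name, pct in percentages.items():
    print(f"{name}: {pct:.1f}% FAME")
```

A laboratory workflow would typically take the areas from the chromatography software and may apply calibration factors; this sketch leaves those out deliberately.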

The Schizochytrium ATCC PTA-9695 PFA1 (SEQ ID NO: 1) gene was cloned into the expression vector pETDuet-1 to generate pREZ346. The Duet plasmids pREZ346 (containing Schizochytrium ATCC PTA-9695 PFA1), pREZ96 (containing orfB), and pREZ101 (containing orfC) were transformed together with pJK737 (containing HetI) into E. coli strain BLR(DE3). The Schizochytrium ATCC PTA-9695 PFA1 gene was co-expressed with the Schizochytrium ATCC 20888 orfB and orfC genes. Expression of Schizochytrium ATCC PTA-9695 PFA1 together with Schizochytrium ATCC 20888 orfB and orfC under inducing conditions supported DHA production in E. coli. The transformed E. coli had a DHA content of 2.8% FAME, a DPA n-6 content of 1.1%, a DPA n-3 content of 0.6%, and an EPA content of 3.7%.

Example 16

The codon-optimized Thraustochytrium ATCC PTA-10212 PFA1 (SEQ ID NO: 120) gene was cloned into the expression vector pETDuet-1 to generate pLR100. The Duet plasmids pLR100 (containing codon-optimized Thraustochytrium ATCC PTA-10212 PFA1), pREZ96 (containing Schizochytrium ATCC 20888 orfB), and pREZ101 (containing Schizochytrium ATCC 20888 orfC) were transformed together with pJK737 (containing HetI) into E. coli strain BLR(DE3). See Example 15. Thraustochytrium ATCC PTA-10212 PFA1 was co-expressed with the Schizochytrium ATCC 20888 orfB and orfC genes. Expression of Thraustochytrium ATCC PTA-10212 PFA1 together with Schizochytrium ATCC 20888 orfB and orfC under inducing conditions supported production of DHA and EPA in E. coli.

Example 17

The Schizochytrium ATCC PTA-9695 PFA3 (SEQ ID NO: 5) gene was cloned into the expression vector pCOLADuet-1 to generate pREZ326. The Duet plasmids pREZ326 (containing Schizochytrium ATCC PTA-9695 PFA3), pREZ91 (containing Schizochytrium ATCC 20888 orfA), and pREZ96 (containing Schizochytrium ATCC 20888 orfB) were transformed together with pJK737 (containing HetI) into E. coli strain BLR(DE3). See Example 15. Expression of Schizochytrium ATCC PTA-9695 PFA3 together with Schizochytrium ATCC 20888 orfA and orfB under inducing conditions supported DHA production in E. coli. Cell culture and FAME analysis were performed as in Example 15. The transformed E. coli had a DHA content of 0.3% FAME.

Example 18

The codon-optimized Thraustochytrium ATCC PTA-10212 PFA3 (SEQ ID NO: 122) gene was cloned into the expression vector pCOLADuet-1 to generate pREZ348. The Duet plasmids pREZ348 (containing codon-optimized Thraustochytrium ATCC PTA-10212 PFA3), pREZ91 (containing Schizochytrium ATCC 20888 orfA), and pREZ96 (containing Schizochytrium ATCC 20888 orfB) were transformed together with pJK737 (containing HetI) into E. coli strain BLR(DE3). See Example 15. Expression of Thraustochytrium ATCC PTA-10212 PFA3 together with Schizochytrium ATCC 20888 orfA and orfB under inducing conditions supported DHA production in E. coli. Cell culture and FAME analysis were performed as in Example 15. The transformed E. coli had a DHA content of 2.9% FAME and a DPA n-6 content of 0.4%.

Example 19

The Schizochytrium ATCC PTA-9695 PFA2 (SEQ ID NO: 3) gene was cloned into the expression vector pCDFDuet-1 to generate pREZ330. The Duet plasmids pREZ330 (containing Schizochytrium ATCC PTA-9695 PFA2), pREZ326 (containing Schizochytrium ATCC PTA-9695 PFA3), and pREZ91 (containing Schizochytrium ATCC 20888 orfA) were transformed together with pJK737 (containing HetI) into E. coli strain BLR(DE3). See Example 9. Expression of Schizochytrium ATCC PTA-9695 PFA2 and PFA3 together with Schizochytrium ATCC 20888 orfA under inducing conditions supported DHA production in E. coli. Cell culture and FAME analysis were performed as in Example 15. The transformed E. coli had a DHA content of 0.8% FAME and a DPA n-6 content of 0.2%.

Example 20

The codon-optimized Thraustochytrium ATCC PTA-10212 PFA2 (SEQ ID NO: 121) gene was cloned into the expression vector pCDFDuet-1 to generate pLR87. The Duet plasmids pLR87 (containing codon-optimized Thraustochytrium ATCC PTA-10212 PFA2), pREZ348 (containing codon-optimized Thraustochytrium ATCC PTA-10212 PFA3), and pREZ91 (containing Schizochytrium ATCC 20888 orfA) were transformed together with pJK737 (containing HetI) into E. coli strain BLR(DE3). See Example 15. Expression of Thraustochytrium ATCC PTA-10212 PFA2 and PFA3 together with Schizochytrium ATCC 20888 orfA under inducing conditions supported production of DHA and low levels of EPA in E. coli. Cell culture and FAME analysis were performed as in Example 15. The transformed E. coli had a DHA content of 4.4% FAME, a DPA n-6 content of 1.1%, and an EPA content of 0.1%.

Example 21

The Duet plasmids pREZ346 (containing Schizochytrium ATCC PTA-9695 PFA1), pREZ330 (containing Schizochytrium ATCC PTA-9695 PFA2), and pREZ326 (containing Schizochytrium ATCC PTA-9695 PFA3) were transformed together with pJK737 (containing HetI) into E. coli strain BLR(DE3). See Example 15. Expression of Schizochytrium ATCC PTA-9695 PFA1, PFA2, and PFA3 under inducing conditions supported DHA production in E. coli. Cell culture and FAME analysis were performed as in Example 15. The transformed E. coli had a DHA content of 0.3% FAME and an EPA content of 0.3%.

Example 22

The Duet plasmids pLR100 (containing codon-optimized Thraustochytrium ATCC PTA-10212 PFA1), pLR87 (containing codon-optimized Thraustochytrium ATCC PTA-10212 PFA2), and pREZ348 (containing codon-optimized Thraustochytrium ATCC PTA-10212 PFA3) were transformed together with pJK737 (containing HetI) into E. coli strain BLR(DE3). See Example 15. Expression of Thraustochytrium ATCC PTA-10212 PFA1, PFA2, and PFA3 under inducing conditions supported production of DHA and EPA in E. coli.

Example 23

The Duet plasmids pREZ330 (containing Schizochytrium ATCC PTA-9695 PFA2), pREZ91 (containing Schizochytrium ATCC 20888 orfA), and pREZ101 (containing Schizochytrium ATCC 20888 orfC) were transformed together with pJK737 (containing HetI) into E. coli strain BLR(DE3). See Example 15. Expression of Schizochytrium ATCC PTA-9695 PFA2 together with Schizochytrium ATCC 20888 orfA and orfC under inducing conditions supported DHA production in E. coli. Cell culture and FAME analysis were performed as in Example 15. The transformed E. coli had a DHA content of 0.6% FAME and a DPA n-6 content of 0.3%.

Example 24

The Duet plasmids pLR187 (containing codon-optimized Thraustochytrium ATCC PTA-10212 PFA2), pREZ91 (containing Schizochytrium ATCC 20888 orfA), and pREZ101 (containing Schizochytrium ATCC 20888 orfC) were transformed together with pJK737 (containing HetI) into E. coli strain BLR(DE3). See Example 15. Expression of codon-optimized Thraustochytrium ATCC PTA-10212 PFA2 together with Schizochytrium ATCC 20888 orfA and orfC under inducing conditions supported production of DHA and low levels of EPA in E. coli. Cell culture and FAME analysis were performed as in Example 15. The transformed E. coli had a DHA content of 1.7% FAME, a DPA n-6 content of 0.9%, and an EPA content of 0.1%.

Example 25

The Duet plasmids pREZ346 (containing Schizochytrium ATCC PTA-9695 PFA1), pREZ330 (containing Schizochytrium ATCC PTA-9695 PFA2), and pREZ101 (containing Schizochytrium ATCC 20888 orfC) were transformed together with pJK737 (containing HetI) into E. coli strain BLR(DE3). See Example 15. Expression of Schizochytrium ATCC PTA-9695 PFA1 and PFA2 together with Schizochytrium ATCC 20888 orfC under inducing conditions supported DHA production in E. coli. Cell culture and FAME analysis were performed as in Example 15. The transformed E. coli had a DHA content of 0.3% FAME, a DPA n-6 content of 0.1%, and an EPA content of 0.5%.

Example 26

The Duet plasmids pLR100 (containing codon-optimized Thraustochytrium ATCC PTA-10212 PFA1), pLR87 (containing codon-optimized Thraustochytrium ATCC PTA-10212 PFA2), and pREZ101 (containing Schizochytrium ATCC 20888 orfC) were transformed together with pJK737 (containing HetI) into E. coli strain BLR(DE3). See Example 15. Expression of codon-optimized Thraustochytrium ATCC PTA-10212 PFA1 and PFA2 together with Schizochytrium ATCC 20888 orfC under inducing conditions supported production of DHA and EPA in E. coli.

Example 27

The Duet plasmids pREZ346 (containing Schizochytrium ATCC PTA-9695 PFA1), pREZ96 (containing Schizochytrium ATCC 20888 orfB), and pREZ326 (containing Schizochytrium ATCC PTA-9695 PFA3) were transformed together with pJK737 (containing HetI) into E. coli strain BLR(DE3). See Example 15. Expression of Schizochytrium ATCC PTA-9695 PFA1 and PFA3 together with Schizochytrium ATCC 20888 orfB under inducing conditions supported DHA production in E. coli. Cell culture and FAME analysis were performed as in Example 15. The transformed E. coli had a DHA content of 0.1% FAME and an EPA content of 0.1%.

Example 28

The Duet plasmids pLR100 (containing codon-optimized Thraustochytrium ATCC PTA-10212 PFA1), pREZ96 (containing Schizochytrium ATCC 20888 orfB), and pREZ348 (containing codon-optimized Thraustochytrium ATCC PTA-10212 PFA3) were transformed together with pJK737 (containing HetI) into E. coli strain BLR(DE3). See Example 15. Expression of codon-optimized Thraustochytrium ATCC PTA-10212 PFA1 and PFA3 together with Schizochytrium ATCC 20888 orfB under inducing conditions supported production of DHA and EPA in E. coli.
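
Examples 15 through 28 mix and match subunits from three source organisms, always with one plasmid on each of the three compatible Duet backbones. The Python sketch below records the plasmid-to-backbone assignments stated in Examples 15 through 20 and checks that each combination supplies one subunit of each type on a distinct backbone. The slot labels "A", "B", and "C" (orfA/PFA1, orfB/PFA2, orfC/PFA3) and the function name are illustrative assumptions. pJK737 and the pLR187 plasmid of Example 24 are left out because the text does not state their backbones.

```python
# Plasmid -> (Duet backbone, subunit slot), taken from Examples 15-20.
# Slot "A" = orfA or PFA1, "B" = orfB or PFA2, "C" = orfC or PFA3 (labels are illustrative).
PLASMIDS = {
    "pREZ91":  ("pETDuet-1",   "A"),  # Schizochytrium ATCC 20888 orfA
    "pREZ346": ("pETDuet-1",   "A"),  # Schizochytrium ATCC PTA-9695 PFA1
    "pLR100":  ("pETDuet-1",   "A"),  # Thraustochytrium ATCC PTA-10212 PFA1
    "pREZ96":  ("pCDFDuet-1",  "B"),  # Schizochytrium ATCC 20888 orfB
    "pREZ330": ("pCDFDuet-1",  "B"),  # Schizochytrium ATCC PTA-9695 PFA2
    "pLR87":   ("pCDFDuet-1",  "B"),  # Thraustochytrium ATCC PTA-10212 PFA2
    "pREZ101": ("pCOLADuet-1", "C"),  # Schizochytrium ATCC 20888 orfC
    "pREZ326": ("pCOLADuet-1", "C"),  # Schizochytrium ATCC PTA-9695 PFA3
    "pREZ348": ("pCOLADuet-1", "C"),  # Thraustochytrium ATCC PTA-10212 PFA3
}

# Plasmid combinations co-transformed in each example (pJK737/HetI omitted).
COMBINATIONS = {
    "15a": ("pREZ91", "pREZ96", "pREZ101"),
    "15b": ("pREZ346", "pREZ96", "pREZ101"),
    "16":  ("pLR100", "pREZ96", "pREZ101"),
    "17":  ("pREZ326", "pREZ91", "pREZ96"),
    "18":  ("pREZ348", "pREZ91", "pREZ96"),
    "19":  ("pREZ330", "pREZ326", "pREZ91"),
    "20":  ("pLR87", "pREZ348", "pREZ91"),
    "21":  ("pREZ346", "pREZ330", "pREZ326"),
    "22":  ("pLR100", "pLR87", "pREZ348"),
    "23":  ("pREZ330", "pREZ91", "pREZ101"),
    "25":  ("pREZ346", "pREZ330", "pREZ101"),
    "26":  ("pLR100", "pLR87", "pREZ101"),
    "27":  ("pREZ346", "pREZ96", "pREZ326"),
    "28":  ("pLR100", "pREZ96", "pREZ348"),
}


def is_complete_and_compatible(plasmids):
    """True if the set uses distinct backbones and covers slots A, B, and C once each."""
    backbones = [PLASMIDS[p][0] for p in plasmids]
    slots = sorted(PLASMIDS[p][1] for p in plasmids)
    return len(set(backbones)) == len(backbones) and slots == ["A", "B", "C"]


results = {example: is_complete_and_compatible(combo) for example, combo in COMBINATIONS.items()}
for example, ok in results.items():
    print(f"Example {example}: {'ok' if ok else 'check combination'}")
```

Keeping each subunit type on a fixed backbone is what allows any A/B/C combination to be co-transformed without two plasmids sharing a backbone.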

Example 29

The Pfa1p, Pfa2p, and Pfa3p PUFA synthase activities are knocked out individually in Schizochytrium ATCC PTA-9695 and Thraustochytrium ATCC PTA-10212 by standard procedures. See, e.g., U.S. Patent No. 7,217,856, which is incorporated herein by reference in its entirety.

A Zeocin™, hygromycin, blasticidin, or other suitable resistance marker is inserted into a restriction site of the PFA1 gene (SEQ ID NO: 1 or SEQ ID NO: 68) contained in a plasmid. After insertion of the resistance marker, the plasmid is introduced into Schizochytrium ATCC PTA-9695 or Thraustochytrium ATCC PTA-10212, respectively, by particle bombardment, electroporation, or another suitable transformation method. Homologous recombination occurs, generating mutants in which the native PFA1 gene is replaced or disrupted by the Zeocin™, hygromycin, blasticidin, or other suitable resistance marker. Transformants are selected on plates containing Zeocin™, hygromycin, blasticidin, or another suitable selection agent and supplemented with PUFAs. Colonies are further tested for their ability to grow without PUFA supplementation. Genomic DNA is extracted from colonies that are resistant to the selection agent and unable to grow in the absence of PUFAs. PCR and Southern blot analysis of this DNA shows that the PFA1 gene has been deleted or disrupted.
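
The screening logic in the preceding paragraph reduces to a simple filter: a colony is a knockout candidate when it is resistant to the selection agent, grows with PUFA supplementation, and fails to grow without it. A minimal Python sketch of that decision rule follows. The `Colony` record, its field names, and the example colonies are hypothetical; candidates would still need the PCR and Southern blot confirmation described above.

```python
from dataclasses import dataclass


@dataclass
class Colony:
    """Phenotype record for one transformant (hypothetical structure)."""
    name: str
    resistant: bool            # grows on the selection agent
    grows_with_pufa: bool      # grows on PUFA-supplemented plates
    grows_without_pufa: bool   # grows on plates lacking PUFAs


def is_knockout_candidate(colony):
    """Resistant PUFA auxotrophs are candidates for genomic DNA analysis."""
    return colony.resistant and colony.grows_with_pufa and not colony.grows_without_pufa


# Hypothetical screening results, for illustration only.
colonies = [
    Colony("c1", resistant=True, grows_with_pufa=True, grows_without_pufa=False),
    Colony("c2", resistant=True, grows_with_pufa=True, grows_without_pufa=True),
    Colony("c3", resistant=False, grows_with_pufa=True, grows_without_pufa=True),
]

candidates = [c.name for c in colonies if is_knockout_candidate(c)]
print(candidates)
```

Colony c2 in this illustration would correspond to a transformant in which the marker integrated elsewhere and the target gene remained functional.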

A similar procedure is used to knock out PFA2. The resulting knockout mutants, which require PUFA supplementation, are found to lack full-length PFA2.

A similar procedure is used to knock out PFA3. The resulting knockout mutants, which require PUFA supplementation, are found to lack full-length PFA3.

All of the various aspects, embodiments, and options described herein can be combined in any and all variations.

All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference in their entirety to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.

Claims

1. An isolated nucleic acid molecule having a polynucleotide sequence that is 100% identical to SEQ ID NO: 37 or SEQ ID NO: 5, wherein the nucleic acid molecule encodes a polypeptide having polyunsaturated fatty acid (PUFA) synthase activity.

2. The isolated nucleic acid molecule according to claim 1, wherein the polynucleotide sequence of the nucleic acid molecule is SEQ ID NO: 5.

3. An isolated nucleic acid molecule whose polynucleotide sequence encodes a polypeptide that is 100% identical to SEQ ID NO: 38 or SEQ ID NO: 6, wherein the polypeptide has polyunsaturated fatty acid (PUFA) synthase activity.

4. The isolated nucleic acid molecule according to claim 3, wherein the amino acid sequence of the polypeptide is SEQ ID NO: 6.

5. An isolated nucleic acid molecule whose polynucleotide sequence is completely complementary to the polynucleotide sequence of any one of claims 1-4.

6. A recombinant nucleic acid molecule comprising the nucleic acid molecule or combination thereof as described in any one of claims 1-4 and a transcription control sequence.

7. The recombinant nucleic acid molecule according to claim 6, wherein the recombinant nucleic acid molecule is a recombinant vector.

8. A host cell expressing a nucleic acid molecule as described in any one of claims 1-4, a recombinant nucleic acid molecule as described in claim 6 or 7, or a combination thereof, wherein the host cell is selected from microbial cells and animal cells.

9. The host cell according to claim 8, wherein the host cell is an animal cell.

10. The host cell according to claim 8, wherein the host cell is a microbial cell.

11. The host cell as claimed in claim 10, wherein the microbial cell is a bacterium.

12. The host cell according to claim 10, wherein the microbial cell is a thraustochytrid.

13. The host cell according to claim 12, wherein the thraustochytrid is a Schizochytrium or a Thraustochytrium.

14. A method for producing at least one PUFA, comprising:

expressing a PUFA synthase gene in a host cell under conditions effective to produce PUFA,

wherein the host cell comprises the isolated nucleic acid molecule of any one of claims 1-4, the recombinant nucleic acid molecule of claim 6 or 7, or a combination thereof, and

wherein at least one PUFA is produced.

15. The method of claim 14, wherein the host cell is selected from plant cells, isolated animal cells, and microbial cells.

16. The method of claim 14, wherein the at least one PUFA comprises docosahexaenoic acid (DHA).

17. A method for producing DHA-rich lipids, comprising:

expressing a PUFA synthase gene in the host cell of any one of claims 8-13 under conditions effective to produce lipids,

wherein lipids rich in DHA are produced.

18. A method for producing a recombinant vector, comprising inserting the isolated nucleic acid molecule of any one of claims 1-4 into the vector.

19. A method for producing a recombinant host cell, comprising introducing the recombinant vector of claim 18 into a host cell.

20. The method of claim 19, wherein the host cell is selected from plant cells, isolated animal cells, and microbial cells.

21. An isolated polypeptide, wherein the polypeptide is encoded by the polynucleotide sequence of any one of claims 1-4.

22. An isolated polypeptide having an amino acid sequence that is 100% identical to SEQ ID NO: 38, wherein the polypeptide has polyunsaturated fatty acid (PUFA) synthase activity.

23. An isolated polypeptide having an amino acid sequence that is 100% identical to SEQ ID NO: 6, wherein the polypeptide has polyunsaturated fatty acid (PUFA) synthase activity.

24. The isolated polypeptide according to any one of claims 22-23, wherein the polypeptide is a fusion polypeptide.

25. A composition comprising the polypeptide of any one of claims 22-23 and a biologically acceptable carrier.

26. A method for increasing DHA production in an organism having PUFA synthase activity, comprising:

expressing the isolated nucleic acid molecule of any one of claims 1-4, the recombinant nucleic acid molecule of claim 6 or 7, or a combination thereof in the organism under conditions effective to produce DHA,

wherein the PUFA synthase activity replaces an inactive or deleted activity in the organism, introduces a new activity, or enhances an existing activity, and

wherein DHA production in the organism is increased.

27. A method for isolating lipids from a host cell, comprising:

(a) expressing a PUFA synthase gene in the host cell under conditions effective to produce lipids, wherein the polyketide synthase system in the host cell comprises the isolated nucleic acid molecule of any one of claims 1-4, the recombinant nucleic acid molecule of claim 6 or 7, or a combination thereof, and

(b) isolating lipids from the host cell.

28. The method of claim 27, wherein the host cell is selected from plant cells, isolated animal cells, and microbial cells.

29. The method of claim 27, wherein the lipid comprises DHA.