CN1071791C - New DNA sequences

Info

Publication number: CN1071791C
Application number: CN93108717A
Authority: CN
Inventors: K·G·比约塞尔; P·N·I·卡尔森; C·S·M·恩纳贝克; S·L·汉森; U·F·P·里德堡; J·A·尼尔森; I·B·F·特奈尔
Original assignee: Astra AB
Current assignee: AstraZeneca AB
Priority date: 1992-06-11
Filing date: 1993-06-11
Publication date: 2001-09-26
Anticipated expiration: 2013-06-11
Also published as: HU9403536D0; CA2137815C; IS4029A; HUT71789A; ES2335626T3; UA41322C2; AU670598B2; CZ288677B6; IL105874A0; NO944715L; AU4366793A; NO944715D0; IL105874A; DE69334298D1; EP0651793A1; PH31680A; CN1086262A; AP411A; SI9300319A; PT651793E

Abstract

The present invention relates to a DNA molecule containing intron sequences and encoding a human protein which, depending on its site of action, is called bile salt-stimulated lipase (BSSL) or carboxyl ester lipase (CEL). The DNA molecule can be used to produce recombinant human BSSL/CEL, in particular in transgenic non-human mammals. The recombinant human BSSL/CEL can be used as a constituent of infant formula given to infants in place of human milk, or in the manufacture of medicaments against malabsorption, cystic fibrosis and chronic pancreatitis.

Description

DNA molecule encoding BSSL/CEL

The present invention relates to a DNA molecule containing intron sequences and encoding a human protein which, depending on its site of action, is called bile salt-stimulated lipase (BSSL) or carboxyl ester lipase (CEL). The DNA molecule is useful for producing recombinant human BSSL/CEL, in particular by means of transgenic non-human mammals. The recombinant human BSSL/CEL can be used as a constituent of infant formula given to infants in place of human milk, or in the manufacture of medicaments against, for example, fat malabsorption, cystic fibrosis and chronic pancreatitis.

Hydrolysis of dietary lipids

Dietary lipids are an important source of energy. Energy-rich triacylglycerols make up more than 95% of these lipids. Some lipids, such as certain fatty acids and the fat-soluble vitamins, are essential dietary constituents. Before gastrointestinal absorption, the triacylglycerols as well as the minor components (esterified fat-soluble vitamins and cholesterol, and diacylphosphatidylglycerols) require hydrolysis of their ester bonds to give less hydrophobic, absorbable products. These reactions are catalyzed by a specific group of enzymes called lipases.

In adults, the essential lipases involved are thought to be gastric lipase, pancreatic colipase-dependent lipase (hydrolysis of tri- and diacylglycerols), pancreatic phospholipase A2 (hydrolysis of diacylphosphatidylglycerols) and carboxyl ester lipase (CEL) (hydrolysis of cholesteryl esters and fat-soluble vitamin esters). In the breast-fed newborn, bile salt-stimulated lipase (BSSL) plays an essential part in the hydrolysis of several of the above lipids. Together with bile salts, the products of lipid digestion form mixed micelles from which absorption takes place.

Bile salt-stimulated lipase

The lactating human mammary gland synthesizes and secretes with the milk a bile salt-stimulated lipase (BSSL) (Bläckberg et al., 1987) which, after specific activation by primary bile salts, contributes to the breast-fed infant's endogenous capacity for intestinal fat digestion. This enzyme, which accounts for approximately 1% of total milk protein (Bläckberg & Hernell, 1981), is not degraded during passage with the milk through the stomach, and in the duodenal contents it is protected by bile salts from inactivation by pancreatic proteases such as trypsin and chymotrypsin. It is, however, inactivated when the milk is pasteurized, e.g. heated to 62.5 °C for 30 min (Björkstén et al., 1980).

Model experiments in vitro suggest that the end products of triacylglycerol digestion are different in the presence of BSSL (Bernbäck et al., 1990; Hernell & Bläckberg, 1982). Given the lower intraluminal bile salt concentrations during the neonatal period, this may favor product absorption.

Carboxyl ester lipase

The carboxyl ester lipase (CEL) of human pancreatic juice (Lombardo et al., 1978) appears to be functionally identical, or at least very similar, to BSSL (Bläckberg et al., 1981). The two enzymes share common epitopes, have identical N-terminal amino acid sequences (Abouakil et al., 1988) and are inhibited by inhibitors of serine esterases, such as eserine (physostigmine) and diisopropyl fluorophosphate. In recent studies from several laboratories the cDNA structures of both the milk lipase and the pancreatic lipase have been characterized (Baba et al., 1991; Hui et al., 1991; Nilsson et al., 1990; Reue et al., 1991), and the conclusion is that the milk enzyme and the pancreatic enzyme are products of the same gene (referred to in this application as the CEL gene, EC 3.1.1.1). The cDNA sequence and the deduced amino acid sequence of the CEL gene are described in WO 91/15234 (Oklahoma Medical Research Foundation) and in WO 91/18923 (Aktiebolaget Astra).

CEL is thus considered to be identical to BSSL, and the polypeptide encoded by the CEL gene is referred to in this specification as BSSL/CEL.

Lipid malabsorption

A common cause of lipid malabsorption, and hence of malnutrition, is a reduced intraluminal level of pancreatic colipase-dependent lipase and/or bile salts. Typical examples of such lipase deficiency are patients with cystic fibrosis, a common genetic disorder that results in a lifelong deficiency in 80% of patients, and patients with chronic pancreatitis, which is often caused by chronic alcoholism.

The present treatment of patients with pancreatic lipase deficiency is oral administration of very large doses of a crude preparation of porcine pancreatic enzymes. However, colipase-dependent pancreatic lipase is inactivated by the low pH in the stomach, and this effect cannot be completely overcome by the use of large doses of enzyme. The large oral doses are therefore inadequate for most patients; moreover, the preparations are impure and unpalatable.

Special tablets have been formulated that pass through the acidic regions of the stomach and release the enzyme only in the relatively alkaline environment of the jejunum. However, many patients with pancreatic disorders have an abnormally acidic jejunum, and in those cases the tablets may fail to release the enzyme.

Moreover, since the preparations currently on the market are of non-human origin, there is a risk of immunoreactions that may harm the patient or reduce the efficiency of the therapy. A further drawback of the present preparations is that their content of lipolytic activities other than colipase-dependent lipase is not stated. In fact, most of them contain very low levels of BSSL/CEL activity. This may be one reason why many patients with cystic fibrosis suffer from deficiencies of fat-soluble vitamins and essential fatty acids despite supplementation therapy.

There is therefore a great need for products with the properties and structure of human lipases and with a broad substrate specificity, which can be administered orally to patients deficient in one or several of the pancreatic lipolytic enzymes. Products obtained by use of the present invention fulfil this need, either by themselves or in combination with preparations containing other lipases.

Infant formula

It is well known that human milk feeding is the best way of feeding infants. Human milk not only provides well-balanced nutrition but is also easily digested by the infant. Thus, several biologically active components known to have physiological functions in the infant are either constituents of human milk or are produced during its digestion, including components that defend against infection and components that facilitate the uptake of nutrients from human milk.

Despite the great efforts invested in preparing infant formulas, it has not been possible to produce a formula that has the advantages of human milk to any substantial extent. Infant formula, usually prepared on the basis of cow's milk, is thus generally incompletely digested by the infant and lacks substances known to affect the infant's physiological functions. To obtain a formula with a nutritional value similar to that of human milk, a number of additives, including protein fragments, vitamins and minerals (which are normally formed or taken up during the infant's digestion of human milk), are included in the formula. This carries the risk of increased strain on, and possible long-term damage to, important organs such as the liver and kidneys. A further disadvantage of cow's milk-based formulas is the increased risk of inducing allergy to bovine proteins in the infant.

As an alternative to cow's milk-based infant formula, human milk obtained from so-called milk banks can be used. However, feeding newborn infants with human milk from milk banks has not become more widespread in recent years, because of fear of infectious agents such as HIV and CMV in human milk. To destroy the infectious agents, human milk must be pasteurized before use. Pasteurization, however, reduces the nutritional value and the biological effects of the milk components; for example, BSSL is inactivated, as mentioned above.

Supplementation of infant formula with lipases

The pancreatic and liver functions are not fully developed at birth, most notably in infants born before term. Fat malabsorption for physiological reasons is a common finding and is thought to result from low intraluminal concentrations of pancreatic colipase-dependent lipase and bile salts. Because of BSSL, however, such malabsorption is much less frequent in breast-fed infants than in infants fed pasteurized human milk or infant formula (Bernbäck et al., 1990).

To avoid the above disadvantages associated with pasteurized milk and cow's milk-based infant formula, it is desirable to prepare an infant formula with a composition close to that of human milk, i.e. a formula containing human milk proteins.

BSSL/CEL has several unique properties that make it well suited for supplementing infant formula:

· It has been designed by nature for oral administration. It therefore resists passage through the stomach and is activated by the contents of the small intestine.

· Its specific activation mechanism should prevent harmful lipolysis of food or tissue lipids during storage and passage to its site of action.

· Owing to its broad substrate specificity, it has the potential to mediate, on its own, complete digestion of most dietary lipids, including the fat-soluble vitamin esters.

· BSSL/CEL may be superior to pancreatic colipase-dependent lipase in hydrolyzing ester bonds containing long-chain polyunsaturated fatty acids.

· In the presence of gastric lipase and in the absence, or at low levels, of colipase-dependent lipase, BSSL/CEL can completely digest triacylglycerols in vitro even when bile salt levels are low, as in newborn infants. In the presence of BSSL/CEL the end products of triacylglycerol digestion are free fatty acids and free glycerol, rather than the free fatty acids and monoacylglycerol generated by the other two lipases (Bernbäck et al., 1990). This may favor product absorption, particularly when intraluminal bile salt levels are low.

Supplementing infant formula with BSSL/CEL requires access to large quantities of the protein. Although human milk proteins can be purified directly from human milk, this is not a realistic or sufficiently economical way to obtain the quantities needed for large-scale formula production, so other methods must be developed before infant formula containing human milk proteins can be prepared. The present invention provides such methods for preparing BSSL/CEL in large quantities.

Production of proteins in the milk of transgenic animals

The isolation of genes encoding pharmacologically active proteins has permitted cheaper production of such proteins in heterologous systems. An appealing expression system for milk proteins is the transgenic animal (for a review see Hennighausen et al., 1990). Dietary compositions comprising bile salt-activated lipase derived from, for example, transgenic animal technology are described in EP 317,355 (Oklahoma Medical Research Foundation).

In the transgenic animal, the protein-coding sequence can be introduced as cDNA or as a genomic sequence. Since introns may be necessary for regulated gene expression in transgenic animals (Brinster et al., 1988; Whitelaw et al., 1991), it is in many cases preferable to use the genomic form rather than the cDNA form of the structural gene. WO 90/05188 (Pharmaceutical Proteins Limited) describes the use, in transgenic animals, of protein-encoding DNA comprising at least one, but not all, of the introns naturally occurring in a gene encoding the protein.

The object of the present invention is to provide a means of producing recombinant human BSSL/CEL for use in infant formula, in high yield and at a realistic price, so as to avoid the disadvantages of pasteurized milk and of formulas based on bovine proteins.

The object of the invention has been achieved by cloning and sequencing the human CEL gene. To improve the yield of BSSL/CEL, the resulting DNA molecule, which contains the intron sequences of the human CEL gene, has been used in place of the known cDNA sequence for producing human BSSL/CEL in transgenic non-human mammals.

Accordingly, in one aspect the present invention relates to a DNA molecule shown in the Sequence Listing as SEQ ID NO: 1, or to an analogue of said DNA molecule, or a specific part thereof, which hybridizes under stringent hybridization conditions with the DNA molecule shown in the Sequence Listing as SEQ ID NO: 1.

The procedure used for isolating the human BSSL/CEL DNA molecule is described in the Examples below.

The stringent hybridization conditions referred to above are to be understood in their conventional meaning, i.e. the hybridization is carried out according to ordinary laboratory practice, e.g. as described by Sambrook et al. (1989).

In another aspect, the present invention provides a mammalian expression system comprising a DNA sequence encoding human BSSL/CEL inserted into a gene encoding a milk protein of a non-human mammal so as to form a hybrid gene. The hybrid gene is expressible in the mammary gland of an adult female of a mammal harboring it, so that human BSSL/CEL is produced when the hybrid gene is expressed.

In yet another aspect, the present invention relates to a method of producing a transgenic non-human mammal capable of expressing human BSSL/CEL, comprising injecting a mammalian expression system as defined above into a fertilized egg or a cell of an embryo of a mammal, so as to incorporate the expression system into the germline of the mammal, and developing the resulting injected fertilized egg or embryo into an adult female mammal.

The Sequence Listing gives, as SEQ ID NO: 1, a DNA molecule with a total length of 11531 bp, which has the following features:

Feature                 First base    Last base
5' flanking region               1         1640
TATA box                      1611         1617
Exon 1                        1641         1727
Translation start             1653         1653
Exon 2                        4071         4221
Exon 3                        4307         4429
Exon 4                        4707         4904
Exon 5                        6193         6323
Exon 6                        6501         6608
Exon 7                        6751         6868
Exon 8                        8335         8521
Exon 9                        8719         8922
Exon 10                      10124        10321
Exon 11                      10650        11490
3' flanking region           11491        11531
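The feature coordinates of SEQ ID NO: 1 can be cross-checked with a short script. The following is a minimal sketch, not part of the patent: it assumes the coordinates are 1-based and inclusive (as the contiguous flanking regions suggest), and the names `EXONS`, `span_length` and `intron_spans` are illustrative only. It derives the exon lengths, the intron positions implied by the gaps between consecutive exons, and the offset of the translation start within exon 1.

```python
# Exon coordinates (first base, last base) copied from the SEQ ID NO: 1
# feature table; assumed to be 1-based and inclusive.
EXONS = [
    (1641, 1727),    # exon 1
    (4071, 4221),    # exon 2
    (4307, 4429),    # exon 3
    (4707, 4904),    # exon 4
    (6193, 6323),    # exon 5
    (6501, 6608),    # exon 6
    (6751, 6868),    # exon 7
    (8335, 8521),    # exon 8
    (8719, 8922),    # exon 9
    (10124, 10321),  # exon 10
    (10650, 11490),  # exon 11
]
TRANSLATION_START = 1653
TOTAL_LENGTH = 11531


def span_length(start, end):
    """Length in bp of an inclusive, 1-based interval."""
    return end - start + 1


def derive_introns(exons):
    """Return the intervals lying between consecutive exons."""
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]


exon_lengths = [span_length(s, e) for s, e in EXONS]
intron_spans = derive_introns(EXONS)
intron_lengths = [span_length(s, e) for s, e in intron_spans]

# Untranslated bases at the 5' end of exon 1, before the translation start.
five_prime_utr = TRANSLATION_START - EXONS[0][0]

if __name__ == "__main__":
    for i, (span, n) in enumerate(zip(EXONS, exon_lengths), start=1):
        print(f"exon {i:2d}: {span[0]:5d}-{span[1]:5d}  {n:4d} bp")
    for i, (span, n) in enumerate(zip(intron_spans, intron_lengths), start=1):
        print(f"intron {i:2d}: {span[0]:5d}-{span[1]:5d}  {n:4d} bp")
    print("total exon length:", sum(exon_lengths))
    print("total intron length:", sum(intron_lengths))
    print("5' untranslated bases in exon 1:", five_prime_utr)
```

Running the script prints one line per exon and per derived intron. The intron intervals are inferred from the table and are not listed in the source itself.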

In the present context, the term "gene" denotes a DNA sequence that is involved in producing a polypeptide chain and that includes the regions preceding and following the coding region (5' upstream and 3' downstream sequences) as well as intervening sequences, so-called introns, which lie between individual coding segments (so-called exons) or in the 5' upstream or 3' downstream region. The 5' upstream region contains regulatory sequences that control expression of the gene, typically a promoter. The 3' downstream region contains sequences involved in termination of transcription of the gene and, optionally, sequences responsible for polyadenylation of the transcript, as well as the 3' untranslated region.

The DNA molecules of the invention explained herein may comprise natural as well as synthetic DNA sequences, the natural sequence typically being derived directly from genomic DNA, normally of mammalian origin, as described below. A synthetic sequence may be prepared by conventional methods for synthetically preparing DNA molecules. The DNA sequence may also be of mixed genomic and synthetic origin.

In a further aspect, the present invention relates to a replicable expression vector which carries, and is capable of mediating the expression of, a DNA sequence encoding human BSSL/CEL.

In the present context, the term "replicable" means that the vector is able to replicate in a given type of host cell into which it has been introduced. Immediately upstream of the human BSSL/CEL DNA sequence there may be provided a sequence encoding a signal peptide, the presence of which ensures secretion of the human BSSL/CEL expressed by host cells harboring the vector. The signal sequence may be the one naturally associated with the human BSSL/CEL DNA sequence or may be of another origin.

The vector may be any vector that can conveniently be subjected to recombinant DNA procedures, and the choice of vector will often depend on the host cell into which it is to be introduced. Thus, the vector may be an autonomously replicating vector, i.e. a vector that exists as an extrachromosomal entity whose replication is independent of chromosomal replication; examples of such vectors are plasmids, phages, cosmids, minichromosomes and viruses. Alternatively, the vector may be one which, when introduced into a host cell, is integrated into the host cell genome and replicated together with the chromosome(s) into which it has been integrated. Examples of suitable vectors are bacterial expression vectors and yeast expression vectors. The vector of the invention may carry any of the DNA molecules of the invention defined above.

The present invention further relates to a cell harboring a replicable expression vector as defined above. In principle, this cell may be of any type, i.e. a prokaryotic cell, a unicellular eukaryotic organism, or a cell derived from a multicellular organism such as a mammal. Mammalian cells are especially suitable for the purpose and are discussed further below.

In another important aspect, the invention relates to a method of producing recombinant human BSSL/CEL, in which a DNA sequence encoding human BSSL/CEL is inserted into a vector able to replicate in a specific host cell, the resulting recombinant vector is introduced into a host cell, the host cell is grown in or on a suitable culture medium under conditions appropriate for expression of human BSSL/CEL, and the human BSSL/CEL is recovered.

The medium used to grow the cells may be any conventional medium suitable for the purpose. A suitable vector may be any of the vectors described above, and an appropriate host cell may be any of the cell types listed above. The methods used to construct the vector and to introduce it into the host cell may be any methods known for such purposes in the field of recombinant DNA. The recombinant human BSSL/CEL expressed by the cells may be secreted, i.e. exported through the cell membrane, depending on the type of cell and the composition of the vector.

If the human BSSL/CEL is produced intracellularly by the recombinant host, that is, is not secreted by the cell, it may be recovered by standard procedures comprising cell disruption by mechanical means, e.g. sonication or homogenization, or by enzymatic or chemical means, followed by purification.

In order to be secreted, the DNA sequence encoding human BSSL/CEL should be preceded by a sequence encoding a signal peptide, the presence of which ensures secretion of human BSSL/CEL from the cells, so that at least a significant proportion of the expressed human BSSL/CEL is secreted into the culture medium and recovered.

The presently preferred method of producing recombinant human BSSL/CEL according to the invention uses transgenic non-human mammals capable of secreting human BSSL/CEL into their milk. The advantage of transgenic non-human mammals is that high yields of recombinant human BSSL/CEL can be obtained at reasonable cost. In addition, especially when the non-human mammal is a cow, the recombinant human BSSL/CEL is produced in milk, which is itself a normal constituent of, for example, infant formula, so that no extensive purification is needed when the protein is to be used as a nutritional supplement in milk-based products. Furthermore, production in a higher organism such as a non-human mammal normally leads to correct processing of the mammalian protein, e.g. with respect to post-translational processing as discussed above and proper folding. Large quantities of substantially pure human BSSL/CEL can also be obtained.

Accordingly, in a further important aspect, the present invention relates to a mammalian expression system comprising a DNA sequence encoding human BSSL/CEL inserted into a gene encoding a milk protein of a non-human mammal so as to form a hybrid gene which is expressible in the mammary gland of an adult female of a mammal harboring said hybrid gene.

The DNA sequence encoding human BSSL/CEL is preferably the DNA sequence shown in the Sequence Listing as SEQ ID NO: 1, a genomic human BSSL/CEL gene, or an analogue thereof.

The mammary gland as a tissue of expression, and genes encoding milk proteins, are generally considered particularly suitable for the production of heterologous proteins in transgenic non-human mammals, since milk proteins are naturally produced at high expression levels in the mammary gland. Milk is also readily collected and available in large quantities. In the present connection, the use of a milk protein gene in the production of recombinant human BSSL/CEL has the further advantage that the protein is produced under conditions similar to its natural production conditions in terms of regulation of expression and production site (the mammary gland).

The term "hybrid gene" as used herein denotes a DNA sequence comprising, on the one hand, a DNA sequence encoding human BSSL/CEL as defined above and, on the other hand, a DNA sequence of the milk protein gene capable of mediating expression of the hybrid gene product. The term "gene encoding a milk protein" denotes an entire gene as well as a subsequence thereof capable of mediating and targeting expression of the hybrid gene to the tissue of interest, i.e. the mammary gland. Normally, such a subsequence contains at least one or more of a promoter region, a transcription start site, 3' and 5' non-coding regions and structural sequences. The DNA sequence encoding human BSSL/CEL is preferably substantially free of prokaryotic sequences, such as vector sequences, which may be associated with the DNA sequence after, for example, cloning.

The hybrid gene is preferably formed by inserting the DNA sequence encoding human BSSL/CEL into the milk protein gene in vitro, using techniques known in the art. Alternatively, the DNA sequence encoding human BSSL/CEL can be inserted in vivo by homologous recombination.

Normally, the DNA sequence encoding human BSSL/CEL is inserted into one of the first exons of the chosen milk protein gene, or into an effective subsequence thereof comprising the first exons and preferably a substantial part of the 5' flanking sequence, which is believed to be of regulatory importance.

The hybrid gene preferably comprises a sequence encoding a signal peptide, so that the hybrid gene product is secreted correctly into the mammary gland. The signal peptide will typically be the one normally found in the milk protein gene in question or the one associated with the DNA sequence encoding human BSSL/CEL. However, other signal sequences capable of mediating secretion of the hybrid gene product into the mammary gland are also suitable. The various elements of the hybrid gene should of course be fused in such a manner as to allow correct expression and processing of the gene product. Thus, the DNA sequence encoding the chosen signal peptide should normally be fused precisely to the part of the DNA sequence that encodes the N-terminus of human BSSL/CEL. In the hybrid gene, the DNA sequence encoding human BSSL/CEL will normally contain its own stop codon, but not its own message cleavage and polyadenylation site. Downstream of the DNA sequence encoding human BSSL/CEL, the mRNA processing sequences of the milk protein gene are normally retained.

A number of factors are expected to influence the actual expression level of a particular hybrid gene. The strength of the promoter and of the other regulatory sequences mentioned above, the integration site of the expression system in the genome of the mammal, the integration site of the DNA sequence encoding human BSSL/CEL in the milk protein gene, elements conferring post-transcriptional regulation, and other similar factors may all be of vital importance for the expression level obtained. On the basis of knowledge of the various factors influencing the expression level of the hybrid gene, the person skilled in the art will know how to design an expression system suitable for the present purpose.

A variety of milk proteins are secreted by the mammary gland. Two main groups of milk proteins exist, namely the caseins and the whey proteins. The composition of milk from different species varies both qualitatively and quantitatively with respect to these proteins. Most non-human mammals produce three different types of casein, namely α-casein, β-casein and κ-casein. The most common bovine whey proteins are α-lactalbumin and β-lactoglobulin. The composition of milk from various sources is further described in Clark et al. (1987).

The milk protein gene to be used may be derived from the same species as the one into which the expression system is to be inserted, or it may be derived from another species. It has been shown that the regulatory elements that target gene expression to the mammary gland are functional across species boundaries, possibly owing to a common ancestor (Hennighausen et al., 1990).

Suitable examples of genes encoding a milk protein, or effective subsequences thereof, for use in constructing the expression system of the invention are normally found among the whey proteins of various mammals, for example a whey acidic protein (WAP) gene, preferably of murine origin, and a β-lactoglobulin gene, preferably of ovine origin. Casein genes of various origins may also be suitable for the transgenic production of human BSSL/CEL, for example bovine αS1-casein and murine β-casein. The presently preferred gene is the murine WAP gene, which has been found capable of providing high-level expression of a number of foreign human proteins in the milk of different transgenic animals (Hennighausen et al., 1990).

Another sequence preferably associated with the expression system of the invention is a so-called expression-stabilizing sequence capable of mediating high-level expression. There are strong indications that such stabilizing sequences are found in the vicinity of, and downstream of, milk protein genes.

The DNA sequence encoding human BSSL/CEL to be inserted into the expression system of the invention may be of genomic or synthetic origin, or any combination thereof. Some expression systems have been found to require the presence of introns and other regulatory elements in order to give satisfactory expression (Hennighausen et al., 1990). In some cases it may be advantageous to introduce genomic structures, rather than cDNA elements, as the polypeptide-encoding element of vector constructs (Brinster et al.). The intron and exon structure may result in higher steady-state mRNA levels than those obtained with cDNA-based vectors.

In a further aspect, the present invention relates to a hybrid gene comprising a DNA sequence encoding human BSSL/CEL inserted into a gene encoding a milk protein of a non-human mammal, the DNA sequence being inserted into the milk protein gene in such a manner that it is expressible in the mammary gland of an adult female of a mammal harbouring the hybrid gene. The hybrid gene and its constituents have been discussed in detail above. The hybrid gene constitutes an important intermediate in the construction of the expression system of the invention as disclosed above.

In another aspect, the present invention relates to a non-human mammalian cell harbouring the expression system defined above. The mammalian cell is preferably an embryo cell or a pronucleus. The expression system is suitably inserted into the mammalian cell by a method as explained below and specifically illustrated in the Examples below.

In a further important aspect, the present invention relates to a method of producing a transgenic non-human mammal capable of expressing human BSSL/CEL, comprising injecting the expression system of the invention as defined above into a fertilized egg or an embryo cell of a mammal so as to incorporate the expression system into the germline of the mammal, and allowing the resulting injected fertilized egg or embryo to develop into an adult female mammal.

Incorporation of the expression system into the germline of the mammal may be performed using any suitable technique, for example as described in "Manipulating the Mouse Embryo: A Laboratory Manual" (Cold Spring Harbor Laboratory Press, 1986). For instance, a few hundred molecules of the expression system may be injected directly into a fertilized egg, for example a fertilized one-cell egg or a pronucleus thereof, or into an embryo of the mammal of choice; the microinjected eggs are then transferred into the oviducts of pseudopregnant foster mothers and allowed to develop. Normally, not all of the injected eggs will develop into adult females expressing human BSSL/CEL. Statistically, about half of the resulting animals will be males, from which females can nevertheless be bred in subsequent generations.

Once integrated into the germline, the DNA sequence encoding human BSSL/CEL may be expressed at high levels to produce correctly processed and functional human BSSL/CEL in stable lines of the mammal in question.

Of further interest is a method of producing a transgenic non-human mammal that is capable of expressing human BSSL/CEL and substantially incapable of expressing its own BSSL/CEL, the method comprising (a) destroying the capability of the mammal to express its own BSSL/CEL, so that substantially no mammalian BSSL/CEL is expressed, and inserting the expression system of the invention as defined above, or a DNA sequence encoding human BSSL/CEL, into the germline of the mammal by the method described above, so that human BSSL/CEL is expressed in the mammal; and/or (b) replacing the mammalian BSSL/CEL gene, or part thereof, with the expression system of the invention as defined above or with a DNA sequence encoding human BSSL/CEL.

The capability of the mammal to express its own BSSL/CEL is conveniently destroyed by introducing mutations into the DNA sequence responsible for BSSL/CEL expression. Such mutations may comprise mutations that shift the DNA sequence out of its reading frame, the introduction of a stop codon, or the deletion of one or more nucleotides of the DNA sequence.
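The effect of the mutation types named above can be illustrated with a minimal sketch. The 18-nucleotide open reading frame below is a hypothetical toy sequence, not part of the BSSL/CEL gene, and the function name `translate` is chosen only for this illustration; the codon table is the standard genetic code.

```python
# Toy illustration of how a frameshift or a premature stop codon
# changes the translated product. The sequence is hypothetical.

BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate from position 0 until a stop codon or the last full codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":          # stop codon terminates translation
            break
        protein.append(residue)
    return "".join(protein)


wild_type = "ATGGCTAAGCTGGGTTAA"                  # hypothetical ORF
frameshift = wild_type[:4] + wild_type[5:]        # delete one nucleotide
nonsense = wild_type[:6] + "T" + wild_type[7:]    # AAG -> TAG (stop)

print(translate(wild_type))   # intact reading frame
print(translate(frameshift))  # every downstream codon is altered
print(translate(nonsense))    # product truncated at the new stop codon
```

A single-nucleotide deletion changes every codon downstream of the deletion, whereas an introduced stop codon truncates the product; either outcome would be expected to abolish expression of a functional endogenous protein.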

Replacement of the mammalian BSSL/CEL gene, or part thereof, with the expression system described above or with a DNA sequence encoding human BSSL/CEL may be performed using the well-known principles of homologous recombination.

In a further aspect, the present invention relates to a transgenic non-human mammal prepared by a method as described above.

While the transgenic mammal of the invention, in its broadest aspect, is not restricted to any particular type of mammal, the mammal will normally be selected from the group consisting of mice, rats, rabbits, sheep, pigs, goats and cattle. For large-scale production of human BSSL/CEL, the larger animals such as sheep, goats, pigs and especially cattle are normally preferred because of their high milk production. However, mice, rabbits and rats are also of interest, because they are easier to manipulate and yield transgenic animals more quickly than, for example, cattle.

Progeny of a transgenic mammal as defined above that are capable of producing human BSSL/CEL are also within the scope of the present invention.

In a further aspect, the present invention includes milk from a non-human mammal comprising recombinant human BSSL/CEL.

In a still further aspect, the present invention relates to an infant formula comprising recombinant human BSSL/CEL, in particular a polypeptide of the invention as defined above. The infant formula may be prepared by adding the recombinant human BSSL/CEL or polypeptide, in purified or partially purified form, to the normal constituents of the infant formula. It is preferred, however, that the infant formula is prepared from milk of the invention as defined above, especially milk of bovine origin. The infant formula may be prepared by conventional procedures and may contain any necessary additives such as minerals, vitamins and the like.

Examples

Example 1: Genomic organization, sequence analysis and chromosomal localization of the CEL gene

Unless otherwise stated, standard molecular biology techniques were used (Maniatis et al., 1982; Ausubel et al., 1987; Sambrook et al., 1989).

Isolation of genomic recombinants

Two different human genomic phage libraries, λDASH (Clontech Laboratories Inc., Palo Alto, CA, USA) and λEMBL-3 SP6/T7 (Stratagene, La Jolla, CA, USA), were screened by plaque hybridization using various subcloned cDNA restriction fragments (Nilsson et al., 1990) as probes, labelled with [α-³²P]dCTP by the oligolabelling technique (Feinberg et al., 1983).

Mapping, subcloning and sequencing of genomic clones

Positive clones were digested with various restriction enzymes, electrophoresed on 1% agarose gels and then vacuum-transferred (Pharmacia LKB BTG, Uppsala, Sweden) to nylon membranes. The membranes were hybridized with various cDNA probes. Restriction fragments hybridizing with the probes were isolated by isotachophoresis (Öfverstedt et al., 1984). Smaller fragments (<800 bp) were inserted directly into M13mp18, M13mp19, M13BM20 or M13BM21 vectors and sequenced using E. coli TG1 as host bacteria, whereas larger fragments were subcloned into pTZ18R or pTZ19R vectors, using E. coli DH5α as host bacteria, and digested further. (The plasmids pS309, pS310 and pS451 used in Example 2 below were produced accordingly.) Some of the isolated fragments were also used as probes in hybridizations. The entire nucleotide sequence was determined by the dideoxy chain-termination method (Sanger et al., 1977) using Klenow enzyme and either specific oligonucleotides or the M13 universal sequencing primer. Sequence information was retrieved from the autoradiograms using the software MS-EdSeq as described by Sjöberg et al. (1989). The sequences were analysed using programs from the UWGCG software package (Devereux et al., 1984).

Primer extension

Total RNA was isolated from human pancreas, lactating mammary gland and adipose tissue by the guanidinium isothiocyanate-CsCl procedure (Chirgwin et al., 1979). Primer extension was performed as described by Ausubel et al. (1987), using total RNA and an antisense 26-mer oligonucleotide (5'-AGGTGAGGCCCAACACAACCAGTTGC-3'), nt positions 33-58. Hybridization of the primer to 20 μg of total RNA was performed overnight at 30 °C in 30 μl of 0.9 M NaCl, 0.15 M Hepes pH 7.5 and 0.3 M EDTA. After the extension reaction with reverse transcriptase, the extension products were analysed by electrophoresis on a 6% denaturing polyacrylamide gel.

Somatic cell hybrids

DNA from 16 human-rodent somatic cell hybrid lines, obtained from the NIGMS Human Genetic Mutant Cell Repository (Coriell Institute for Medical Research, Camden, NJ), was used for the chromosomal assignment of the CEL gene. The human-mouse somatic cell hybrids GM09925 through GM09940 were derived from fusions of human fetal male fibroblasts (IMR-91) with the thymidine kinase-deficient mouse cell line B-82 (Taggart et al., 1985; Mohandas et al., 1986). The hybrids GM10324 and GM02860 were made with the HPRT- and APRT-deficient mouse cell line A9 (Callen et al., 1986), while the hybrid GM10611 resulted from microcell fusion of the human lymphoblastoid cell line GM07890, infected with the retroviral vector SP-1, with the Chinese hamster ovary cell line UV-135 (Warburton et al., 1990). The hybrid GM10095 was derived from the fusion of lymphocytes from a female with a balanced 46,X,t(X;9)(q13;q34) karyotype with the Chinese hamster cell line CHW1102 (Mohandas et al., 1979). The human chromosome content of the hybrid lines, as determined by cytogenetic analysis as well as by Southern blot analysis and in situ hybridization analysis, is listed in Table 1.

High-molecular-weight DNAs isolated from the mouse, Chinese hamster and human parental cell lines and from the 16 hybrid cell lines were digested with EcoRI, fractionated in 0.8% agarose gels and transferred to nylon filters. An [α-³²P]dCTP-labelled CEL cDNA probe was prepared by oligolabelling (Feinberg and Vogelstein, 1983) and hybridized to the filters. The filters were washed for 60 minutes each in 6×SSC/0.5% SDS and in 2×SSC/0.5% SDS at 65 °C.

Polymerase chain reaction

Exon 10 and exon 11 were amplified using total human genomic DNA isolated from leukocytes, DNA from the somatic cell hybrids and from some of the positive genomic recombinants, and total RNA from human lactating mammary gland and human pancreas. 2 μg of DNA was used, and the primers used are listed in Table 2. Thirty cycles of PCR were performed in a 100 μl volume [10 mM Tris-HCl, pH 8.3, 50 mM KCl, 1.5 mM MgCl₂, 200 μM of each dNTP, 100 μg/ml gelatin, 100 pmol of each primer, 1.5 U Taq DNA polymerase (Perkin-Elmer Cetus, Norwalk, CT, USA)] with an annealing temperature of 55 °C for all primer pairs. The RNA sequences were amplified by a combined complementary DNA (cDNA) and PCR method. cDNA was synthesized from 10 μg of total RNA in 40 μl of a solution containing 50 mM Tris-HCl, pH 8.3, 50 mM KCl, 10 mM MgCl₂, 10 μg/ml BSA, 1 mM of each dNTP, 500 ng oligo(dT)12-13, 40 U ribonuclease inhibitor and 200 U reverse transcriptase (MoMuLV) (BRL, Bethesda Research Laboratories, N.Y., USA) for 30 minutes at 42 °C. The cDNA was precipitated and resuspended in 25 μl H₂O; 2 μl of this was amplified as described above. The amplified fragments were analysed on a 2% agarose gel. Some of the fragments were further subcloned and sequenced.

Gene structure of the human CEL gene

In each genomic library, 10⁸ recombinants were screened. These screenings yielded several positive clones, all of which were isolated and mapped. Two clones, designated λBSSL1 and λBSSL5A, were analysed further. Restriction enzyme digestion with several enzymes and Southern blot analysis, followed by hybridization with cDNA probes, indicated that the λBSSL5A clone covers the whole CEL gene and that the λBSSL1 clone covers the 5' half of the gene and about 10 kb of the 5' flanking region (Figure 1). Together, these two clones cover about 25 kb of the human genome.

After subcloning and restriction enzyme digestion, fragments suitable for sequencing were obtained, and the entire sequence of the CEL gene could be determined, including 1640 bp of the 5' flanking region and 41 bp of the 3' flanking region. These data show that the human CEL gene (SEQ ID NO: 1) spans a region of 9850 bp and contains 11 exons interrupted by 10 introns (Figure 1). This means that the exons, and especially the introns, are rather small. In fact, exons 1-10 range in size from 87 to 204 bp, while exon 11 is 841 bp long. The introns range in size from 85 to 2,343 bp. As can be seen in Table 3, all exon/intron boundaries obey the AG/GT rule and conform well to the consensus sequence suggested by Mount et al. (1982). When the coding part of the CEL gene was compared with the cDNA (Nilsson et al., 1990), only one difference in nucleotide sequence was found: the second nucleotide in exon 1 is a C, whereas it is a T in the cDNA sequence. Since this position is located 10 nt upstream of the translation start codon ATG, the difference does not affect the amino acid sequence. Seven members of the Alu class of repetitive DNA elements are present in the sequenced region, labelled Alu1-Alu7 (5' to 3') (Figure 1), one in the 5' flanking region and the other six within the CEL gene.

Transcription initiation sites and 5' flanking region

To map the transcription initiation site(s) of the human CEL gene, primer extension analysis was performed using total RNA from human pancreas, lactating mammary gland and adipose tissue. The results indicate a major transcription start site located 12 bases upstream of the initiator methionine and a minor start site located at 8 bases. The transcription initiation sites are the same in pancreas and lactating mammary gland, whereas no signal could be detected in adipose tissue (Figure 2). The sequenced region includes 1640 nt of 5' flanking DNA. Based on sequence similarity, a TATA-box-like sequence, CATAAAT, was found upstream of the transcription initiation site (Figure 4). Neither a CAAT-box structure nor a GC box was evident in this region.
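A search for short motifs such as the TATA-box-like sequence CATAAAT on both strands of a flanking region can be sketched as follows. This is an illustration only: the 26-nucleotide test sequence is hypothetical and is not taken from SEQ ID NO: 1, and the function names are chosen for this example rather than taken from the software actually used.

```python
# Minimal motif scan on both strands of a DNA sequence.
# Positions are 0-based and always refer to the forward strand.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def _positions(seq: str, motif: str):
    """Yield every (possibly overlapping) start position of motif in seq."""
    pos = seq.find(motif)
    while pos != -1:
        yield pos
        pos = seq.find(motif, pos + 1)


def find_motif_both_strands(seq: str, motif: str):
    """Return (strand, position) hits; '-' hits match the reverse complement."""
    hits = [("+", p) for p in _positions(seq, motif)]
    rc_motif = reverse_complement(motif)
    if rc_motif != motif:  # avoid reporting palindromic motifs twice
        hits += [("-", p) for p in _positions(seq, rc_motif)]
    return hits


# Hypothetical flanking sequence with one hit on each strand.
toy_flank = "GGCTCATAAATGCCAGGATTTATGCC"
print(find_motif_both_strands(toy_flank, "CATAAAT"))
```

Reporting minus-strand hits in forward-strand coordinates keeps the output directly comparable with a map of the sequenced region.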

A computer search of both strands of the 5' flanking sequence for nucleotide sequences known as transcription factor binding sequences in other mammary gland- and pancreas-specific genes revealed several putative recognition sequences; see Figure 4.

Chromosomal localization of the CEL gene

In human control DNA, the CEL cDNA probe detected four EcoRI fragments of approximately 13 kb, 10 kb, 2.2 kb and 2.0 kb, whereas in the mouse and hamster control DNAs it detected single fragments of approximately 25 kb and 8.6 kb, respectively. In the hybrid clones, the presence of human CEL gene sequences correlated only with the presence of human chromosome 9 (Table 1). Only one of the 16 hybrids analysed was positive for the human CEL gene; this hybrid contained chromosome 9 as its only human chromosome. No discordances were found for localization to this chromosome, whereas there were at least two discordances for localization to any other chromosome (Table 1). To sublocalize the CEL gene further, a human-Chinese hamster hybrid (GM10095) was used that retains a der(9) translocation chromosome (9pter→9q34:Xq13→Xqter) as its only human DNA. No CEL gene sequences could be detected in this hybrid by Southern blot analysis, indicating that the CEL gene resides within the 9q34-qter region.

Example 2: Construction of expression vectors

To construct an expression vector for the production of recombinant human CEL in milk from transgenic animals, the following strategy was employed (Figure 5).

Three pTZ-based plasmids containing different parts of the human CEL gene, pS309, pS310 and pS311, were obtained by the methods described above. The plasmid pS309 contains an SphI fragment covering the CEL gene from the 5' untranscribed region to part of the fourth intron. The plasmid pS310 contains a SacI fragment covering the CEL gene from part of the first intron to part of the sixth intron. Third, the plasmid pS311 contains a BamHI fragment covering a variant of the CEL gene from the major part of the fifth intron through the remainder of the intron/exon structure. In this plasmid, the repeated sequence of exon 11, which normally encodes 16 repeats, has been mutated so that it encodes a truncated variant with 9 repeats.

Another plasmid, pS283, containing part of the human CEL cDNA cloned between the HindIII and SacI sites of pUC19, was used for fusion of the genomic sequences. pS283 also provided a convenient restriction enzyme site, KpnI, located in the 5' untranslated leader sequence of CEL. Plasmid pS283 was digested with NcoI and SacI, and a fragment of about 2.7 kb was isolated. Plasmid pS309 was digested with NcoI and BspEI, and a fragment of about 2.3 kb containing the 5' part of the CEL gene was isolated. Plasmid pS310 was digested with BspEI and SacI, and a fragment of about 2.7 kb containing part of the middle region of the CEL gene was isolated. These three fragments were ligated and transformed into competent E. coli strain TG2, and transformants were isolated by ampicillin selection. Plasmids were prepared from a number of transformants, and one plasmid containing the desired construct, designated pS312 (Figure 6), was used for further experiments.
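The three-fragment ligation described above works because each fragment carries two different cohesive ends, so the pieces can only close into a circle in one order. The following sketch checks that property for a set of fragments. It is a simplified model under stated assumptions: ends are treated as compatible only when produced by the same enzyme, the sizes are the approximate values given in the text, and the `Fragment` type and function name are hypothetical.

```python
from collections import Counter
from typing import NamedTuple


class Fragment(NamedTuple):
    name: str
    left: str       # enzyme that generated one end
    right: str      # enzyme that generated the other end
    size_kb: float  # approximate size as stated in the text


def forms_single_circle(fragments) -> bool:
    """True if the fragments can join end-to-end into one closed circle."""
    if not fragments:
        return False
    ends = Counter()
    for frag in fragments:
        ends[frag.left] += 1
        ends[frag.right] += 1
    # Every cohesive-end type must occur exactly twice to pair up.
    if any(count != 2 for count in ends.values()):
        return False
    # Walk from fragment to fragment through matching ends.
    remaining = list(fragments[1:])
    start, current = fragments[0].left, fragments[0].right
    while remaining:
        nxt = next((f for f in remaining if current in (f.left, f.right)), None)
        if nxt is None:
            return False
        remaining.remove(nxt)
        current = nxt.right if nxt.left == current else nxt.left
    return current == start


PS312_PARTS = [
    Fragment("pS283 NcoI/SacI", "SacI", "NcoI", 2.7),
    Fragment("pS309 NcoI/BspEI", "NcoI", "BspEI", 2.3),
    Fragment("pS310 BspEI/SacI", "BspEI", "SacI", 2.7),
]

print(forms_single_circle(PS312_PARTS))
print(round(sum(f.size_kb for f in PS312_PARTS), 1), "kb expected in total")
```

Under these assumptions the same check applies to the other three-fragment ligations in this Example; it says nothing about ligation efficiency or about fragments whose different enzymes happen to leave identical overhangs.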

To obtain a modified version of pS311 in which the BamHI site located downstream of the stop codon is converted into a SalI site to facilitate further cloning, the following procedure was used. pS311 was linearized by partial BamHI digestion. The linearized fragment was isolated, and a synthetic DNA linker (5'-GATCGTCGAC-3') that converts the BamHI site into a SalI site, thereby destroying the BamHI site, was inserted. Since there were two potential positions for integration of the synthetic linker, the resulting plasmids were analysed by restriction enzyme cleavage. A plasmid carrying the linker at the desired position downstream of exon 11 was isolated and designated pS313.
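The behaviour of the linker 5'-GATCGTCGAC-3' can be checked on paper or with a few lines of code. The sketch below assumes the usual BamHI cleavage pattern (G^GATCC, leaving a 5'-GATC overhang) and the SalI recognition site GTCGAC; variable names are chosen for illustration.

```python
# Check that the self-annealing linker turns a BamHI site into a SalI site.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


LINKER = "GATCGTCGAC"
BAMHI_SITE = "GGATCC"
SALI_SITE = "GTCGAC"

overhang, core = LINKER[:4], LINKER[4:]

# The 6-nt core is palindromic, so two copies of the linker anneal to each
# other and leave a 5'-GATC overhang at both ends of the duplex.
core_is_palindrome = reverse_complement(core) == core

# Top strand after ligation into a BamHI-cut site:
# ...G | GATC-GTCGAC | GATCC...
junction = "G" + LINKER + "GATCC"

print(core_is_palindrome)      # duplex formation by self-annealing
print(BAMHI_SITE in junction)  # is a BamHI site regenerated?
print(SALI_SITE in junction)   # is a SalI site present?
```

Because the nucleotide following the GATC overhang in the linker is G rather than C, neither junction regenerates GGATCC, which is consistent with the statement that the BamHI site is destroyed.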

To obtain an expression vector construct harbouring CEL genomic sequences and encoding the truncated CEL variant, the plasmid pS314, designed for stage- and tissue-specific expression in mammary gland cells during lactation, was used. Plasmid pS314 contains a genomic fragment from the murine whey acidic protein (WAP) gene (Campbell et al., 1984) cloned as a NotI fragment. The genomic fragment has approximately 4.5 kb of upstream regulatory sequences (URS), the entire transcribed exon/intron region and about 3 kb of sequence downstream of the last exon. A unique KpnI site is located in the first exon, 24 bp upstream of the native WAP translation initiation codon. Another unique restriction enzyme site is the SalI site located in exon 3. In pS314 this SalI site was destroyed by digestion, fill-in with Klenow enzyme and religation. Instead, a new SalI site was introduced directly downstream of the KpnI site in exon 1. This was done by KpnI digestion and introduction of the annealed synthetic oligonucleotides SYM2401, 5'-CGTCGACGTAC-3', and SYM2402, 5'-GTCGACGGTAC-3', at this position (Figure 8). The human CEL genomic sequence was inserted between these two sites, KpnI and SalI, by the following strategy. First, pS314 was digested with KpnI and SalI, and a fragment representing the cleaved plasmid was isolated by electrophoresis. Second, pS312 was digested with KpnI and BamHI, and a fragment of approximately 4.7 kb representing the 5' part of the human CEL gene was isolated. Third, pS313 was digested with BamHI and SalI, and the 3' part of the human CEL gene was isolated. These three fragments were ligated and transformed into competent E. coli bacteria, and transformants were isolated after ampicillin selection. Plasmids were prepared from several transformants and carefully analysed by restriction enzyme mapping and sequence analysis. One plasmid representing the desired expression vector was identified and designated pS317.
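The two oligonucleotides used to introduce the new SalI site can be checked for the expected annealing behaviour. The sketch below assumes the usual KpnI cleavage pattern (GGTAC^C, leaving a 3'-GTAC overhang); the function and variable names are chosen for illustration, and the flanking nucleotides model only the cut KpnI site itself.

```python
# Check that SYM2401 and SYM2402 anneal into a duplex with KpnI-compatible
# 3' overhangs, and see where the SalI site lands in each orientation.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


SYM2401 = "CGTCGACGTAC"
SYM2402 = "GTCGACGGTAC"
KPNI_SITE = "GGTACC"
SALI_SITE = "GTCGAC"

# The first 7 nt of each oligo pair with each other; the last 4 nt (GTAC)
# remain single-stranded as 3' overhangs.
cores_anneal = SYM2401[:-4] == reverse_complement(SYM2402[:-4])
overhangs = (SYM2401[-4:], SYM2402[-4:])


def top_strand(oligo: str) -> str:
    """Top strand of a KpnI site (GGTAC^C) with the given oligo on top."""
    return "GGTAC" + oligo + "C"


for name, oligo in (("SYM2401 on top", SYM2401), ("SYM2402 on top", SYM2402)):
    strand = top_strand(oligo)
    print(name, strand,
          "KpnI at", strand.find(KPNI_SITE),
          "SalI at", strand.find(SALI_SITE))
```

In this model only one orientation of the linker (SYM2401 on the top strand) retains the KpnI site upstream with the SalI site immediately downstream of it, which may be one reason the resulting plasmids need to be verified by restriction mapping and sequencing.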

To construct a genomic CEL expression vector encoding full-length CEL, pS317 was modified as follows (Figure 5). First, pS451, a pTZ18R plasmid (Pharmacia) containing an approximately 5.2 kb BamHI fragment of the human CEL gene extending from the fifth intron to downstream of the eleventh exon, was digested with HindIII and SacI. This digestion generated a fragment of about 1.7 kb extending from the HindIII site located in intron 9 to the SacI site located in exon 11. Second, the plasmid pS313 was digested with SacI and SalI, and a 71 bp fragment containing the 3' part of exon 11 and the generated SalI site was isolated. Third, the remainder of the WAP/CEL recombinant gene together with the plasmid sequences was isolated from pS317 as a SalI/HindIII fragment of approximately 20 kb. These three fragments were ligated and transformed into bacteria. Plasmids were prepared from several transformants, digested with various restriction enzymes and subjected to sequence analysis. One plasmid containing the desired recombinant gene was identified. This final expression vector was designated pS452 (Figure 7).

To remove the prokaryotic plasmid sequences, pS452 was digested with NotI. The recombinant vector element, consisting of murine WAP sequences flanking the human CEL genomic fragment, was then isolated by agarose gel electrophoresis. The isolated fragment was further purified by electroelution before it was injected into mouse embryos.

The recombinant WAP/CEL gene for expression in the mammary gland of transgenic animals is shown in Figure 8.

Deposits

In accordance with the Budapest Treaty, the following plasmids have been deposited at the DSM (Deutsche Sammlung von Mikroorganismen und Zellkulturen):

Plasmid    Deposit No.
pS309      DSM 7101
pS310      DSM 7102
pS451      DSM 7498
pS452      DSM 7499

Dates of deposit, as listed: 1991.6.12 and 1993.2.26.

Example 3: Generation of transgenic animals

A NotI fragment was isolated from the plasmid pS452 according to Example 2. This DNA fragment contains the murine WAP promoter linked to a genomic sequence encoding human BSSL/CEL. The isolated fragment, at a concentration of 3 ng/μl, was injected into the pronuclei of 350 C57Bl/6J×CBA/2J-f₂ embryos obtained from donor mice primed with 5 IU of pregnant mare's serum gonadotropin for superovulation. The C57Bl/6J×CBA/2J-f₁ animals were obtained from Bomholtgard Breeding and Research Centre Ltd, Ry, Denmark. After collection from the oviducts, the embryos were separated from the surrounding cell mass by treatment with hyaluronidase in M2 medium (Hogan et al., 1986). After washing, the embryos were transferred to M16 medium (Hogan et al., 1986) and kept in an incubator with a 5% CO₂ atmosphere. The injections were performed in a microdrop of M2 under light paraffin oil, using Narishige hydraulic micromanipulators and a Nikon inverted microscope equipped with Nomarski optics. After injection, healthy-looking embryos were implanted into pseudopregnant C57Bl/6J×CBA/2J-f₁ recipients that had been given 0.37 ml of 2.5% avertin intraperitoneally. Mice that had integrated the transgene were identified by PCR analysis of DNA from tail biopsy specimens obtained from the animals at three weeks of age. Positive results were confirmed by Southern blot analysis.

Example 4: Expression of BSSL/CEL in transgenic mice

Transgenic mice were identified by analysis of DNA prepared from excised tail samples. The tissue samples were incubated with proteinase K and extracted with phenol/chloroform. The isolated DNA was used in polymerase chain reactions with primers that amplify specific fragments if heterologous introduced DNA representing the expression vector fragment is present. The animals were also analyzed by DNA hybridization to confirm the PCR data, to test for possible rearrangements, to examine the structure of the integrated vector elements, and to obtain information about the copy number of the integrated vector elements.

In one set of experiments, 18 mice were analyzed by both methods. The results showed that one mouse carried the heterologous DNA vector element derived from pS452. The PCR analysis and the hybridization experiments gave identical results (Fig. 9).

The mice identified as carrying the vector DNA element (founder animals) were then mated, and the F1 offspring were analyzed for the transgene by the same methods.

Lactating females were injected intraperitoneally with 2 IU oxytocin and, 10 minutes later, anesthetized intraperitoneally with 0.40 ml of 2.5% Avertin. A milk-collecting device was attached to the nipple via a siliconized tube, and milk was collected into a 1.5 ml Eppendorf tube by gentle massage of the mammary gland. The amount of milk obtained per mouse varied between 0.1 and 0.5 ml, depending on the day of lactation.

The presence of recombinant human BSSL/CEL was analyzed by SDS-PAGE, transfer to nitrocellulose membranes, and incubation with polyclonal antibodies raised against native human BSSL/CEL. The results demonstrated that recombinant human BSSL/CEL is expressed in milk from the transgenic mice. Figure 10 shows the presence of recombinant human BSSL/CEL in milk from transgenic mice: the band at about 116.5 kDa.

Stable lines of transgenic animals are generated.

In a similar manner, other transgenic animals capable of expressing human BSSL/CEL, such as cows or sheep, can be prepared.

The Escherichia coli strains TG2 pS452, TG2 pS451, TG2 pS310 and TG2 pS309 of the present invention were deposited on 11 June 1993 at the China Center for Type Culture Collection (CCTCC) under accession numbers CCTCC No: M93026, CCTCC No: M93025, CCTCC No: M93024 and CCTCC No: M93023, respectively.

References:

Abouakil, N., Rogalska, E., Bonicel, J. & Lombardo, D. (1988): Biochim. Biophys. Acta 961, 299-308.
Ausubel, F.M., Brent, R.E., Moore, D.D., Smith, J.A., Seidman, J.G. and Struhl, K.: Current Protocols in Molecular Biology. (Wiley Interscience, New York 1987)
Baba, T., Downs, D., Jackson, K.W., Tang, J. and Wang, C.S. (1991): Biochemistry 30, 500-510.
Beato, M. (1989): Cell 56, 335-344.
Bernbäck, S., Bläckberg, L. & Hernell, O. (1990): J. Clin. Invest. 221-226.
Björksten, B., Burman, L.G., de Chateau, P., Fredrikzon, B., Gothefors, L. & Hernell, O. (1980): Br. Med. J. 201, 267-272.
Bläckberg, L., Ängquist, K.A. & Hernell, O. (1987): FEBS Lett. 217, 37-41.
Bläckberg, L. & Hernell, O. (1981): Eur. J. Biochem. 116, 221-225.
Bläckberg, L., Lombardo, D., Hernell, O., Guy, O. & Olivecrona, T. (1981): FEBS Lett. 136, 284-288.
Boulet, A.M., Erwin, C.R. and Rutter, W.J. (1986): Proc. Natl. Acad. Sci. U.S.A. 83, 3599-3603.
Brinster, R.L., Allen, J.M., Behringer, R.R., Gelinas, R.E. & Palmiter, R.D. (1988): Proc. Natl. Acad. Sci. U.S.A. 85, 836-840.
Callen, D.F. (1986): Ann. Genet. 29, 235-239.
Campbell, S.M., Rosen, J.M., Hennighausen, L.G., Strech-Jurk, U. and Sippel, A.E. (1984): Nucleic Acids Res. 12, 8685-8697.
Chirgwin, J.M., Przybyla, A.E., MacDonald, R.J. and Rutter, W.J. (1979): Biochemistry 18, 5294-5299.
Clark, A.J., Simons, P., Wilmut, I. and Lathe, R. (1987): TIBTECH 5, 20-24.
Devereux, J., Haeberli, P. and Smithies, O. (1984): Nucleic Acids Res. 12, 387-395.
Feinberg, A. and Vogelstein, B. (1983): Anal. Biochem. 132, 6-13.
Hennighausen, L., Ruiz, L. & Wall, R. (1990): Current Opinion in Biotechnology 1, 74-78.
Hernell, O. & Bläckberg, L. (1982): Pediatr. Res. 16, 882-885.
Hogan, B., Constantini, F. and Lacy, E. (1986): Manipulating the Mouse Embryo. A Laboratory Manual. Cold Spring Harbor Laboratory Press.
Hui, D. and Kissel, J.A. (1990): FEBS Lett. 276, 131-134.
Lombardo, D., Guy, O. & Figarella, C. (1978): Biochim. Biophys. Acta 527, 142-149.
Maniatis, T., Fritsch, E.F. & Sambrook, J.: Molecular Cloning. A Laboratory Manual. (Cold Spring Harbor, NY, 1982)
Mohandas, T., Sparkes, R.S., Sparkes, M.C., Shulkin, J.D., Toomey, K.E. and Funderburk, S.J. (1979): Am. J. Hum. Genet. 31, 586-600.
Mohandas, T., Heinzmann, C., Sparkes, R.S., Wasmuth, J., Edwards, P. and Lusis, A.J. (1986): Somatic Cell. Mol. Genet. 12, 89-94.
Mount, S.M. (1982): Nucleic Acids Res. 10, 459-472.
Nilsson, J., Bläckberg, L., Carlsson, P., Enerbäck, S., Hernell, O. and Bjursell, G. (1990): Eur. J. Biochem. 192, 543-550.
Qasba, M. and Safaya, S.K. (1984): Nature 308, 377-380.
Reue, K., Zambaux, J., Wong, H., Lee, G., Leete, T.H., Ronk, M., Shively, J.E., Sternby, B., Borgström, B., Ameis, D. and Schotz, M.C. (1991): J. Lipid Res. 32, 267-276.
Sambrook, J., Fritsch, E.F. and Maniatis, T.E.: Molecular Cloning. A Laboratory Manual. (Cold Spring Harbor, NY, 1989)
Sanger, F., Nicklen, S. and Coulson, A.R. (1977): Proc. Natl. Acad. Sci. U.S.A. 74, 5463-5467.
Sjöberg, S., Carlsson, P., Enerbäck, S. and Bjursell, G. (1989): Comput. Appl. Biol. Sci. 5, 41-46.
Taggart, R.T., Mohandas, T., Shows, T.B. and Bell, G.I. (1985): Proc. Natl. Acad. Sci. U.S.A. 82, 6240-6244.
Warburton, D., Gersen, S., Yu, M.T., Jackson, C., Handelin, B. and Housman, D. (1990): Genomics 6, 358-366.
Whitelaw et al. (1991): Transgenic Research 1, 3-13.
Yu-Lee, L., Richter-Mann, L., Couch, C., Stewart, F., Mackinlay, G. and Rosen, J. (1986): Nucleic Acids Res. 14, 1883-1902.
Öfverstedt, L.G., Hammarström, K., Balgobin, N., Hjerten, S., Petterson, U. and Chattopadhyaya, J. (1984): Biochim. Biophys. Acta 782, 120-126.

Table 1. Segregation of CEL sequences with human chromosomes in 16 human-rodent somatic cell hybrids

Percentage of cells carrying the human chromosome^a

Hybrid       1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22   X   Y  CEL
GM09925     74  24   0  74  76  60  82  78   0   0   4  68   6  86  78  14  98  96  46  84   0  76   0   0   -
GM09927     69  83  75  77   0  93  79  73   0  82   0   0  77  79  90   0  81  73  87  89   0   0   0   0   -
GM09929      0   0  61  59   0  43   2  49   0   0  33  49   0  59   2   0  96   0   2  31   0   0   2   0   -
GM09930A     0  34  62   4  12   0  26   4   0   0   6  22  56  82  12   0  86  78   0  22  82  76   6   8   -
GM09932      0   0   0  68  86  46   0  80   0   2  28  26   0   0   0   0  96   0   2   0  92   0   0   0   -
GM09933     50   0  84  16  54  76  92  54   0   6   0  50  84  78  92   0  88  70  80  32  94  88   0  32   -
GM09934      0  50   0   0  83  79   4  87   0   0  77  87   0   2  89   0  90  89   0  91  89   2   0   0   -
GM09935A     0   0  52  10  28  12   0   0   0   8   0  22  74  72   0   0  93  59   0   9  91  71   0   0   -
GM09936      0   0   0  18   0  46  70  10   0  16  34   0   2  88   2   0 100   0  44  24   0  18   0   0   -
GM09937      0   0  54  38   0  62  54  70   0   4   0  42   0  70  60   0  96  66   0   0   0   0   0   0   -
GM09938      0   0   2  88  60  88  86   4   0   0  36  92   0  80   4   0  92   0   4  80  76  60   0   2   -
GM09940      0   0  46   0   0   0  84  62   0   0   0   0   0   0  62   0 100   0   0   0   0   0   0   0   -
GM10324      0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  90   0   -
GM10567      0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  98   0   0   0   0   0   0   0   0   -
GM10611      0   0   0   0   0   0   0   0  69   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   +
GM10095      0   0   0   0   0   0   0   0 94b   0   0   0   0   0   0   0   0   0   0   0   0   0 94b   0   -

Discordancy ratio (discordant hybrids over hybrids scored):
Discordant   4   5   8   7   7  10   9   9   0   2   6  10   5  10   7   2  13   8   5   9   7   6   3   2
Scored      16  16  16  16  16  16  16  16  16  16  16  16  16  16  16  16  16  16  16  16  16  16  16  16

^a A human chromosome is usually detected by Southern blot analysis when it is present in at least 20 to 22% of the cells analyzed.
^b Contains 9pter->q34 and Xq13->qter.

Table 2. Primers used for DNA amplification

Primer  Oligonucleotide                    nt position^a   Amplified sequence
P1      5'-AGACCTACGCCTACCTG-3'            8492-8508       Exon 10
P2      5'-TCCAGTAGGCGATCATG-3'            8646-8662
P4      5'-GACCGATGTcCTCTTCCTGG-3'         7220-7239       Exon 10 with flanking exons^b
P5      5'-CAGCCGAGTCGCCCATGTTG-3'         9016-9035
P6      5'-ACCAAGAAGATGGGCAGCAGC-3'        9089-9109       Repeats in exon 11
P7      5'-GACTGCAGGCATCTGAGCTTC-3'        9722-9742

^a Nucleotide positions are numbered from the first base of exon 1. For comparison with SEQ ID NO:1, add 1640 bases to the numbers in the column.
^b "Exon 10" amplified from cDNA.

Table 3. Exon-intron structure of the CEL gene

Exon  Position^a   Length  Amino acid           Sequence at exon-intron junction                        Intron  Length
no.   (nt)         (nt)    positions (number)   5' splice donor .... 3' splice acceptor                 no.     (nt)
1     1-87           87    1-25 (25)            CCC GCG AAG gtaaga....gtgtctccctcgcag CTG GGC CCC       I       2343
2     2431-2581     151    26-75 (50)           TGG CAA G gtggga....tcctgccacctgcag GG ACC CTG          II        85
3     2667-2789     123    76-116 (41)          AAG CAA G gtctgc....gctcccccatctcag TC TCC CGG          III      277
4     3067-3264     198    117-182 (66)         CTG CCA G gtgcgt....ctgccctgcccccag GT AAC TAT          IV      1288
5     4553-4683     131    183-226 (44)         TCT CTG CAG gtctcg....ttctgggtcccgtag ACC CTC TCC       V        177
6     4861-4968     108    227-262 (36)         GCC AAA AAG gtaaac....tggttctgcccccag GTG GCT GAG       VI       142
7     5111-5228     118    263-301 (39)         CTG GAG T gtgagt....ggctctcccacccag AC CCC ATG          VII     1466
8     6695-6881     187    302-364 (63)         GTC ACG GA gtaagc....acttgattcccccag G GAG GAC          VIII     197
9     7079-7282     204    365-432 (68)         AAT GCC AA gtgagg....gtctctcccctccag G AGT GCC          IX      1201
10    8484-8681     198    433-498 (66)         AAA ACA GG gtaaga....cttctcactctgcag G GAC CCC          X        328
11    9010-9850     841    499-745 (247)

^a Nucleotide positions are numbered from the first base of exon 1. For comparison with SEQ ID NO:1, add 1640 bases to the numbers in the column.
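As an illustration of how a discordancy row such as the one in Table 1 is derived, the following Python sketch recounts, for each human chromosome, the hybrids in which presence of the chromosome and the CEL hybridization signal disagree. It is a sketch under stated assumptions, not the procedure used by the inventors: the 20% presence cut-off is taken from footnote a, the panel values are transcribed from Table 1, and the partial chromosome 9 of GM10095 (9pter->q34, footnote b) is assumed to be left out of the scoring for chromosome 9 while its partial X is scored as present. The names `PANEL`, `SKIP` and `discordancy` are hypothetical.

```python
CHROMOSOMES = [str(n) for n in range(1, 23)] + ["X", "Y"]

# Assumption: a chromosome counts as present at >= 20% of cells (footnote a).
PRESENT_AT = 20

# Percent of cells carrying chromosomes 1-22, X, Y (transcribed from Table 1),
# followed by the CEL hybridization result (True = positive).
PANEL = {
    "GM09925": ([74, 24, 0, 74, 76, 60, 82, 78, 0, 0, 4, 68, 6, 86, 78, 14, 98, 96, 46, 84, 0, 76, 0, 0], False),
    "GM09927": ([69, 83, 75, 77, 0, 93, 79, 73, 0, 82, 0, 0, 77, 79, 90, 0, 81, 73, 87, 89, 0, 0, 0, 0], False),
    "GM09929": ([0, 0, 61, 59, 0, 43, 2, 49, 0, 0, 33, 49, 0, 59, 2, 0, 96, 0, 2, 31, 0, 0, 2, 0], False),
    "GM09930A": ([0, 34, 62, 4, 12, 0, 26, 4, 0, 0, 6, 22, 56, 82, 12, 0, 86, 78, 0, 22, 82, 76, 6, 8], False),
    "GM09932": ([0, 0, 0, 68, 86, 46, 0, 80, 0, 2, 28, 26, 0, 0, 0, 0, 96, 0, 2, 0, 92, 0, 0, 0], False),
    "GM09933": ([50, 0, 84, 16, 54, 76, 92, 54, 0, 6, 0, 50, 84, 78, 92, 0, 88, 70, 80, 32, 94, 88, 0, 32], False),
    "GM09934": ([0, 50, 0, 0, 83, 79, 4, 87, 0, 0, 77, 87, 0, 2, 89, 0, 90, 89, 0, 91, 89, 2, 0, 0], False),
    "GM09935A": ([0, 0, 52, 10, 28, 12, 0, 0, 0, 8, 0, 22, 74, 72, 0, 0, 93, 59, 0, 9, 91, 71, 0, 0], False),
    "GM09936": ([0, 0, 0, 18, 0, 46, 70, 10, 0, 16, 34, 0, 2, 88, 2, 0, 100, 0, 44, 24, 0, 18, 0, 0], False),
    "GM09937": ([0, 0, 54, 38, 0, 62, 54, 70, 0, 4, 0, 42, 0, 70, 60, 0, 96, 66, 0, 0, 0, 0, 0, 0], False),
    "GM09938": ([0, 0, 2, 88, 60, 88, 86, 4, 0, 0, 36, 92, 0, 80, 4, 0, 92, 0, 4, 80, 76, 60, 0, 2], False),
    "GM09940": ([0, 0, 46, 0, 0, 0, 84, 62, 0, 0, 0, 0, 0, 0, 62, 0, 100, 0, 0, 0, 0, 0, 0, 0], False),
    "GM10324": ([0] * 22 + [90, 0], False),
    "GM10567": ([0] * 15 + [98] + [0] * 8, False),
    "GM10611": ([0] * 8 + [69] + [0] * 15, True),
    "GM10095": ([0] * 8 + [94] + [0] * 13 + [94, 0], False),
}

# Assumption: the chromosome 9 fragment of GM10095 is not scored (footnote b).
SKIP = {("GM10095", "9")}


def discordancy(panel, skip=frozenset(), present_at=PRESENT_AT):
    """Count, per chromosome, hybrids where presence and CEL signal disagree."""
    counts = {}
    for index, chromosome in enumerate(CHROMOSOMES):
        discordant = 0
        for hybrid, (percents, cel_positive) in panel.items():
            if (hybrid, chromosome) in skip:
                continue
            present = percents[index] >= present_at
            if present != cel_positive:
                discordant += 1
        counts[chromosome] = discordant
    return counts


if __name__ == "__main__":
    for chromosome, count in discordancy(PANEL, SKIP).items():
        print(f"chromosome {chromosome:>2}: {count} discordant hybrids")
```

Under these assumptions only chromosome 9 segregates with the CEL signal without any discordant hybrid, which is the kind of pattern a panel like this is meant to reveal.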

Description of the drawings:

Figure 1. The CEL gene locus. The positions and restriction enzyme maps of the two partly overlapping clones λBSSL1 and λBSSL5A are shown. The exon-intron organization and the restriction enzyme sites used are shown below. Boxes numbered 1-11 represent exons. Asp=Asp700, B=BamHI, E=EcoRI, S=SacI, Sa=SalI, Sp=SphI and X=XbaI. The positions and orientations of Alu repetitive elements are shown by bold arrows. a-h represent the different subcloned fragments.

Figure 2. Primer extension analysis of RNA from human lactating mammary gland, pancreas and adipose tissue. An end-labeled 26-mer oligonucleotide, complementary to nt positions 33 to 58 of the CEL gene, was used to prime reverse transcription of the RNA. Lane A: molecular size marker (sequencing ladder). Lane B: pancreatic RNA. Lane C: adipose tissue RNA. Lane D: lactating mammary gland RNA.

Figure 3. Dot-plot analysis of the 5' flanking regions of the human CEL and rat CEL genes. The homologous regions are labeled A-H, and the sequences representing these regions are written out, the human sequence above and the rat sequence below.

Figure 4. Analysis of the 5' flanking sequence of the human CEL gene. Putative recognition sequences are underlined, or overlined when they lie on the complementary strand. Bold letters indicate the positions of homology to rCEL (regions A-H). The TATA box is marked with a dotted line. Two sequences show 80% similarity to the consensus sequence of the glucocorticoid receptor binding site, GGTACANNNTGTTCT (Beato, M., 1989): the first on the complementary strand at nt position -231 (1A) and the second at nt position -811 (1B). In addition, at nt position -861 (2) there is a sequence with 87% similarity to the consensus sequence of the estrogen receptor binding site, AGGTCANNNTGACCT (Beato, M., 1989).

Lubon and Hennighausen (1987) analyzed the promoter and 5' flanking sequences of the whey acidic protein (WAP) gene and identified binding sites for nuclear proteins of lactating mammary gland. One of these, an 11 bp conserved sequence, AAGAAGGAAGT, is present in a number of the milk protein genes studied, such as the rat α-lactalbumin gene (Qasba et al., 1984) and the rat α-casein gene (Yu-Lee et al., 1986). In the 5' flanking region of the CEL gene, on the complementary strand at nt position -1299 (3), there is a sequence with 82% similarity to this conserved sequence.

In a study of the regulation of the β-casein gene, a tissue-specific mammary gland factor (MGF) was found in nuclear extracts from pregnant or lactating mice, and its recognition sequence (ANTTCTTGGNA) was identified. In the 5' flanking region of the human CEL gene there are two sequences, one on the complementary strand at nt position -368 (4A) and the other at nt position -1095 (4B), that both show 82% similarity to the consensus sequence of the MGF binding site. Besides these two putative MGF binding sites in the 5' flanking region, there is a sequence on the complementary strand at nt position 275 in intron I, AGTTCTTGGCA, that shows 100% identity with the consensus sequence of the MGF binding site.
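The percentage figures quoted for the putative binding sites can be illustrated with a short Python sketch. It assumes that "similarity" means the fraction of consensus positions matched by the candidate site, with N matching any base; the text does not state the scoring rule, so this is an assumption. The function names are hypothetical. The 40-nt fragment is copied from SEQ ID NO:1 positions 1898-1937 (intron I), so that the MGF site reported on the complementary strand can be located.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def similarity(consensus: str, site: str) -> float:
    """Percent of positions where site matches consensus; N matches any base."""
    if len(consensus) != len(site):
        raise ValueError("consensus and site must have the same length")
    matches = sum(
        1 for c, s in zip(consensus.upper(), site.upper()) if c == "N" or c == s
    )
    return 100.0 * matches / len(consensus)


def scan(sequence: str, consensus: str, min_percent: float):
    """Slide the consensus along both strands and report sufficiently similar sites.

    Returns tuples (0-based start on the given strand, strand, site as read
    5'->3' on the matching strand, percent similarity).
    """
    hits = []
    width = len(consensus)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width].upper()
        for strand, site in (("+", window), ("-", reverse_complement(window))):
            percent = similarity(consensus, site)
            if percent >= min_percent:
                hits.append((start, strand, site, percent))
    return hits


MGF_CONSENSUS = "ANTTCTTGGNA"
# SEQ ID NO:1, positions 1898-1937 (part of intron I).
FRAGMENT_START = 1898
FRAGMENT = "TTGAATTTCAGATAAACTGCCAAGAACTGCTGTGTAAGTA"

if __name__ == "__main__":
    for start, strand, site, percent in scan(FRAGMENT, MGF_CONSENSUS, 100.0):
        position = FRAGMENT_START + start
        print(f"{site} on strand {strand} at SEQ ID NO:1 position {position}: {percent:.0f}%")
```

Subtracting the 1640-base offset from the reported position gives the position relative to the start of exon 1. With an 11-nt consensus, two mismatches correspond to 9/11, or about 82%, which is in line with the similarity quoted for sites 4A and 4B.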

In addition, there are four sequences that show 65% similarity to the rat pancreas-specific enhancer element, GTCACCTGTGCTTTTCCCTG (Boulet et al., 1986): one at nt position -359 (5A), the second at nt position -718 (5B), the third at nt position -1140 (5C) and the last at nt position -1277 (5D).

Figure 5. Method for the production of plasmid pS452. For details, see Example 2.

Figure 6. Schematic structure of plasmid pS312.

Figure 7. Schematic structure of plasmid pS452.

Figure 8. Schematic representation of the physical introduction of the human BSSL/CEL genomic structure into the first exon of the WAP gene, as described in Example 2.

Figure 9. A. Schematic representation of the localization of the PCR primers used for identification of transgenic animals. The 5' primer lies within the WAP sequence, starting at position -148 bp upstream of the fusion between WAP and BSSL/CEL. The 3' primer lies in the first BSSL/CEL intron, ending 398 bp downstream of the fusion point. B. The sequences of the PCR primers used. C. Agarose gel showing a typical PCR analysis of potential founder animals. M: molecular weight markers. Lane 1: control PCR product generated from plasmid pS452. Lanes 2-13: PCR reactions performed with DNA preparations from potential founder animals.

Figure 10. Immunoblot analysis of milk from a mouse line transgenic for the recombinant murine WAP/human CEL gene of pS452. The proteins were separated by SDS-PAGE, transferred to Immobilon membranes (Millipore) and visualized with polyclonal rabbit antibodies raised against highly purified human native CEL, followed by alkaline phosphatase-labeled swine anti-rabbit IgG (Dakopatts). Lane 1: low molecular weight markers, 106, 80, 49.5, 32.5, 27.5 and 18.5 kDa, respectively. Lane 2: high molecular weight markers, 205, 116.5, 80 and 49.5 kDa, respectively. Lane 3: 25 ng of purified non-recombinant CEL from human milk. Lane 4: 2 μl of milk, diluted 1:10, from a CEL-transgenic mouse. Lanes 5 and 6: 2 μl of milk, diluted 1:10, from two different non-CEL-transgenic mice, used as control samples.

Sequence Listing

(1) General information:

(i) Applicant:

(A) Name: AB ASTRA

(B) Street: Kvarnbergagatan 16

(C) City: Sodertalje

(E) Country: Sweden

(F) Postal code (ZIP): S-151 85

(G) Telephone: +46-8-553 26000

(H) Telefax: +46-8-553 28820

(I) Telex: 19237 astra S

(ii) Title of invention: New DNA sequences

(iii) Number of sequences: 1

(iv) Computer-readable form:

(A) Medium type: Floppy disk

(B) Computer: IBM PC compatible

(C) Operating system: PC-DOS/MS-DOS

(D) Software: PatentIn Release #1.0, Version #1.25 (EPO)

(vi) Prior application data:

(A) Application number: SE 9201809-2

(B) Filing date: 11 June 1992

(vi) Prior application data:

(A) Application number: SE 9201826-6

(B) Filing date: 12 June 1992

(vi) Prior application data:

(A) Application number: SE 9202088-2

(B) Filing date: 3 July 1992

(vi) Prior application data:

(A) Application number: SE 9300902-5

(B) Filing date: 19 March 1992

(2) Information for SEQ ID NO:1:

(i) Sequence characteristics:

(A) Length: 11531 base pairs

(B) Type: nucleic acid

(C) Strandedness: double

(D) Topology: linear

(ii) Molecule type: DNA (genomic)

(vi) Original source:

(A) Organism: Human

(F) Tissue type: Mammary gland

(ix) Feature:

(A) Name/key: CDS

(B) Location: join(1653..1727, 4071..4221, 4307..4429,

4707..4904, 6193..6323, 6501..6608,

6751..6868, 8335..8521, 8719..8922,

10124..10321, 10650..11394)

(ix) Feature:

(A) Name/key: mat_peptide

(B) Location: join(1722..1727, 4071..4221, 4307..4429,

4707..4904, 6193..6323, 6501..6608,

6751..6868, 8335..8521, 8719..8922,

10124..10321, 10650..11391)

(D) Other information: /EC_number= 3.1.1.1

/product= "Bile salt-stimulated lipase"

(ix) Feature:

(A) Name/key: 5'UTR

(B) Location: 1..1640

(ix) Feature:

(A) Name/key: TATA_signal

(B) Location: 1611..1617

(ix) Feature:

(A) Name/key: exon

(B) Location: 1641..1727

(ix) Feature:

(A) Name/key: exon

(B) Location: 4071..4221

(ix) Feature:

(A) Name/key: exon

(B) Location: 4307..4429

(ix) Feature:

(A) Name/key: exon

(B) Location: 4707..4904

(ix) Feature:

(A) Name/key: exon

(B) Location: 6193..6323

(ix) Feature:

(A) Name/key: exon

(B) Location: 6501..6608

(ix) Feature:

(A) Name/key: exon

(B) Location: 6751..6868

(ix) Feature:

(A) Name/key: exon

(B) Location: 8335..8521

(ix) Feature:

(A) Name/key: exon

(B) Location: 8719..8922

(ix) Feature:

(A) Name/key: exon

(B) Location: 10124..10321

(ix) Feature:

(A) Name/key: exon

(B) Location: 10650..11490

(ix) Feature:

(A) Name/key: 3'UTR

(B) Location: 11491..11531
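The feature coordinates above can be cross-checked against Table 3 with a few lines of Python. The sketch below sums the segments of the CDS and mat_peptide locations, shifts the exon positions of Table 3 by the 1640-base offset given in its footnote, and derives the intron lengths from neighbouring exons. The variable and function names are hypothetical, and the reading that the three bases separating the CDS and mat_peptide end points are a stop codon is an assumption based on the usual feature-table convention.

```python
OFFSET = 1640  # added to Table 3 positions to obtain SEQ ID NO:1 positions

# Segment coordinates (1-based, inclusive) taken from the feature table.
CDS = [
    (1653, 1727), (4071, 4221), (4307, 4429), (4707, 4904), (6193, 6323),
    (6501, 6608), (6751, 6868), (8335, 8521), (8719, 8922),
    (10124, 10321), (10650, 11394),
]
MAT_PEPTIDE = [(1722, 1727)] + CDS[1:-1] + [(10650, 11391)]
EXONS_SEQ = [(1641, 1727)] + CDS[1:-1] + [(10650, 11490)]

# Exon positions as numbered in Table 3 (from the first base of exon 1).
EXONS_TABLE3 = [
    (1, 87), (2431, 2581), (2667, 2789), (3067, 3264), (4553, 4683),
    (4861, 4968), (5111, 5228), (6695, 6881), (7079, 7282),
    (8484, 8681), (9010, 9850),
]


def span_length(segments):
    """Total number of bases covered by inclusive (start, end) segments."""
    return sum(end - start + 1 for start, end in segments)


def shift(segments, offset):
    """Move every segment by a fixed offset."""
    return [(start + offset, end + offset) for start, end in segments]


def intron_lengths(exons):
    """Length of the gap between each pair of consecutive exons."""
    return [nxt[0] - cur[1] - 1 for cur, nxt in zip(exons, exons[1:])]


if __name__ == "__main__":
    cds_nt = span_length(CDS)
    mat_nt = span_length(MAT_PEPTIDE)
    print(f"CDS: {cds_nt} nt = {cds_nt // 3} codons")
    print(f"mat_peptide: {mat_nt} nt = {mat_nt // 3} residues")
    print("Table 3 + offset matches exon features:",
          shift(EXONS_TABLE3, OFFSET) == EXONS_SEQ)
    print("Intron lengths:", intron_lengths(EXONS_TABLE3))
```

The CDS spans 2238 nt, that is 746 codons, which fits the 745 amino acids of Table 3 plus one terminating codon. The mat_peptide spans 2166 nt, or 722 residues, which fits a 23-residue signal peptide numbered -23 to -1.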

(Ⅹⅰ)序列描述：SEQ ID NO:1GGATCCCTCG AACCCAGGAG TTCAAGACTG CAGTGAGCTA TGATTGTGCC ACTGCACTCT 60AGCCTGGGTG ACAGAGACCC TGTCTCAAAA AAACAAACAA ACAAAAAACC TCTGTGGACT 120CCGGGTGATA ATGACATGTC AATGTGGATT CATCAGGTGT TAACAGCTGT ACCCCCTGGT 180GGGGGATGTT GATAACGGGG GAGACTGGAG TGGGGCGAGG ACATACGGGA AATCTCTGTA 240ATCTTCCTCT AATTTTGCTG TGAACCTAAA GCTGCTCTAA AAATGTACAT AGATATAAAC 300TGGGGCCTTC CTTTCCCTCT GCCCTGCCCC AGCCCTCCCC CACCTCCTTC CTCTCCCTGC 360TGCCTCCCCT CTGCCCTCCC CTTTCCTCCT TAGCCACTGT AAATGACACT GCAGCAAAGG 420TCTGAGGCAA ATGCCTTTGC CCTGGGGCGC CCCAGCCACC TGCAGGCCCC TTATTTCCTG 480TGGCCGAGCT CCTCCTCCCA CCCTCCAGTC CTTTCCCCAG CCTCCCTCGC CCACTAGGCC 540TCCTGAATTG CTGGCACCGG CTGTGGTCGA CAGACAGAGG GACAGACGTG GCTCTGCAGG 600TCCACTCGGT CCCTGGCACC GGCCGCAGGG GTGGCAGAAC GGGAGTGTGG TTGGTGTGGG 660AAGCACAGGC CCCAGTGTCT CCTGGGGGAC TGTTGGGTGG GAAGGCTCTG GCTGCCCTCA 720CCCTGTTCCC ATCACTGCAG AGGGCTGTGC GGTGGCTGGA GCTGCCACTG AGTGTCTCGG 780TGAGGGTGAC CTCACACTGG CTGAGCTTAA AGGCCCCATC TGAAGACTTT GTTCGTCGTG 840TTCTTTCACT TCTCAGAGCC TTTCCTGGCT CCAGGATTAA TACCTGTTCA CAGAAAATAC 900GAGTCGCCTC CTCCTCCACA ACCTCACACG ACCTTCTCCC TTCCCTCCCG CTGGCCTCTT 960TCCCTCCCCT TCTGTCACTC TGCCTGGGCA TGCCCCAGGG CCTCGGCTGG GCCCTTTGTT 1020TCCACAGGGA AACCTACATG GTTGGGCTAG ATGCCTCCGC ACCCCCCCAC CCACACCCCC 1080TGAGCCTCTA GTCCTCCCTC CCAGGACACA TCAGGCTGGA TGGTGACACT TCCACACCCT 1140TGAGTGGGAC TGCCTTGTGC TGCTCTGGGA TTCGCACCCA GCTTGGACTA CCCGCTCCAC 1200GGGCCCCAGG AAAAGCTCGT ACAGATAAGG TCAGCCACAT GAGTGGAGGG CCTGCAGCAT 1260GCTGCCCTTT CTGTCCCAGA AGTCACGTGC TCGGTCCCCT CTGAAGCCCC TTTGGGGACC 1320TAGGGGACAA GCAGGGCATG GAGACATGGA GACAAAGTAT GCCCTTTTCT CTGACAGTGA 1380CACCAAGCCC TGTGAACAAA CCAGAAGGCA GGGCACTGTG CACCCTGCCC GGCCCCACCA 1440TCCCCCTTAC CACCCGCCAC CTTGCCACCT GCCTCTGCTC CCAGGTAAGT GGTAACCTGC 1500ACAGGTGCAC TGTGGGTTTG GGGAAAACTG GATCTCCCTG CACCTGAGGG GGTAGAGGGG 1560AGGGAGTGCC TGAGAGCTCA TGAACAAGCA TGTGACCTTG GATCCAGCTC CATAAATACC 1620CGAGGCCCAG GGGGAGGGCC ACCCAGAGGC TG ATG CTC ACC ATG GGG CGC CTG 1673(Ⅹⅰ)序列描述：SEQ ID NO:1GGATCCCTCG 
AACCCAGGAG TTCAAGACTG CAGTGAGCTA TGATTGTGCC ACTGCACTCT 60AGCCTGGGTG ACAGAGACCC TGTCTCAAAA AAACAAACAA ACAAAAAACC TCTGTGGACT 120CCGGGTGATA ATGACATGTC AATGTGGATT CATCAGGTGT TAACAGCTGT ACCCCCTGGT 180GGGGGATGTT GATAACGGGG GAGACTGGAG TGGGGCGAGG ACATACGGGA AATCTCTGTA 240ATCTTCCTCT AATTTTGCTG TGAACCTAAA GCTGCTCTAA AAATGTACAT AGATATAAAC 300TGGGGCCTTC CTTTCCCTCT GCCCTGCCCC AGCCCTCCCC CACCTCCTTC CTCTCCCTGC 360TGCCTCCCCT CTGCCCTCCC CTTTCCTCCT TAGCCACTGT AAATGACACT GCAGCAAAGG 420TCTGAGGCAA ATGCCTTTGC CCTGGGGCGC CCCAGCCACC TGCAGGCCCC TTATTTCCTG 480TGGCCGAGCT CCTCCTCCCA CCCTCCAGTC CTTTCCCCAG CCTCCCTCGC CCACTAGGCC 540TCCTGAATTG CTGGCACCGG CTGTGGTCGA CAGACAGAGG GACAGACGTG GCTCTGCAGG 600TCCACTCGGT CCCTGGCACC GGCCGCAGGG GTGGCAGAAC GGGAGTGTGG TTGGTGTGGG 660AAGCACAGGC CCCAGTGTCT CCTGGGGGAC TGTTGGGTGG GAAGGCTCTG GCTGCCCTCA 720CCCTGTTCCC ATCACTGCAG AGGGCTGTGC GGTGGCTGGA GCTGCCACTG AGTGTCTCGG 780TGAGGGTGAC CTCACACTGG CTGAGCTTAA AGGCCCCATC TGAAGACTTT GTTCGTCGTG 840TTCTTTCACT TCTCAGAGCC TTTCCTGGCT CCAGGATTAA TACCTGTTCA CAGAAAATAC 900GAGTCGCCTC CTCCTCCACA ACCTCACACG ACCTTCTCCC TTCCCTCCCG CTGGCCTCTT 960TCCCTCCCCT TCTGTCACTC TGCCTGGGCA TGCCCCAGGG CCTCGGCTGG GCCCTTTGTT 1020TCCACAGGGA AACCTACATG GTTGGGCTAG ATGCCTCCGC ACCCCCCCAC CCACACCCCC 1080TGAGCCTCTA GTCCTCCCTC CCAGGACACA TCAGGCTGGA TGGTGACACT TCCACACCCT 1140TGAGTGGGAC TGCCTTGTGC TGCTCTGGGA TTCGCACCCA GCTTGGACTA CCCGCTCCAC 1200GGGCCCCAGG AAAAGCTCGT ACAGATAAGG TCAGCCACAT GAGTGGAGGG CCTGCAGCAT 1260GCTGCCCTTT CTGTCCCAGA AGTCACGTGC TCGGTCCCCT CTGAAGCCCC TTTGGGGACC 1320TAGGGGACAA GCAGGGCATG GAGACATGGA GACAAAGTAT GCCCTTTTCT CTGACAGTGA 1380CACCAAGCCC TGTGAACAAA CCAGAAGGCA GGGCACTGTG CACCCTGCCC GGCCCCACCA 1440TCCCCCTTAC CACCCGCCAC CTTGCCACCT GCCTCTGCTC CCAGGTAAGT GGTAACCTGC 1500ACAGGTGCAC TGTGGGTTTG GGGAAAACTG GATCTCCCTG CACCTGAGGG GGTAGAGGGG 1560AGGGAGTGCC TGAGAGCTCA TGAACAAGCA TGTGACCTTG GATCCAGCTC CATAAATACC 1620CGAGGCCCAG GGGGAGGGCC ACCCAGAGGC TG ATG CTC ACC ATG GGG CGC CTG 1673

Met Leu Thr Met Gly Arg Leu ,

-23 -20CAA CTG GTT GTG TTG GGC CTC ACC TGC TGC TGG GCA GTG GCG AGT GCC 1721Gln Leu Val Val Leu Gly Leu Thr Cys Cys Trp Ala Val Ala Ser Ala-23 -20CAA CTG GTG TTG GGC CTC CTC ACC TGC TGC TGG GCA GCG GCG AGT GCC 1721GLN Leu Val Leu GLY

-15 -10 -5GCG AAG GTAAGAGCCC AGCAGAGGGG CAGGTCCTGC TGCTCTCTCG CTCAATCAGA 1777Ala Lys1TCTGGAAACT TCGGGCCAGG CTGAGAAAGA GCCCAGCACA GCCCCCCAGC AGATCCCGGG 1837-15 -10-5GCG AAG GTAGCCCCCCCCCCAGGGGGGGGGGTCTGC TGCTCGCG CTCAATCAGA 1777ALA LYS1TCTGGAACT TCGGGGCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCGGGGGGGGGGGGGGGGGGGGGGG

CACTCACGCT CATTTCTATG GGGACAGGTG CCAGGTAGAA CACAGGATGC CCAATTCCAT 1897
TTGAATTTCA GATAAACTGC CAAGAACTGC TGTGTAAGTA TGTCCCATGC AATATTTGAA 1957
ACAAATTTCT ATGGGCCGGG CGCAGTGGCT CACACCTGCA ATCCCACCAG TTTGGGAGGC 2017
CGAGGTGGGT GGATCACTTG AGGTCAGGAG TTGGAGACCA GCCTGGCCAA CATC GTGAAA 2077
CCCCGTCTCT ACTAAAAATA CAAATATTAA TCGGGCGTGG TGGTGGGTGC CTGTAATCCC 2137
AGCTACTCGG GAGGCTGAGG CAGGAGAACC GCTTGAAGCT GGGAGGTGGA GATTGCGGTG 2197
AGCTGAGATC ACGCTACTGC ACTCCAGCCT GGGTGACAGG GCGAGACTCT GTCTCAAAAA 2257
ATAGAAAAAG AAAAAAATGA AACATACTAA AAAACAATTC ACTGTTTACC TGAAATTCAA 2317
ATGTAACTGG GCCTCTTGAA TTTACATTTG CTAATCCTGG TGATTCCACC TACCAACCTC 2377
TCTGTTGTTC CCATTTTACA GAAGGGGAAA CGGGCCCAGG GGCAGGGAGT GTGGAGAGCA 2437
GGCAGACGGG TGGAGAGAAG CAGGCAGGCA GTTTGCCCAG CATGGCACAG CTGCTGCCTC 2497
CTATTCCTGT GCAGGAAGCT GAAAGCCGGG CTACTCCACA CCCGGGTCCG GGTCCCTCCA 2557
GAAAGAGAGC CGGCAGGCAG GAGCTCTCTC GAGGCATCCA TAAATTCTAC CCTCTCTGCC 2617
TGTGAAGGAG AAGCCACAGA AACCCCAAGC CCCACAGGAA GCCGGTGTCG GTGCCCGGCC 2677
CAGTCCCTGC CCCCAGCAGG AGTCACACAG GGGACCCCAG ATCCCAACCA CGCTGTTCTG 2737
CTGCCTGCGG TGTCTCAGGC CCTGGGGACT CCTGTCTCCA CCTCTGCTGC CTGCTCTCCA 2797

CACTCCCTGG CCCTGGGACC GGGAGGTTTG GGCAGTGGTC TTGGGCTCCT GACTCAAAGG 2857

AGAGGTCACC TTCTTCTTGG GCGAGCTCTT CTTGGGGTGC TGAGAGGCCT TCGGCAGGTC 2917
ATCACGACCC CTCCCCATTT CCCCACCCTG AGGCCCTCTG GCCAGTCTCA ATTGCACAGG 2977
GATCACGCCA CTGGCACAAG GAGACACAGA TGCCTCGCAG GGGATGCCCA CGATGCCTGC 3037
ATGTGTTGCT TCTGGTTCCT TTCCTCCAGT TCCAACCGCC GCACTCTCCC ACACCAGTGT 3097
GACAGGGGGC CCATCACCCT AGACTTCAGA GGGCTGCTGG GACCCTGGCT GGGCCTGGGG 3157
GTGTAGGGCC ACCCTGCCCT TCCCCACCTG GAACCTGGCA CAGGTGACAG CCAGCAAGCA 3217
ATGACCTGGT CCCACCATGC ACCACGGGAA GAGGGAGCTG CTGCCCAAGA TGGACAGGAG 3277
GTGGCACTGG GGCAGACAGC TGCTTCTCAA CAGGGTGACT TCAAGCCCAA AAGCTGCCCA 3337
GCCTCAGTTC CGTCAGGGAC AGAGGGTCGGA TGAGCACCAA CCTCCAGGCC CCTCGTGGGG 3397
GTGGACAGCT TGGTGCACAG AGGCCATTTT CATGGCACAG GGAAGCGTGG CGGGGGTGGG 3457
AGGTGTCGTC CCTAGGGGGT TCTTTACCAG CAGGGGGCTC AGGAACTGTG GGGACTTGGG 3517
CATGGGGCCA TCGACTTTGT GCCCAGCCAG CTAGGCCCTG TGCAGGGAGA TGGGAGGAGG 3577
GAAAAGCAGG CCCCACCCCT CAGAAAGGAG GAAGGTTGGT GTGAAACATC CCGGGTACAC 3637
TGAGCATTGG GTACACTCCT CCCGGGAGCT GGACAGGCCT CCCATGTGAT GGCAAACAGG 3697
CCGACAGGAG ACACGGCTGT TGCTCGTCTT CCACATGGGG AAACTGAGGA TCGGAGTCAA 3757
AGCTGGGCGG CCATAGCCAG AACCCAAACC TCCATCCCAC CTCTTGGCCG GCTTCCCTAG 3817
TGGGAACACT GGTTGAACCA GTTTCCTCTA AGATTCTGGG AGCAGGACAC CCCCAGGGAT 3877
AAGGAGAGGA ACAGGAATCC TAAAGCCCTG AGCATTGCAG GGCAGGGGGT GCTGCCTGGG 3937
TCTCCTGTGC AGAGCTGTCC TGCTTTGAAG CTGTCTTTGC CTCTGGCCAC GCGGAGTCGG 3997
CTTGCCTTGC CCCCTCCGGA TTCAGGCCGA TGCGGCTTGA GCCCCCCTGA CCCTGCCCGT 4057
GTCTCCCTCG CAG CTG GGC GCC GTG TAC ACA GAA GGT GGG TTC GTG GAA 4106

Leu Gly Ala Val Tyr Thr Glu Gly Gly Phe Val Glu

5 10
GGC GTC AAT AAG AAG CTC GGC CTC CTG GGT GAC TCT GTG GAC ATC TTC 4154
Gly Val Asn Lys Lys Leu Gly Leu Leu Gly Asp Ser Val Asp Ile Phe
15 20 25 30
AAG GGC ATC CCC TTC GCA GCT CCC ACC AAG GCC CTG GAA AAT CCT CAG 4202
Lys Gly Ile Pro Phe Ala Ala Pro Thr Lys Ala Leu Glu Asn Pro Gln

35 40 45
CCA CAT CCT GGC TGG CAA G GTGGGAGTGG GTGGTGCCGG ACTGGCCCTG 4251
Pro His Pro Gly Trp Gln

50
CGGCGGGGCG GGTGAGGGCG GCTGCCTTCC TCATGCCAAC TCCTGCCACC TGCAG GG 4308

Gly
ACC CTG AAG GCC AAG AAC TTC AAG AAG AGA TGC CTG CAG GCC ACC ATC 4358
Thr Leu Lys Ala Lys Asn Phe Lys Lys Arg Cys Leu Gln Ala Thr Ile

55 60 65
ACC CAG GAC AGC ACC TAC GGG GAT GAA GAC TGC CTG TAC CTC AAC ATT 4404
Thr Gln Asp Ser Thr Tyr Gly Asp Glu Asp Cys Leu Tyr Leu Asn Ile
70 75 80 85
TGG GTG CCC CAG GGC AGG AAG CAA G GTCTGCCTCC CCTCTACTCC 4449
Trp Val Pro Gln Gly Arg Lys Gln

90
CCAAGGGACC CTCCCATGCA GCCACTGCCC CGGGTCTACT CCTGGCTTGA GTCTGGGGGC 4509
TGCAAAGCTG AACTTCCATG AAATCCCACA GAGGCGGGGA GGGGAGCGCC CACTGCCGTT 4569
GCCCAGCCTG GGGCAGGGCA GCGCCTTGGA GCACCTCCCT GTCTTGGCCC CAGGCACCTG 4629
CTGCACAGGG ACAGGGGACC GGCTGGAGAC AGGGCCAGGC GGGGCGTCTG GGGTCACCAG 4689
CCGCTCCCCC ATCTCAG TC TCC CGG GAC CTG GCC GTT ATG ATC TGG ATC 4738

Val Ser Arg Asp Leu Pro Val Met Ile Trp Ile

95 100
TAT GGA GGC GCC TTC CTC ATG GGG TCC GGC CAT GGG GCC AAC TTC CTC 4786
Tyr Gly Gly Ala Phe Leu Met Gly Ser Gly His Gly Ala Asn Phe Leu
105 110 115 120
AAC AAC TAC CTG TAT GAC GGC GAG GAG ATC GCC ACA CGC GGA AAC GTC 4834
Asn Asn Tyr Leu Tyr Asp Gly Glu Glu Ile Ala Thr Arg Gly Asn Val

125 130 135
ATC GTG GTC ACC TTC AAC TAC CGT GTC GGC CCC CTT GGG TTC CTC AGC 4882
Ile Val Val Thr Phe Asn Tyr Arg Val Gly Pro Leu Gly Phe Leu Ser

140 145 150
ACT GGG GAC GCC AAT CTG CCA G GTGCGTGGGT GCCTTCGGCC CTGAGGTGGG 4934
Thr Gly Asp Ala Asn Leu Pro

155
GCGACCAGCA TGCTGAGCCC AGCAGGGAGA TTTTCCTCAG CACCCCTCAC CCCAAACAAC 4994
CAGTGGCGGT TCACAGAAAG ACCCGGAAGC TGGAGTAGAA TCATGAGATG CAGGAGGCCC 5054
TTGGTAGCTG TAGTAAAATA AAAGATGCTG CAGAGGCCGG GAGAGATGGC TCACGCCTGT 5114
AATCCCAGCA CTTTAGGAGG CCCACACAGG TGGGTCACTT GAGCGCAGAA GTTCAAGACC 5174
AGCCTGAAAA TCACTGGGAG ACCCCCATCT CTACACAAAA ATTAAAAATT AGCTGGGGAC 5234
TGGGCGCGGC GGCTCACCTC TGTAATCCCA GCACGTTGGG AGCCCAAGGT GGGTAGATCA 5294
CCTGAGGTCA GGAGTTTGAG ACCAGCCTGA CTAAAATGGA GAAACCTCTT CTCTACTAAA 5354
AATACAAAAT TAGCCAGGCG TGGTGGCGCT TGCCTGTAAT CCCAGCTACT CGGGAGGCTG 5414
AGGCAGGAGA ATCGCTTGAA CTCAGGAGGC GGAGGTTGCG GTGAGCCGAG ATCATGCCAC 5474
TGCACTCCAG CCTGGAGAAC AAGAGTAAAA CTCTGTCTCA AAAAAAAAAA AAAAAAAAAA 5534
ATAGCCAGGC GTGGTATCTC ATGCCTCTGT CCTCAGCTAC CTGGGAGGCA GAGGTGGAAG 5594
GATCGCTTGA GCCCAGGGGT TCAAAGCTGC AGTGAGCCGT GGTCGTGCCA CTGCACTCCA 5654
GCCTGGGCGA CAGAGTGAGG CCCCATCTCA AAAATAAGAG GCTGTGGGAC AGACAGACAG 5714
GCAGACAGGC TGAGGCTCAG AGAGAAACCA GGAGAGCAGA GCTGAGTGAG AGACAGAGAA 5774
CAATACCTTG AGGCAGAGAC AGCTGTGGAC ACAGAAGTGG CAGGACACAG ACAGGAGGGA 5834
CTGGGGCAGG GGCAGGAGAG GTGCATGGGC CTGACCATCC TGCCCCCGAC AAACACCACC 5894
CCCTCCAGCA CCACACCAAC CCAACCTCCT GGGGACCCAC CCCATACAGC ACCGCACCCG 5954
ACTCAGCCTC CTGGGACCCA CCCACTCCAG CAACCAACGT GACCTAGTCT CCTGGGACCC 6014
ACccccTCCA GCACCCTACC CGACCCAGCT TCTTAGGGAC CCACCATTTG CCAACTGGGC 6074
TCTGCCATGG CCCCAACTCT GTTGAGGGCA TTTCCACCCC ACCTATGCTG ATCTCCCCTC 6134
CTGGAGGCCA GGCCTGGGCC ACTGGTCTCT AGCACCCCCT CCCCTGCCCT GCCCCCAG GT 6194

Gly

160
AAC TAT GGC CTT CGG GAT CAG CAC ATG GCC ATT GCT TGG GTG AAG AGG 6242
Asn Tyr Gly Leu Arg Asp Gln His Met Ala Ile Ala Trp Val Lys Arg

165 170 175
AAT ATC GCG GCC TTC GGG GGG GAC CCC AAC AAC ATC ACG CTC TTC GGG 6290
Asn Ile Ala Ala Phe Gly Gly Asp Pro Asn Asn Ile Thr Leu Phe Gly

180 185 190
GAG TCT GCT GGA GGT GCC AGC GTC TCT CTG CAG GTCTCGGGAT CCCTGTGGGG 6343
Glu Ser Ala Gly Gly Ala Ser Val Ser Leu Gln

195 200
AGGGCCTGCC CCACAGGTTG AGAGGAAGCT CAAACGGGAA GGGGAGGGTG GGAGGAGGAG 6403
CGTGGAGCTG GGGCTGTGGT GCTGGGGTGT CCTTGTCCCA GCGTGGGGTG GGCAGAGTGG 6463
GGAGCGGCCT TGGTGACGGG ATTTCTGGGT CCCGTAG ACC CTC TCC CCC TAC AAC 6518

Thr Leu Ser Pro Tyr Asn

205
AAG GGC CTC ATC CGG CGA GCC ATC AGC CAG AGC GGC GTG GCC CTG AGT 6566
Lys Gly Leu Ile Arg Arg Ala Ile Ser Gln Ser Gly Val Ala Leu Ser
210 215 220 225
CCC TGG GTC ATC CAG AAA AAC CCA CTC TTC TGG GCC AAA AAG 6608
Pro Trp Val Ile Gln Lys Asn Pro Leu Phe Trp Ala Lys Lys

230 235
GTAAACGGAG GAGGGCAGGG CTGGGCGGGG TGGGGGCTGT CCACATTTCC GTTCTTTATC 6668
CTGGACCCCA TCCTTGCCTT CAAATGGTTC TGAGcCCTGA GCTCCGGCCT CACCTACCTG 6728
CTGGCCTTGG TTCTGCCCCC AG GTG GCT GAG AAG GTG GGT TGC CCT GTG GGT 6780

Val Ala Glu Lys Val Gly Cys Pro Val Gly

240 245
GAT GCC GCC AGG ATG GCC CAG TGT CTG AAG GTT ACT GAT CCC CGA GCC 6828
Asp Ala Ala Arg Met Ala Gln Cys Leu Lys Val Thr Asp Pro Arg Ala
250 255 260 265
CTG ACG CTG GCC TAT AAG GTG CCG CTG GCA GGC CTG GAG T GTGAGTAGCT 6878
Leu Thr Leu Ala Tyr Lys Val Pro Leu Ala Gly Leu Glu

270 275
GCTCGGGTTG GCCCATGGGG TCTCGAGGTG GGGGTTGAGG GGGGTACTGC CAGGGAGTAC 6938
TCCGGAGGAG AGAGGAAGGT GCCAGAGCTG CGGTCTTGTC CTGTCACCAA CTAGCTGGTG 6998
TCTCCCCTCG AAGGCCCCAG CTGTAAGGGA GAGGGGGTGC CGTTTCTTCT TTTTTTTTGA 7058
GATGGAGTCT CACTGTTGCC CAGGCTGGAG TGCAGTGTCA CGATCTCAGC TCACTGCAAC 7118
CTCCACCTCC TGGGTTCAAG TGATTCTCTG ACTCAACCTC CCATGTAGCT GGGACTACAG 7178
GCACATGCCA CCATGCCCAG ATAATTTTTC TGTGTGTTTA GTAGGGATGG AGTTTCATCG 7238
TGTTAGCTAG GATGATCTCG GTCTTGGGAC CTCATGATCT GCCCACCTCG GCCTCCCAAA 7298
GTGCTGGAAT TACAGGCGTG AGCCACTGTG CCCGGCCCCT TCTTTATTCT TATCTCCCAT 7358
GAGTTACAGA CTCCCCTTTG AGAAGCTGAT GAACATTTGG GGCCCCCTCC CCCACCTCAT 7418
GCATTCATAT GCAGTCATTT GCATATAATT TTAGGGAGAC TCATAGACCT CAGACCAAGA 7478
GCCTTTGTGC TAGATGACCG TTCATTCATT CGTTCATTCA TTCAGCAAAC ATTTACTGAA 7538
CCGTAGCACT GGGGCCCAGC CTCCAGCTCC ACTATTCTGT ACCCCGGGAA GGCCTGGGGA 7598
CCCATTCCAC AAACACCTCT GCATGTCAGC CTTACCAGCT TGCTACGCTA AGGCTGTCCC 7658
TCACTCATTC TTCTATGGCA ACATGCCATG AAGCCAAGTC ATCTGCACGT TTACCTGACA 7718
TGAGCTCAAC TGCACGGGCT GGACAAGCCC AAACAAAGCA ACCCCCACGG CCCCGCTAGA 7778
AGCAAAACCT GCTGTGCTGG GCCCAGTGAC AGCCAGGCCC CGCCTGCCTC AGCAGCCACT 7838
GGGTCCTCTA GGGGCCCGTC CAGGGGTCTG GAGTACAATG CAGACCTCCC ACCATTTTTG 7898
GCTGATGGAC TGGAACCCAG CCCTGAGAGA GGGAGCTCCT TCTCCATCAG TTCCCTCAGT 7958
GGCTTCTAAG TTTCCTCCTT CCTGCTTCAG GCCCAGCAAA GAGAGAGAGG AGAGGGAGGG 8018
GCTGCCGCTG AAGAGGACAG ATCTGGCCCT AGACAGTGAC TCTCAGCCTG GGGACGTGTG 8078
GCAGGGCCTG GAGACATCTG TGATTGTCAC AGCTGGGGAG GGGGTGCTCC TGGCACCTCG 8138
TGGGTCGAGG CCGGGGATGC TCTAAACATC CTACAGGGCA CAGGATGCCC CTGATGGTGC 8198
AGAATCAACC CTGCCCCAAG TGTCCATAGA TCAGAGAAGG GAGGACATAG CCAATTCCAG 8258
CCCTGAGAGG CAAGGGGCGG CTCAGGGGAA ACTGGGAGGT ACAAGAACCT GCTAACCTGC 8318
TGGCTCTCCC ACCCAG AC CCC ATG CTG CAC TAT GTG GGC TTC GTC CCT 8366

Tyr Pro Met Leu His Tyr Val Gly Phe Val Pro

280 285
GTC ATT GAT GGA GAC TTC ATC CCC GCT GAC CCG ATC AAC CTG TAC GCC 8414
Val Ile Asp Gly Asp Phe Ile Pro Ala Asp Pro Ile Asn Leu Tyr Ala
290 295 300 305
AAC GCC GCC GAC ATC GAC TAT ATA GCA GGC ACC AAC AAC ATG GAC GGC 8462
Asn Ala Ala Asp Ile Asp Tyr Ile Ala Gly Thr Asn Asn Met Asp Gly

310 315 320
CAC ATC TTC GCC AGC ATC GAC ATG CCT GCC ATC AAC AAG GGC AAC AAG 8510
His Ile Phe Ala Ser Ile Asp Met Pro Ala Ile Asn Lys Gly Asn Lys

325 330 335
AAA GTC ACG GA GTAAGCAGGG GGCACAGGAC TCAGGGGCGA CCCGTGCGGG 8561
Lys Val Thr Glu

340
AGGGCCGCCG GGAAAGCACT GGCGAGGGGG CCAGCCTGGA GGAGGAAGGC ATTGAGTGGA 8621
GGACTGGGAG TGAGGAAGTT AGCACCGGTC GGGGTGAGTA TGCACACACC TTCCTGTTGG 8681
CACAGGCTGA GTGTCAGTGC CTACTTGATT CCCCCAG G GAG GAC TTC TAC AAG 8734

Glu Asp Phe Tyr Lys

345
CTG GTC AGT GAG TTC ACA ATC ACC AAG GGG CTC AGA GGC GCC AAG ACG 8782
Leu Val Ser Glu Phe Thr Ile Thr Lys Gly Leu Arg Gly Ala Lys Thr

350 355 360
ACC TTT GAT GTC TAC ACC GAG TCC TGG GCC CAG GAC CCA TCC CAG GAG 8830
Thr Phe Asp Val Tyr Thr Glu Ser Trp Ala Gln Asp Pro Ser Gln Glu

365 370 375
AAT AAG AAG AAG ACT GTG GTG GAC TTT GAG ACC GAT GTC CTC TTC CTG 8878
Asn Lys Lys Lys Thr Val Val Asp Phe Glu Thr Asp Val Leu Phe Leu

380 385 390
GTG CCC ACC GAG ATT GCC CTA GCC CAG CAC AGA GCC AAT GCC AA 8922
Val Pro Thr Glu Ile Ala Leu Ala Gln His Arg Ala Asn Ala Lys
395 400 405
GTGAGGATCT GGGCAGCGGG TGGCTCCTGG GGGCCTTCCT GGGGTGCTGC ACCTTCCAGC 8982
CGAGGCCTCG CTGTGGGTGG CTCTCAGGTG TCTGGGTTGT CTCGGAAAGT GGTGCTTGAG 9042
TCCCCACCTG TGCCTGCCTG ATCCACTTTG CTGAGGCCTG GCAAGACTTG AGGGCCTCTT 9102
TTTACCTCCC AGCCTACAGG GCTTTACAAA CCCTATGATC CTCTGCCCTG CTCAGCCCTG 9162
CACCCCATGG TCCTTCCCAC TGGAGAGTTC TTGAGCTACC TTCCATCCCC CATGCTGTGT 9222
GCACTGAGAG AACACTGGAC AATAGTTTCT ATCCACTGAC TCTTATGGGC CTCAACTTTG 9282

CCCATAATTT CAGCCCACCA CCACATTAAA AATCTTCATG TAATAATAGC CAATTATAAT 9342
AAAAAATAAG GCCAGACACA GTAGCTCATG CCTGTAATCC CAGCACATTG GGAGGTCAAG 9402
GTGGGAGGAT CACTTGAGGT CAGGAGTCTG AGACTAGTCT GGCCAACATG GCAAAACCCC 9462
ATCTCTACTA AAAATACAAA AATTATCCAG GCATGGTGGT GCATGCCTAT AATCCTAGCT 9522
ACTCAGGAGG CTGAGGTAGC AGAATTGATT GACCCAGGGA GGTGGAGGTT GCAGTGAGCC 9582
GAGATTACGC CACTGCACTC CAGCAGGGGC AACAGAGTGA GACTGTGTCT CGAATAAATA 9642
AGTAAATAAA TAATAAAAAT AAAAAATAAG TTAGGAATAC GAAAAAGATA GGAAGATAAA 9702
AGTATACCTA GAAGTCTAGG ATGAAAGCTT TGCAGCAACT AAGCAGTACA TTTAGCTGTG 9762
AGCCTCCTTT CAGTCAAGGC AAAAAGGGAA ACAGTTGAGG GCCTATACCT TGTCCAATCT 9822
AATTGAAGAA TGCACATTCA CTTGGAGAGC AAAATATTTC TTGATACTGA ATTCTAGAAG 9882
GAAGGTGCCT CACAATGTTT TGTGGAGGTG AAGTATAAAT TCAGCTGAAA TTGTGGAACC 9942
CATGAATCCA TGAATTTGGT TCTCAGCTTT CCCTTCCCTG GGTGTAAGAA GCCCCATCTC 10002
TTCATGTGAA TTCCCCAGAC ACTTCCCTGC CCACTGCCCG GGACCTCCCT CCAAGTCCGG 10062
TCTCTGGGCT GATCGGTCCC CAGTGAGCAC CCTGCCTACT TGGGTGGTCT CTCCCCTCCA 10122
G G AGT GCC AAG ACC TAC GCC TAC CTG TTT TCC CAT CCC TCT CGG ATG 10169

Ser Ala Lys Thr Tyr Ala Tyr Leu Phe Ser His Pro Ser Arg Met

410 415 420
CCC GTC TAC CCC AAA TGG GTG GGG GCC GAC CAT GCA GAT GAC ATT CAG 10217
Pro Val Tyr Pro Lys Trp Val Gly Ala Asp His Ala Asp Asp Ile Gln
425 430 435 440
TAC GTT TTC GGG AAG CCC TTC GCC ACC CCC ACG GGC TAC CGG CCC CAA 10265
Tyr Val Phe Gly Lys Pro Phe Ala Thr Pro Thr Gly Tyr Arg Pro Gln

445 450 455
GAC AGG ACA GTC TCT AAG GCC ATG ATC GCC TAC TGG ACC AAC TTT GCC 10313
Asp Arg Thr Val Ser Lys Ala Met Ile Ala Tyr Trp Thr Asn Phe Ala

460 465 470
AAA ACA GG GTAAGACGTG GGTTGAGTGC AGGGCGGAGG GCCACAGCCG 10361
Lys Thr Gly

475
AGAAGGGCCT CCCACCACGA GGCCTTGTTC CCTCATTTGC CAGTGGAGGG ACTTTGGGCA 10421
AGTCACTTAA CCTCCCCCTG CATCGGAATC CATGTGTGTT TGAGGATGAG AGTTACTGGC 10481
AGAGCCCCAA GCCCATGCAC GTGCACAGCC AGTGCCCAGT ATGCAGTGAG GGGCATGGTG 10541
CCCAGGGCCA GCTCAGAGGG CGGGGATGGC TCAGGCGTGC AGGTGGAGAG CAGGGCTTCA 10601
GCCCCCTGGG AGTCCCCAGC CCCTGCACAG CCTCTTCTCA CTCTGCAG G GAC CCC 10656

Asp Pro
AAC ATG GGC GAC TCG GCT GTG CCC ACA CAC TGG GAA CCC TAC ACT ACG 10704
Asn Met Gly Asp Ser Ala Val Pro Thr His Trp Glu Pro Tyr Thr Thr

480 485 490
GAA AAC AGC GGC TAC CTG GAG ATC ACC AAG AAG ATG GGC AGC AGC TCC 10752
Glu Asn Ser Gly Tyr Leu Glu Ile Thr Lys Lys Met Gly Ser Ser Ser

495 500 505
ATG AAG CGG AGC CTG AGA ACC AAC TTC CTG CGC TAC TGG ACC CTC ACC 10800
Met Lys Arg Ser Leu Arg Thr Asn Phe Leu Arg Tyr Trp Thr Leu Thr
510 515 520 525
TAT CTG GCG CTG CCC ACA GTG ACC GAC CAG GAG GCC ACC CCT GTG CCC 10848
Tyr Leu Ala Leu Pro Thr Val Thr Asp Gln Glu Ala Thr Pro Val Pro

530 535 540
CCC ACA GGG GAC TCC GAG GCC ACT CCC GTG CCC CCC ACG GGT GAC TCC 10896
Pro Thr Gly Asp Ser Glu Ala Thr Pro Val Pro Pro Thr Gly Asp Ser

545 550 555
GAG ACC GCC CCC GTG CCG CCC ACG GGT GAC TCC GGG GCC CCC CCC GTG 10944
Glu Thr Ala Pro Val Pro Pro Thr Gly Asp Ser Gly Ala Pro Pro Val

560 565 570
CCG CCC ACG GGT GAC TCC GGG GCC CCC CCC GTG CCG CCC ACG GGT GAC 10992
Pro Pro Thr Gly Asp Ser Gly Ala Pro Pro Val Pro Pro Thr Gly Asp

575 580 585
TCC GGG GCC CCC CCC GTG CCG CCC ACG GGT GAC TCC GGG GCC CCC CCC 11040
Ser Gly Ala Pro Pro Val Pro Pro Thr Gly Asp Ser Gly Ala Pro Pro
590 595 600 605
GTG CCG CCC ACG GGT GAC TCC GGG GCC CCC CCC GTG CCG CCC ACG GGT 11088
Val Pro Pro Thr Gly Asp Ser Gly Ala Pro Pro Val Pro Pro Thr Gly

610 615 620
GAC TCC GGG GCC CCC CCC GTG CCG CCC ACG GGT GAC TCC GGC GCC CCC 11136
Asp Ser Gly Ala Pro Pro Val Pro Pro Thr Gly Asp Ser Gly Ala Pro

625 630 635
CCC GTG CCG CCC ACG GGT GAC GCC GGG CCC CCC CCC GTG CCG CCC ACG 11184
Pro Val Pro Pro Thr Gly Asp Ala Gly Pro Pro Pro Val Pro Pro Thr

640 645 650
GGT GAC TCC GGC GCC CCC CCC GTG CCG CCC ACG GGT GAC TCC GGG GCC 11232
Gly Asp Ser Gly Ala Pro Pro Val Pro Pro Thr Gly Asp Ser Gly Ala

655 660 665
CCC CCC GTG ACC CCC ACG GGT GAC TCC GAG ACC GCC CCC GTG CCG CCC 11280
Pro Pro Val Thr Pro Thr Gly Asp Ser Glu Thr Ala Pro Val Pro Pro
670 675 680 685
ACG GGT GAC TCC GGG GCC CCC CCT GTG CCC CCC ACG GGT GAC TCT GAG 11328
Thr Gly Asp Ser Gly Ala Pro Pro Val Pro Pro Thr Gly Asp Ser Glu

690 695 700
GCT GCC CCT GTG CCC CCC ACA GAT GAC TCC AAG GAA GCT CAG ATG CCT 11376
Ala Ala Pro Val Pro Pro Thr Asp Asp Ser Lys Glu Ala Gln Met Pro

705 710 715
GCA GTC ATT AGG TTT TAGCGTCCCA TGAGCCTTGG TATCAAGAGG CCACAAGAGT 11431
Ala Val Ile Arg Phe

720
GGGACCCCAG GGGCTCCCCT CCCATCTTGA GCTCTTCCTG AATAAAGCCT CATACCCCTG 11491
TCGGTGTCTT TCTTTGCTCC CAAGGCTAAG CTGCAGGATC 11531
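
The listing above pairs each run of exon codons with its three-letter amino acid annotation. As an illustrative aid only (not part of the patent), the following minimal Python sketch shows how such a codon run could be checked against its annotation. It assumes the standard genetic code, and the names `translate`, `CODON_TO_ONE` and `ONE_TO_THREE` are hypothetical helpers introduced for this example.

```python
from itertools import product

# Standard genetic code laid out in TCAG order (assumption: the listing
# is read with the standard nuclear code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_ONE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# One-letter to three-letter residue names; "*" marks a stop codon.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(codons: str) -> str:
    """Translate whitespace-separated codons into three-letter residues."""
    residues = []
    for codon in codons.upper().split():
        if len(codon) != 3:
            raise ValueError(f"not a complete codon: {codon!r}")
        residues.append(ONE_TO_THREE[CODON_TO_ONE[codon]])
    return " ".join(residues)


# Exon codons from the sequence line ending at position 4106.
print(translate("CTG GGC GCC GTG TAC ACA GAA GGT GGG TTC GTG GAA"))
# prints: Leu Gly Ala Val Tyr Thr Glu Gly Gly Phe Val Glu
```

Codons that are split across an intron (for example a trailing `G` before `GT...` and the bases following `...AG`) would need to be joined by hand before being passed to this helper.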

Claims

1. A DNA molecule encoding biologically functional BSSL/CEL and containing intron sequences.

2. The DNA molecule according to claim 1, which is shown as SEQ ID NO: 1 in the Sequence Listing.

3. An analogue of the DNA molecule according to claim 2, which contains intron sequences and which hybridizes, under stringent hybridization conditions, with the DNA sequence shown as SEQ ID NO: 1 in the Sequence Listing or with a specific part thereof.

4. A method for producing human BSSL/CEL, comprising: (a) inserting a DNA molecule as defined in any one of claims 1-3 into a vector capable of replicating in a specific host cell; (b) introducing the resulting recombinant vector into the host cell; (c) growing the resulting cells in or on a culture medium to express the polypeptide; and (d) recovering the polypeptide.