CN114703168A

CN114703168A - Heparinase III, coding nucleotide sequence thereof, recombinant vector and host cell comprising nucleotide sequence and application

Info

Publication number: CN114703168A
Application number: CN202210203224.7A
Authority: CN
Inventors: 刘颖
Original assignee: Individual
Current assignee: Beijing Ed Hauck International Technology Co ltd
Priority date: 2022-03-02
Filing date: 2022-03-02
Publication date: 2022-07-05
Anticipated expiration: 2042-03-02
Also published as: CN114703168B

Abstract

The invention provides heparinase III, a coding nucleotide sequence thereof, a recombinant vector and a host cell comprising the nucleotide sequence and application, and belongs to the field of genetic engineering and fermentation engineering. The heparinase III comprises an amino acid sequence shown as Seq ID No.2 or Seq ID No.3, and compared with the original heparinase III, the activity of the modified heparinase III is not reduced, and the stability is enhanced.

Description

Heparinase III, coding nucleotide sequence thereof, recombinant vector and host cell comprising nucleotide sequence and application

Technical Field

The invention relates to the field of genetic engineering and fermentation engineering, in particular to heparinase III, a coding nucleotide sequence thereof, a recombinant vector and a host cell comprising the nucleotide sequence and application.

Background

Heparinases are a class of polysaccharide lyases that act on heparin or heparan sulfate. They occur in a wide variety of microorganisms, with heparinases from Flavobacterium heparinum being the most common. Three heparinases are known from Flavobacterium heparinum: heparinase I (EC 4.2.2.7), heparinase II (no EC number), and heparinase III (EC 4.2.2.8) (Robert J. Linhardt et al., Purification and characterization of heparin lyases from Flavobacterium heparinum, J. Biol. Chem. 1992, Vol. 267: 24347-24355).

Heparinase III acts mainly on heparan sulfate and has a molecular weight of 73 kDa. Unlike the other two heparinases, heparinase III has been reported to act on heparinoids in the extracellular matrix to produce small active heparin molecules that inhibit the proliferation of capillary epithelial cells, thereby inhibiting tumor cell growth and reducing the metastasis and spread of cancer cells (Liu DF, Prjanek K, Shriver Z, et al. Heparinase III and uses thereof. United States, US 6869789B2 [P]. 2005-5-22).

Disclosure of Invention

At present, heparinase III has poor stability, and few heparinase III variants have been developed by genetic engineering that improve enzyme stability while preserving catalytic activity.

In view of the above-mentioned drawbacks, the present invention aims to provide heparinase III and a nucleotide sequence encoding the same, a recombinant vector and a host cell comprising the nucleotide sequence.

The invention provides heparinase III which comprises an amino acid sequence shown as Seq ID No.2 or Seq ID No. 3.

The original amino acid sequence of heparinase III is shown in Seq ID No. 1.

Seq ID No.2 differs from Seq ID No.1 in that the glutamine (Q) residues at positions 85, 114, 403 and 547 are all replaced by alanine (A).

Seq ID No.3 differs from Seq ID No.1 in that the glutamine (Q) at position 403 is replaced by valine (V), and the glutamine (Q) residues at positions 85, 114 and 547 are replaced by alanine (A).
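The relationship between the parent sequence and the two variants can be illustrated with a minimal Python sketch. The positions and replacement residues are taken from the text; the parent sequence is not reproduced here, so `make_placeholder_parent` builds a hypothetical stand-in (all glycine with Q at the four target sites), and the function names are illustrative only.

```python
from typing import Dict

# Substitutions described in the text, keyed by 1-based residue position.
SEQ2_MUTATIONS: Dict[int, str] = {85: "A", 114: "A", 403: "A", 547: "A"}
SEQ3_MUTATIONS: Dict[int, str] = {85: "A", 114: "A", 403: "V", 547: "A"}


def apply_substitutions(sequence: str, mutations: Dict[int, str],
                        expected: str = "Q") -> str:
    """Return a copy of `sequence` with each listed 1-based position replaced.

    Raises ValueError if a position is out of range or does not hold the
    expected parent residue, which guards against off-by-one numbering.
    """
    residues = list(sequence)
    for position, new_residue in mutations.items():
        index = position - 1  # convert 1-based position to 0-based index
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[index] != expected:
            raise ValueError(
                f"position {position} holds {residues[index]!r}, "
                f"expected {expected!r}"
            )
        residues[index] = new_residue
    return "".join(residues)


def make_placeholder_parent(length: int = 600) -> str:
    """Hypothetical stand-in for Seq ID No.1: glycine everywhere, Q at the target sites."""
    residues = ["G"] * length
    for position in SEQ2_MUTATIONS:
        residues[position - 1] = "Q"
    return "".join(residues)
```

With the real Seq ID No.1 in place of the placeholder, `apply_substitutions(parent, SEQ2_MUTATIONS)` and `apply_substitutions(parent, SEQ3_MUTATIONS)` would be expected to give sequences that differ from each other only at position 403.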

Another objective of the invention is to provide a nucleotide sequence encoding the heparinase III.

Further, the nucleotide sequence is shown as Seq ID No.4 or Seq ID No.5, wherein Seq ID No.4 is the nucleotide sequence for coding the heparinase III described by Seq ID No.2, and Seq ID No.5 is the nucleotide sequence for coding the heparinase III described by Seq ID No. 3.

It is a further object of the present invention to provide a recombinant vector comprising at least one nucleotide sequence as described above.

Further, the recombinant vector is a eukaryotic cell recombinant vector.

Furthermore, the eukaryotic cell recombinant vector is any one of pPink-HC, pPICZaA and pPICZ A.

Preferably, the eukaryotic cell recombinant vector is pPink-HC.

It is a further object of the present invention to provide a host cell comprising any of the above recombinant vectors.

Further, the host cell is Pichia pastoris.

Further, the Pichia pastoris is strain X33.

The invention also provides a preparation method of the heparinase III, which comprises the following steps:

(a) synthesizing a nucleotide sequence encoding the heparinase III;

(b) combining the nucleotide sequence of step (a) with a eukaryotic cell recombinant expression vector to obtain a recombinant vector;

(c) introducing the recombinant vector of step (b) into a host cell, then culturing, inducing expression, and purifying to obtain the heparinase III.

Preferably, the purification in step (c) is carried out on a Strep-Tactin column using Buffer W.

Compared with the prior art, the invention has the beneficial effects that:

The invention modifies protease cleavage sites in the amino acid sequence of the existing heparinase III (shown in Seq ID No. 1) that may affect its stability. Specifically, the glutamine (Q) residues at positions 85, 114, 403 and 547 of the existing heparinase III are changed by site-directed mutagenesis to alanine (A) or valine (V), which improves the stability of heparinase III, in particular its thermal stability and enzyme activity half-life. When positions 85, 114, 403 and 547 of the original amino acid sequence are all replaced by alanine, the stability is significantly improved, and the enzyme activity half-life at the optimal reaction temperature of 30 °C reaches about 47 hours. When positions 85, 114 and 547 are replaced by alanine and position 403 is replaced by valine, the stability is also significantly improved, and the enzyme activity half-life at the optimal reaction temperature of 30 °C is longer.
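To make the half-life figure concrete, the following Python sketch relates a half-life to the fraction of activity remaining over time. It assumes simple first-order (exponential) inactivation, which the text does not state; the function names are illustrative, and only the 47-hour value comes from the description.

```python
import math


def residual_activity(hours: float, half_life_hours: float) -> float:
    """Fraction of initial activity left after `hours`, assuming first-order decay."""
    if half_life_hours <= 0:
        raise ValueError("half_life_hours must be positive")
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return 0.5 ** (hours / half_life_hours)


def half_life_from_measurement(hours: float, fraction_remaining: float) -> float:
    """Estimate a half-life from one residual-activity measurement (same assumption)."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    if not 0 < fraction_remaining < 1:
        raise ValueError("fraction_remaining must be between 0 and 1")
    # From fraction = 0.5 ** (t / t_half): t_half = t * ln 2 / -ln(fraction)
    return hours * math.log(2) / -math.log(fraction_remaining)
```

Under this assumption, an enzyme with a 47-hour half-life would retain half of its activity after 47 hours and a quarter after 94 hours; real inactivation curves may deviate from a single exponential.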

The nucleotide sequence provided by the invention is used for coding the heparinase III, a rare codon optimization method is adopted, DNA sequence optimization is carried out according to codon usage preference of pichia pastoris, and the expression efficiency of the optimized sequence in the pichia pastoris is obviously improved.

The recombinant vector and the host cell provided by the invention comprise the nucleotide sequence and can be used for expressing the heparinase III.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.

FIG. 1 is an SDS-PAGE electrophoresis image from example 8 of the present invention.

FIG. 2 is a graph showing stability measurements before and after modification of heparinase III according to example 10 of the present invention.

Detailed Description

"amino acid" refers to any monomeric unit that can be incorporated into a peptide, polypeptide, or protein. As used herein, the term "amino acid" includes the following 20 naturally or genetically encoded α -amino acids: alanine (Ala or A), arginine (Arg or R), asparagine (Asn or N), aspartic acid (Asp or D), cysteine (Cys or C), glutamine (Gln or Q), glutamic acid (Glu or E), glycine (Gly or G), histidine (His or H), isoleucine (Ile or I), leucine (Leu or L), lysine (Lys or K), methionine (Met or M), phenylalanine (Phe or F), proline (Pro or P), serine (Ser or S), threonine (Thr or T), tryptophan (Trp or W), tyrosine (Tyr or Y) and valine (Val or V). In cases where the "X" residues are undefined, these should be defined as "any amino acid". The structures of these 20 natural amino acids are shown, for example, in Stryer et al, BioChemistry, 5 th edition, Freeman and Company (202). Additional amino acids such as Selenocysteine and pyrrolysine can also be genetically encoded (Stadtman (1996) "Selenocysteine," Annu RevBiochem.65:83-100 and Ibba et al (202) "Genetic code: interconnecting pyrolysine," Curr biol.12(13): R464-R466). The term "amino acid" also includes unnatural amino acids, modified amino acids (e.g., having modified side chains and/or backbones), and amino acid analogs. (see, for example, Zhang et al (204) "selection characterization of 5-hydroxytryptophan proteins in macromolecular cells," Proc. Natl. Acad. Sci. U.S.A.11(24):8882-8887, Anderson et al (204) "An expression code with a functional quartz" Proc. Natl. Acad. Sci. U.S.A.11(20):7566-7571, Ikeda et al (203) "Synthesis of a novel heterologous gene and expression interaction of Protein in vivo," Protein Engineer. Des.16 (9 699, 9, 203) "expression of Protein in cells" 1. 31. Biocoding. C.31, expression of Protein in vivo, "expression of Protein, III. 12, 1. D.201, and D.31, Escherichia coli, and No. 31, No. 5-7, No. 5-amino acids in cells, No. 31, No. 5-amino acids, No. 5, No. 7, No. 1, No. 
7, No. 1, 7, No. 1, No. 7, 1, 2, 1, 2, 1, a, bacher et al (201) "Selection and catalysis of Escherichia coli Variants Cap of Growth on and thermal toxin Tryptophan Analyze," J.Bacterol.183 (18): 5414-minus 5425, Hamano-Takaku et al (2000) "A mutation Escherichia coli Tyrosyl-tRNA said Amino acids said Amino Acid molecule affinity ligand, J.biol.Chem.275(51): 4324-minus 4328, and Bursa et al (201)" Property with { factor a } - (thio) ligands Amino acids ligand Amino Acid ligand and polypeptide molecule antigen ligand, molecular Amino Acid ligand and polypeptide 1297 (10) } Process enzyme.

To further illustrate, an amino acid is typically an organic acid that includes a substituted or unsubstituted amino group, a substituted or unsubstituted carboxyl group, and one or more side chains or groups, or analogs of any of these groups. Exemplary side chains include, for example, mercapto, seleno, sulfonyl, alkyl, aryl, acyl, keto, azido, hydroxyl, hydrazine, cyano, halogen, hydrazide, alkenyl, alkynyl, ether, borate, phospho, phosphono, phosphine, heterocycle, enone, imine, aldehyde, ester, thioacid, hydroxylamine, or any combination of these groups. Other representative amino acids include, but are not limited to, amino acids comprising photoactive crosslinkers, metal-binding amino acids, spin-labeled amino acids, fluorescing amino acids, metal-containing amino acids, amino acids containing new functional groups, amino acids that interact covalently or non-covalently with other molecules, photolabile (photocaged) and/or photoisomerizable amino acids, radioactive amino acids, amino acids comprising biotin or biotin analogs, glycosylated amino acids, other carbohydrate-modified amino acids, amino acids comprising polyethylene glycol or polyethers, heavy atom substituted amino acids, chemically cleavable and/or photocleavable amino acids, amino acids comprising carbon-linked sugars, redox-active amino acids, amino acids comprising aminothioacids, and amino acids comprising one or more toxic moieties.

The original amino acid sequence of heparinase III in the invention (as shown in Seq ID No. 1) is the sequence of heparinase III published in NCBI database.

Then, on the basis of this amino acid sequence, the corresponding nucleotide sequence is codon optimized according to the codon usage preference of the expression host cell, so as to improve the translation efficiency of the target sequence during fermentation of the host cell and obtain as much target protein as possible.

After the target sequence is synthesized by a gene synthesis company, the target amino acids are mutated on the basis of that sequence: fragments are amplified by PCR, and the PCR products are then joined by a Gibson reaction to form the mutated vector.

Polymerase chain reaction (PCR) is a method that uses a DNA fragment as a template and, in the presence of DNA polymerase and nucleotide substrates, amplifies it to an amount sufficient for structural and functional analysis. PCR-based detection is also of great importance in areas such as the rapid clinical diagnosis of bacterial infectious diseases.

PCR amplifies a DNA fragment located between two known sequences, in a process similar to natural DNA replication. The DNA molecule to be amplified serves as the template, and a pair of oligonucleotides complementary to the 5' end and the 3' end of the template serve as primers. Under the action of DNA polymerase, the primers are extended along the template strand by a semiconservative replication mechanism until synthesis of the new DNA strand is complete; repeating this process amplifies the target DNA fragment.
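The effect of repeating the cycle can be sketched with the usual idealized amplification model, in which the copy number is multiplied by (1 + efficiency) in every cycle. This model and the `efficiency` parameter are assumptions added for illustration; the text itself gives no quantitative model, and real reactions plateau in late cycles.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after `cycles` rounds of PCR.

    `efficiency` is the fraction of templates duplicated per cycle
    (1.0 means perfect doubling). Plateau effects are ignored.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles
```

For example, `pcr_copies(1, 30)` evaluates 2 to the power 30, which is why a few dozen cycles are enough to turn trace template into an amount usable for cloning.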

Gibson assembly was first proposed in 2009 by Dr. Daniel Gibson and his colleagues. Gibson assembly is well suited for splicing multiple linear DNA fragments, as well as for inserting a DNA of interest into a vector. First, homologous sequences are added to the ends of the DNA fragments (by PCR); then, the DNA fragments are mixed with a Master Mix (containing three enzymes) and incubated for one hour.

Wherein the Master Mix contains three different types of enzymes:

an exonuclease, which digests the DNA from the 5' end to produce long single-stranded cohesive ends that can pair with another homologous end;

a polymerase, which fills in the gaps;

a DNA ligase, which seals the nicks to give seamless splicing into a complete DNA molecule.

Primers are designed near each mutated amino acid site. Each primer contains the codon of the modified amino acid as well as the homologous sequence required for joining adjacent fragments. After amplification, the PCR products are purified and the amplified fragments are joined with the Gibson reaction reagent, giving a heparinase III recombinant expression vector sequence carrying the amino acid mutations at the specified positions.
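The joining step can be checked in silico before ordering primers. The sketch below is a simplified model of overlap-based assembly, not the enzymatic reaction itself: it merges adjacent linear fragments wherever the end of one matches the start of the next. The function names and the 15-nucleotide minimum overlap are illustrative assumptions.

```python
from typing import List


def overlap_length(left: str, right: str, min_overlap: int) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 if no shared end of at least `min_overlap` bases exists.
    """
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def assemble(fragments: List[str], min_overlap: int = 15) -> str:
    """Join fragments in the given order through their homologous ends."""
    if not fragments:
        raise ValueError("at least one fragment is required")
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    assembled = fragments[0]
    for fragment in fragments[1:]:
        size = overlap_length(assembled, fragment, min_overlap)
        if size == 0:
            raise ValueError("adjacent fragments lack a sufficient homologous end")
        assembled += fragment[size:]  # keep the shared region only once
    return assembled
```

In use, each PCR fragment would be written out with its mutated codon and homologous tail, and `assemble` would be expected to reproduce the designed mutant sequence; a `ValueError` points to a primer whose homologous region is missing or too short.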

Finally, the optimized amino acid sequence shown as Seq ID No.2 or Seq ID No.3 is obtained. In Seq ID No.2, the glutamine (Q) residues at positions 85, 114, 403 and 547 of Seq ID No.1 are all replaced by alanine (A). In Seq ID No.3, the glutamine (Q) at position 403 of Seq ID No.1 is replaced by valine (V), and the glutamine (Q) residues at positions 85, 114 and 547 are replaced by alanine (A).

Coding sequence: the term "coding sequence" as used herein includes a nucleotide sequence that directly specifies the amino acid sequence of its protein product. The boundaries of the coding sequence are generally determined by an open reading frame, which usually begins with the ATG start codon. The coding sequence typically includes DNA, cDNA, and recombinant nucleotide sequences.

The term "nucleotide", in addition to referring to naturally occurring ribonucleotide or deoxyribonucleotide monomers, is also understood herein to refer to related structural variants thereof, including derivatives and analogs, which are functionally equivalent with respect to the particular context in which the nucleotide is used, unless the context clearly indicates otherwise.

The terms "codon optimization," "codon-optimized," and "codon usage bias" refer to the practice of selecting codons (i.e., codon usage) in a manner that optimizes or customizes expression as desired (i.e., a technique that improves protein expression in an organism by increasing the translation efficiency of a gene of interest). In other words, codon optimization is a method of adjusting codons to match host tRNA abundance and has traditionally been used to express heterologous genes. Newer strategies for optimizing heterologous expression also take into account sequence features such as local mRNA folding, codon pair bias, the codon ramp, and codon dependence. Codon optimization is possible because the genetic code is inherently degenerate: there are more codons than encoded amino acids, so the vast majority of amino acids are encoded by multiple codons, meaning that there are multiple tRNAs (with different anticodon loops) that carry any given amino acid. Thus, different codons can be used without changing the encoded amino acid sequence. That is, a gene or nucleic acid fragment may be mutated or altered (or synthesized de novo) to change the codon used to encode a particular amino acid, without altering the amino acid sequence of the polypeptide or protein itself. For example, rare codons can be replaced with more abundant codons while keeping the amino acid sequence unchanged.
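The idea of synonymous codon replacement can be shown with a short Python sketch. It recodes a coding sequence with one preferred codon per amino acid and leaves other codons untouched. The standard genetic code table is factual, but any preference table passed in (such as the one in the usage note) is a hypothetical example rather than measured Pichia pastoris codon usage, and the sketch ignores the mRNA-folding and codon-pair considerations mentioned above.

```python
from itertools import product
from typing import Dict

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE: Dict[str, str] = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence into one-letter amino acids ('*' marks stop)."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds: str, preferred: Dict[str, str]) -> str:
    """Swap each codon for the preferred synonymous codon, if one is listed."""
    for amino_acid, codon in preferred.items():
        if CODON_TABLE.get(codon) != amino_acid:
            raise ValueError(f"{codon} does not encode {amino_acid}")
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    recoded = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        recoded.append(preferred.get(amino_acid, codon))
    return "".join(recoded)
```

As a usage illustration, calling `recode` with a made-up table such as `{"A": "GCT", "Q": "CAA", "V": "GTT", "K": "AAG"}` changes the DNA while `translate` of the input and the output stay identical, which is the invariant any codon optimization tool is expected to preserve.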

The term "host cell" refers to unicellular prokaryotic and eukaryotic organisms (e.g., bacteria, yeast, and actinomycetes) as well as unicellular cells from higher plants or animals when grown in cell culture. A "host cell" can be an animal host cell, a plant host cell, a yeast host cell, a fungal host cell, a protozoan host cell and a prokaryotic host cell.

For example, the host cell is selected from: Pichia pastoris, Pichia angusta (Hansenula polymorpha), Pichia finlandica, Ogataea minuta, Pichia stipitis, Aspergillus niger, Arxula adeninivorans, Aspergillus nidulans, Aspergillus wentii, Aspergillus aureus, Aspergillus flavus, Ashbya gossypii, Methylophilus methylotrophus, Schizosaccharomyces pombe, Candida boidinii, Candida utilis, Rhizopus oryzae, Debaryomyces hansenii and Saccharomyces cerevisiae.

In one embodiment of the invention, the host cell may be Pichia or Saccharomyces cerevisiae. In one embodiment of the invention, the host cell is Saccharomyces cerevisiae. In a preferred embodiment of the invention, the host cell is of the genus Pichia, and the Pichia pastoris expression system is used.

Compared with a prokaryotic expression system, yeast, as a unicellular eukaryote, has the following characteristics. First, as a eukaryote, yeast can carry out post-translational modifications such as correct disulfide bond formation, removal of signal peptides, and N- and O-linked glycosylation, and glycosylation can make certain proteins more stable. Second, yeast is a unicellular microorganism and shares the advantages of bacterial systems, such as rapid growth and ease of genetic manipulation. Third, compared with Escherichia coli, yeast contains no endotoxin and no lysogenic viruses; it is non-pathogenic, is closely associated with human life, and has long been used in the bread and wine industries. Yeast is also an ideal secretory expression system. Normally, no protein is added to the yeast culture medium, and the proteins normally secreted by yeast account for only 0.5 percent of total yeast cell protein; yeast can also secrete the foreign protein it produces into the culture medium, which facilitates separation and purification of the product and greatly reduces cost. Secretory expression systems of this kind are well suited to proteins that require correct post-translational folding or natural secretion, and to foreign proteins that are unstable or toxic inside the cell.

The Pichia pastoris system has several advantages:

(1) A methanol-inducible promoter. Driven by this promoter, a single-copy alcohol oxidase gene can produce alcohol oxidase amounting to 30 percent of the total soluble protein of the cell; the promoter is introduced into the expression vector to drive expression of the foreign gene.

(2) High-density growth. In the 1970s, Pichia was used to produce single-cell protein, and the resulting fermentation technology reaches up to 100 g of dry cells per liter of culture.

(3) High-level secretion in protein-free medium. Expression vectors carrying the Saccharomyces cerevisiae α-factor signal peptide can secrete the expression product out of the cell. The expression medium contains inorganic salts, trace elements, biotin and a carbon source, and is free of toxins and pyrogens, so the secreted expression product is easy to purify.

(4) Glycosylation closer to that of mammals. The N-linked oligosaccharide chains of the expression product generally carry only 8-14 mannose residues, versus as many as 50-150 in Saccharomyces cerevisiae. Furthermore, these products do not contain the terminal α-1,3-linked mannose found in Saccharomyces cerevisiae, a modification that can cause immunogenicity.

(5) Good genetic stability. Pichia pastoris has no native plasmids; it allows tandem repeats of the expression cassette and is suited to expressing the foreign gene after integration into the chromosome.

(6) Good secretion of the product. In Saccharomyces cerevisiae, excessive glycosylation places many mannose residues on each N-linked oligosaccharide chain, which impairs secretion; generally, foreign proteins with a molecular weight above 20 kDa cannot be secreted into the culture medium. Pichia does not show this limitation: it has a secretion pathway very similar to that of higher eukaryotes, in which proteins pass from the endoplasmic reticulum through the Golgi apparatus and vesicles to the outside of the cell. Both invertase, with a molecular weight above 50 kDa, and human serum protein can be secreted out of the cell, with an expression level of 1 g/L.

In a preferred embodiment of the present invention, the nucleotide sequence corresponding to the above amino acid sequence is codon optimized according to the codon usage preference of Pichia pastoris. The protein sequence of heparinase III is entered into codon preference analysis software, such as the GenSmart™ Codon Optimization software developed by Nanjing GenScript Biotech Co., Ltd. (https://www.genscript.com/tools/gensmart-codon-optimization), and sequence optimization is performed. Without changing the protein sequence of heparinase III, and using only the degeneracy of codons, codons that are used at low frequency in Pichia pastoris and may reduce ribosome transit efficiency during translation are replaced by high-frequency codons, giving the codon-optimized nucleotide sequence shown as Seq ID No.4 or Seq ID No.5 in the sequence listing. Seq ID No.4 is the nucleotide sequence encoding the heparinase III of Seq ID No.2, and Seq ID No.5 is the nucleotide sequence encoding the heparinase III of Seq ID No.3.

Expression: the term "expression" in this context includes any step involved in the production of a polypeptide, including but not limited to transcription, post-transcriptional modification, translation, post-translational modification, and secretion.

The term "vector" refers to a piece of DNA, usually double stranded, into which a foreign piece of DNA may have been inserted. The vector may be of plasmid origin, for example. The vector contains a "replicon" polynucleotide sequence that facilitates autonomous replication of the vector in a host cell. Exogenous DNA is defined as heterologous DNA, which is DNA not naturally found in the host cell, which, for example, replicates a vector molecule, encodes a selectable or screenable marker, or encodes a transgene. The vector is used to transport the foreign or heterologous DNA into a suitable host cell. Once in the host cell, the vector may replicate independently of or simultaneously with the host chromosomal DNA, and several copies of the vector and its inserted DNA may be generated. In addition, the vector may contain the necessary elements to permit transcription of the inserted DNA into an mRNA molecule or to otherwise cause replication of the inserted DNA into multiple copies of RNA. Some expression vectors additionally contain sequence elements near the inserted DNA that increase the half-life of the expressed mRNA and/or allow translation of the mRNA into a protein molecule. Thus, many molecules of mRNA and polypeptide encoded by the inserted DNA can be synthesized rapidly.

Expression vector: the term "expression vector" in this context includes a linear or circular DNA molecule comprising a fragment encoding a polypeptide of the present invention, which fragment may be operably linked to other fragments which allow its transcription.

The recombinant expression vector may be any vector (e.g., a plasmid or virus) which can be conveniently treated with recombinant DNA procedures and can express the nucleotide sequence. The choice of vector will generally depend on the compatibility of the vector with the host cell into which the vector is to be introduced. The vector may be a linear or closed circular plasmid.

The vector may be an autonomously replicating vector, i.e., a vector which exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication, e.g., a plasmid, an extrachromosomal element, a minichromosome, or an artificial chromosome.

The vector may contain any means for ensuring self-replication (means). Alternatively, the vector may be integrated into the genome and replicated together with the chromosome(s) into which it has been integrated, when introduced into the host cell. Alternatively, a single vector or plasmid, or two or more vectors or plasmids which collectively contain the entire DNA to be introduced into the genome of the host cell, may be used, or a transposon may be used.

In one embodiment of the invention, the recombinant vector comprises a nucleotide sequence encoding the amino acid sequence shown in Seq ID No.2 or Seq ID No.3, preferably, when the host cell is selected from Pichia pastoris, said recombinant vector comprises the nucleotide sequence shown in Seq ID No.4 or Seq ID No. 5.

In one embodiment of the invention, the recombinant vector is a eukaryotic recombinant vector. The expression vector adopted by the recombinant vector can be pPink-HC, pPICZaA, pPICZ A, pPICZ, pPICZ alpha, pGAPZ, pGAPZ, pHBM905A, pPIC9K, pPIC9K-His, pPIC3.5K, pPIC9, pPICZ alpha A, pAO815, pPIC9k-His, pHIL-S1, pGADT7, pGB 7, pWB980, pT3 and the like; preference is given to pPink-HC, pPICZaA and pPICZ A, particular preference to pPink-HC. The above-mentioned vectors are commercially available.

In a preferred embodiment of the present invention, the eukaryotic cell recombinant vector is pPink-HC. In another preferred embodiment of the present invention, the eukaryotic recombinant vector is pPICZaA.

In another preferred embodiment of the invention, the eukaryotic recombinant vector is pPICZ A.

In one embodiment of the invention, the nucleotide sequence corresponding to the amino acid sequence is codon optimized according to the codon usage preference of Pichia pastoris. The protein sequence of heparinase III is entered into the codon preference analysis software GenSmart™ Codon Optimization, developed by Nanjing GenScript Biotech Co., Ltd. (https://www.genscript.com/tools/gensmart-codon-optimization), and sequence optimization is performed. Without changing the protein sequence of heparinase III, and using only the degeneracy of codons, codons that are used at low frequency in Pichia pastoris and may reduce ribosome transit efficiency during translation are replaced by high-frequency codons, giving a nucleotide sequence encoding the amino acid sequence shown in Seq ID No.2 or Seq ID No.3.

In a preferred embodiment of the invention, the codon-optimized nucleotide sequence is the nucleotide sequence shown in Seq ID No.4 or Seq ID No. 5.

In one embodiment of the present invention, the above nucleotide sequence is combined with a eukaryotic recombinant vector to obtain a recombinant vector. For example, the above nucleotide sequence is combined with pPink-HC to obtain a pPink-HC recombinant vector.

In one embodiment of the present invention, the above recombinant vector is introduced into a host cell, such as a pPink-HC recombinant vector into Pichia pastoris.

In one embodiment of the present invention, the above nucleotide sequence is combined with pPICZaA to obtain a pPICZaA recombinant vector.

In one embodiment of the present invention, the above recombinant vector is introduced into a host cell, such as a pPICZaA recombinant vector into Pichia pastoris.

In one embodiment of the present invention, the above nucleotide sequence is combined with pPICZ A to obtain a pPICZ A recombinant vector.

In one embodiment of the present invention, the above recombinant vector is introduced into a host cell, such as a pPICZ A recombinant vector into Pichia pastoris.

Means for purifying recombinant proteins are well known in the art and may include clarification (e.g., filtration or centrifugation), affinity chromatography, immunoaffinity chromatography, protein a (or G) chromatography, ion exchange (i.e., cation and/or anion) chromatography, size exclusion chromatography, adsorption chromatography, hydrophobic interaction chromatography, reverse phase chromatography, ultracentrifugation, precipitation, immunoprecipitation, extraction, phase separation, and the like.

The methods described herein comprise expressing a recombinant protein in a host cell line engineered to have at least one labeled Host Cell Protein (HCP), wherein the at least one labeled HCP is labeled with at least one purification tag. Typically, labeled HCPs are proteins that are highly abundant, difficult to remove during downstream purification processes, and/or affect product quality (e.g., residual proteases may degrade the biotherapeutic product, thereby reducing its efficacy). An HCP with these characteristics is referred to as a "problematic" HCP. Typically, labeled HCPs are proteins essential for cell survival and/or cell function (and, therefore, are not good candidates for gene knockout strategies). They can be captured in a separation process (e.g., a separation column). Non-limiting embodiments of such additional domains include peptide motifs referred to as Myc tags, HAT tags, HA tags, TAP tags, GST tags, chitin binding domains (CBD tags), maltose binding proteins (MBP tags), Flag tags, Strep tags and variants thereof (e.g., strell tags), and His tags.

In some embodiments, the disclosed polypeptides comprise a Strep tag, e.g., a StrepII tag. A StrepII tag sequence is added at the C-terminus of heparinase III, and the crude cell extract is purified on a Strep-Tactin column (a purification column eluted with desthiobiotin) using Buffer W; purification of the target protein heparinase III is achieved through the interaction between the StrepII tag and the biotin-binding site of Strep-Tactin.

In one embodiment of the invention, a disruption pretreatment is performed prior to purification, specifically as follows:

(a) centrifuging the fermentation broth obtained after induced expression at low temperature;

(b) collecting the supernatant and suspending in Buffer W;

(c) disrupting the cells by sonication in an ice-water bath;

(d) centrifuging the suspension.

Preferably, if the lysate is viscous, RNase A (10 ug/ml) and DNase I (5 ug/ml) can be added, followed by incubation on ice.

In one embodiment of the invention, the purification steps of heparinase III are as follows:

(a) washing the Strep-Tactin column with Buffer W, and then loading the sample onto the column;

(b) washing the column with Buffer W, and collecting the eluent;

(c) adding Buffer E and collecting the eluate;

(d) removing the D-desthiobiotin in several portions using Buffer R, and after complete removal, washing the column with Buffer W.

Examples

The invention will be further illustrated with reference to specific embodiments. It should be understood that these embodiments are merely illustrative of the present invention and are not intended to limit the scope of the present invention. Further, it should be understood that various changes or modifications of the present invention may be made by those skilled in the art after reading the contents of the present invention, and those equivalents may fall within the scope of the present invention defined by the appended claims.

Materials, reagents and the like used in the following examples are commercially available unless otherwise specified. In the following examples, unless otherwise specified, all methods are conventional. The DNA sequence optimization was done by Suzhou Jinzhi Biotechnology Co., Ltd. Pichia pastoris was purchased from Thermo Fisher, product No. A11152. The PichiaPink™ vector kit was purchased from Thermo Fisher, product No. A11152. pPink-HC was purchased from Thermo Fisher, product No. A11152 (both the expression vector and the strain are included in the kit).

Example 1 Synthesis and screening of heparinase III amino acid sequences

The original amino acid sequence of heparinase III is the heparinase III sequence published in the NCBI database.

Then, based on this amino acid sequence, the corresponding nucleotide sequence was codon-optimized according to the codon usage preference of the expression host Pichia pastoris, in order to improve the translation efficiency of the target sequence during Pichia pastoris fermentation and obtain as much target protein as possible.

After the target sequence was synthesized by Suzhou Jinzhi Biotechnology Co., Ltd., the target amino acids were mutated on the basis of this sequence: the fragments were amplified by PCR, and the amplification products were joined by Gibson assembly to form the mutated vector.

The amino acid sequence shown in Seq ID No.2 or Seq ID No.3 was obtained.
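The relationship between the original and the mutated sequences can be made concrete with a short comparison. The following is only an illustrative sketch, not part of the disclosed method: the function name `list_substitutions` is hypothetical, and the two inputs are residues 81 to 96 copied from Seq ID No.1 and Seq ID No.2 in the sequence listing.

```python
def list_substitutions(reference, variant, first_position=1):
    """List point substitutions (e.g. 'Gln85Ala') between two equal-length
    lists of three-letter residue codes."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    changes = []
    for offset, (ref, var) in enumerate(zip(reference, variant)):
        if ref != var:
            # Position is reported in the numbering of the full-length protein.
            changes.append(f"{ref}{first_position + offset}{var}")
    return changes


# Residues 81-96 as printed in the sequence listing.
seq1_81_96 = "Val His Gln Phe Gln Pro His Lys Gly Tyr Gly Tyr Phe Asp Tyr Gly".split()
seq2_81_96 = "Val His Gln Phe Ala Pro His Lys Gly Tyr Gly Tyr Phe Asp Tyr Gly".split()

print(list_substitutions(seq1_81_96, seq2_81_96, first_position=81))
```

Applied to the full 644-residue sequences, the same function would enumerate every position at which Seq ID No.2 or Seq ID No.3 differs from Seq ID No.1.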

Example 2 Synthesis of heparinase III nucleotide sequence and construction of recombinant expression vectors

(a) The coding region of heparinase III from Flavobacterium heparinum contains a large number of rare codons, which affect the translation efficiency of the protein when Pichia pastoris is used as the host cell for expression, so codon optimization is required. The protein sequence of heparinase III was entered into the codon preference analysis software GenSmart™ Codon Optimization, developed independently by Nanjing GenScript Biotech Co., Ltd. (https://www.genscript.com/tools/genetic-code-Optimization), for sequence optimization;

(b) according to the codon usage preference of Pichia pastoris given in a Pichia pastoris codon usage table, the coding region sequence of the heparinase III preliminarily screened in Example 1 was optimized with the above software: relying only on the degeneracy of the codons and keeping the protein sequence of heparinase III unchanged, codons that are used at low frequency in Pichia pastoris and may reduce the efficiency of ribosome passage during translation were replaced with codons of high usage frequency, giving the codon-optimized nucleotide sequence shown as Seq ID No.4 or Seq ID No.5 in the sequence listing;

and then connecting the obtained codon-optimized nucleotide sequence of the heparinase III with a pPink-HC eukaryotic cell expression vector by a conventional means of molecular biology to construct a pPink-HC-heparinase III recombinant expression vector.
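Because codon optimization must leave the protein sequence unchanged, a simple check is to translate the optimized DNA and compare it with the intended amino acid sequence. The sketch below is illustrative only: it uses the standard genetic code, the helper name `translate` is hypothetical, and the inputs are the first 60 nucleotides of Seq ID No.4 and the first 20 residues of Seq ID No.2 (written in one-letter code) from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string (whitespace ignored) into one-letter amino acids."""
    dna = "".join(dna.split()).upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First 60 nt of Seq ID No.4 and first 20 residues of Seq ID No.2.
first_60_nt = "atgcaatctt cttctatcac tagaaaggat ttcgatcata tcaacttgga gtattctggt"
first_20_aa = "MQSSSITRKDFDHINLEYSG"

print(translate(first_60_nt) == first_20_aa)
```

The same comparison over the full 1935-nucleotide sequences would confirm that only synonymous codons were exchanged.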

Example 3 Construction of the pPICZαA-heparinase III recombinant expression vector: the eukaryotic expression vector was pPICZαA, and the other steps were as in Example 2.

Example 4 Construction of the pPICZ A-heparinase III recombinant expression vector: the eukaryotic expression vector was pPICZ A, and the other steps were as in Example 2.

Example 5 expression of heparinase III

Using the recombinant expression vectors of Example 2, Example 3 and Example 4, competent cell preparation, electrotransformation, culture media and induced expression were carried out with reference to the instructions of the PichiaPink™ vector kit.

Example 6 disruption pretreatment of heparinase III

(a) The fermentation broth obtained after induced expression in Example 5 was centrifuged at low temperature, at 4500 g and 5 ℃ for 15 min;

(b) the supernatant was collected, the cells from 100 ml of the culture in (a) were suspended in 1 ml of Buffer W, and an appropriate amount of protease inhibitor was added if necessary;

(c) the cells were disrupted by sonication in an ice-water bath;

(d) the suspension obtained in step (c) was centrifuged at 13000 rpm and 4 ℃ for 15 min.

Preferably, if the suspension is viscous, RNase A (10 ug/ml) and DNase I (5 ug/ml) can be added, followed by incubation on ice for 10-15 min.
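The two centrifugation steps of this example are specified in different units (4500 g and 13000 rpm). The sketch below shows the standard conversion between rotor speed and relative centrifugal force, RCF = 1.118 x 10^-5 x r x rpm^2 with r in cm. The rotor radius of 8 cm is a hypothetical value chosen only for illustration; the source does not state the rotor used.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rpm_to_rcf(rpm, radius_cm):
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rcf_to_rpm(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a given relative centrifugal force."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


ROTOR_RADIUS_CM = 8.0  # hypothetical rotor radius, not given in the source

print(round(rpm_to_rcf(13000, ROTOR_RADIUS_CM)))  # RCF at 13000 rpm
print(round(rcf_to_rpm(4500, ROTOR_RADIUS_CM)))   # rpm needed for 4500 g
```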

Example 7 purification of heparinase III

(a) The Strep-Tactin column was washed with 2 CVs of Buffer W, and the suspension obtained in Example 6 was then loaded onto the column;

(b) the column was washed with 5 CVs of Buffer W, and the eluent was collected;

(c) Buffer E was added 6 times, 0.5 CV each time, and each 0.5 CV fraction was collected;

(d) the D-desthiobiotin was removed in 3 portions using 15 CVs of Buffer R, and after complete removal, the column was washed with 8 CVs of Buffer W.
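Since every step of this example is given in column volumes (CVs), the buffer amounts scale directly with the size of the column. The following sketch totals the buffer needed for one run. It assumes that step (d) uses 15 CVs of Buffer R in total, and the 1 ml column volume in the usage line is hypothetical; neither is stated explicitly in the source.

```python
# (step, buffer, column volumes) as read from steps (a) to (d) above.
PURIFICATION_STEPS = [
    ("equilibrate", "Buffer W", 2.0),
    ("wash", "Buffer W", 5.0),
    ("elute", "Buffer E", 6 * 0.5),    # six fractions of 0.5 CV each
    ("regenerate", "Buffer R", 15.0),  # assumed to be 15 CVs in total
    ("final wash", "Buffer W", 8.0),
]


def buffer_volumes_ml(column_volume_ml):
    """Total volume of each buffer (ml) needed for one purification run."""
    totals = {}
    for _step, buffer_name, cvs in PURIFICATION_STEPS:
        totals[buffer_name] = totals.get(buffer_name, 0.0) + cvs * column_volume_ml
    return totals


print(buffer_volumes_ml(1.0))  # hypothetical 1 ml column
```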

Example 8 SDS-PAGE detection of heparinase III

The Buffer E eluate collected in Example 7 was taken, and 20 uL of the sample was analyzed by SDS-PAGE; as shown in FIG. 1, the molecular weight of heparinase III was about 73.36 kDa. Analysis with the gel densitometry software Bandscan 5.0 showed that the purity of the target heparinase III corresponding to the amino acid sequence shown in Seq ID No.2 or Seq ID No.3 can exceed 85%.

Example 9 enzyme Activity assay

The scanning wavelength was 232 nm and the time was 3 min. Reaction buffer (20 mM Tris, 200 mM NaCl, pH 7.4) and enzyme solution, 1000 uL in total (the ratio being adjusted according to the activity of the enzyme solution), and 500 uL of substrate solution (17 mM Tris, 44 mM NaCl, 3.5 mM CaCl2, 25 g/L heparin sodium, pH 7.4) were placed in a quartz cuvette, mixed uniformly and scanned in a spectrophotometer (the reaction buffer and substrate solution were preheated to 30 ℃ in a water bath). The scanning time was 70 s; data from the 40-60 s period were taken, and the slope k (min^-1) of the curve was calculated. The enzyme activity (IU/L) of heparinase III is calculated as follows (V is the volume of the enzyme solution added to the reaction system):

As determined by this method, the enzyme activity of the modified heparinase III (Seq ID No.3) was 473.68 IU/L, and the catalytic activity was not reduced.
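For orientation, the sketch below shows one common way to turn a 232 nm absorbance trace into an activity value using the Beer-Lambert law. It is an assumption-laden illustration, not the formula of this example: the molar absorption coefficient `epsilon_l_per_mol_cm` is a placeholder that must be taken from the assay reference actually used, the 1 cm path length is assumed, and both function names are hypothetical. The 1500 uL total volume follows from the 1000 uL plus 500 uL stated in this example.

```python
def slope_per_min(times_s, absorbances, start_s=40.0, end_s=60.0):
    """Least-squares slope (absorbance per minute) over the chosen time window."""
    points = [(t / 60.0, a) for t, a in zip(times_s, absorbances)
              if start_s <= t <= end_s]
    if len(points) < 2:
        raise ValueError("need at least two data points in the window")
    mean_t = sum(t for t, _ in points) / len(points)
    mean_a = sum(a for _, a in points) / len(points)
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in points)
    return sxy / sxx


def activity_iu_per_l(k_per_min, enzyme_volume_ul, epsilon_l_per_mol_cm,
                      total_volume_ul=1500.0, path_cm=1.0):
    """Activity in IU per litre of enzyme solution (1 IU = 1 umol product/min)."""
    # Beer-Lambert: product formation rate in mol per litre per minute.
    rate_mol_per_l_min = k_per_min / (epsilon_l_per_mol_cm * path_cm)
    # Product formed in the cuvette per minute, converted to umol.
    umol_per_min = rate_mol_per_l_min * (total_volume_ul * 1e-6) * 1e6
    # Normalise to the volume of enzyme solution added (V), in litres.
    return umol_per_min / (enzyme_volume_ul * 1e-6)


# Synthetic trace: absorbance rising at 0.12 per minute, sampled every 5 s.
times = list(range(0, 71, 5))
trace = [0.2 + 0.12 * (t / 60.0) for t in times]
k = slope_per_min(times, trace)
# epsilon below is a placeholder value used only to run the example.
print(activity_iu_per_l(k, enzyme_volume_ul=100.0, epsilon_l_per_mol_cm=3800.0))
```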

Example 10 stability analysis

The purified heparinase III was placed in a constant-temperature incubator at 30 ℃, an aliquot of the enzyme solution was withdrawn periodically, and the enzyme activity was measured by the method described in Example 9 and compared with that of unmodified heparinase III; the results are shown in FIG. 2. In FIG. 2, the enzyme of sequence 1 represents the protein consisting of the original amino acid sequence, the enzyme of sequence 2 represents the protein consisting of Seq ID No.2 in the sequence listing, and the enzyme of sequence 3 represents the protein consisting of Seq ID No.3 in the sequence listing.

The comparison shows that the stability of the modified heparinase III is clearly improved: compared with the enzyme of sequence 1, the half-life of the enzyme activity of the heparinase III of sequence 2 increased from about 30 h to about 47 h, and the stability of the heparinase III of mutant sequence 3 is also clearly improved.
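To illustrate what the change in half-life means in practice, the sketch below assumes simple first-order (exponential) loss of activity. This decay model is an assumption made for illustration and is not stated in the source. It compares the fraction of activity remaining after 30 h for half-lives of about 30 h and about 47 h.

```python
import math


def residual_fraction(hours, half_life_h):
    """Fraction of initial activity left after `hours`, assuming first-order decay."""
    return 0.5 ** (hours / half_life_h)


def half_life_from_fraction(hours, fraction):
    """Half-life implied by a measured residual fraction, same assumption."""
    return hours * math.log(0.5) / math.log(fraction)


# Half-lives reported above: about 30 h (sequence 1) and about 47 h (sequence 2).
print(residual_fraction(30, 30))  # original enzyme after 30 h
print(residual_fraction(30, 47))  # modified enzyme after 30 h
```

Under this assumption the original enzyme retains half of its activity after 30 h, whereas the enzyme of sequence 2 retains roughly 64%.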

The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Sequence listing

<110> Liu Ying

<120> heparinase III, nucleotide sequence encoding same, recombinant vector and host cell comprising nucleotide sequence and application

<130> TPE01497

<160> 5

<170> PatentIn version 3.5

<210> 1

<211> 644

<212> PRT

<213> Artificial sequence

<220>

<223> Artificial sequence description: artificially synthesized sequences

<400> 1

Met Gln Ser Ser Ser Ile Thr Arg Lys Asp Phe Asp His Ile Asn Leu

1 5 10 15

Glu Tyr Ser Gly Leu Glu Lys Val Asn Lys Ala Val Ala Ala Gly Asn

20 25 30

Tyr Asp Asp Ala Ala Lys Ala Leu Leu Ala Tyr Tyr Arg Glu Lys Ser

35 40 45

Lys Ala Arg Glu Pro Asp Phe Ser Asn Ala Glu Lys Pro Ala Asp Ile

50 55 60

Arg Gln Pro Ile Asp Lys Val Thr Arg Glu Met Ala Asp Lys Ala Leu

65 70 75 80

Val His Gln Phe Gln Pro His Lys Gly Tyr Gly Tyr Phe Asp Tyr Gly

85 90 95

Lys Asp Ile Asn Trp Gln Met Trp Pro Val Lys Asp Asn Glu Val Arg

100 105 110

Trp Gln Leu His Arg Val Lys Trp Trp Gln Ala Met Ala Leu Val Tyr

115 120 125

His Ala Thr Gly Asp Glu Lys Tyr Ala Arg Glu Trp Val Tyr Gln Tyr

130 135 140

Ser Asp Trp Ala Arg Lys Asn Pro Leu Gly Leu Ser Gln Asp Asn Asp

145 150 155 160

Lys Phe Val Trp Arg Pro Leu Glu Val Ser Asp Arg Val Gln Ser Leu

165 170 175

Pro Pro Thr Phe Ser Leu Phe Val Asn Ser Pro Ala Phe Thr Pro Ala

180 185 190

Phe Leu Met Glu Phe Leu Asn Ser Tyr His Gln Gln Ala Asp Tyr Leu

195 200 205

Ser Thr His Tyr Ala Glu Gln Gly Asn His Arg Leu Phe Glu Ala Gln

210 215 220

Arg Asn Leu Phe Ala Gly Val Ser Phe Pro Glu Phe Lys Asp Ser Pro

225 230 235 240

Arg Trp Arg Gln Thr Gly Ile Ser Val Leu Asn Thr Glu Ile Lys Lys

245 250 255

Gln Val Tyr Ala Asp Gly Met Gln Phe Glu Leu Ser Pro Ile Tyr His

260 265 270

Val Ala Ala Ile Asp Ile Phe Leu Lys Ala Tyr Gly Ser Ala Lys Arg

275 280 285

Val Asn Leu Glu Lys Glu Phe Pro Gln Ser Tyr Val Gln Thr Val Glu

290 295 300

Asn Met Ile Met Ala Leu Ile Ser Ile Ser Leu Pro Asp Tyr Asn Thr

305 310 315 320

Pro Met Phe Gly Asp Ser Trp Ile Thr Asp Lys Asn Phe Arg Met Ala

325 330 335

Gln Phe Ala Ser Trp Ala Arg Val Phe Pro Ala Asn Gln Ala Ile Lys

340 345 350

Tyr Phe Ala Thr Asp Gly Lys Gln Gly Lys Ala Pro Asn Phe Leu Ser

355 360 365

Lys Ala Leu Ser Asn Ala Gly Phe Tyr Thr Phe Arg Ser Gly Trp Asp

370 375 380

Lys Asn Ala Thr Val Met Val Leu Lys Ala Ser Pro Pro Gly Glu Phe

385 390 395 400

His Ala Gln Pro Asp Asn Gly Thr Phe Glu Leu Phe Ile Lys Gly Arg

405 410 415

Asn Phe Thr Pro Asp Ala Gly Val Phe Val Tyr Ser Gly Asp Glu Ala

420 425 430

Ile Met Lys Leu Arg Asn Trp Tyr Arg Gln Thr Arg Ile His Ser Thr

435 440 445

Leu Thr Leu Asp Asn Gln Asn Met Val Ile Thr Lys Ala Arg Gln Asn

450 455 460

Lys Trp Glu Thr Gly Asn Asn Leu Asp Val Leu Thr Tyr Thr Asn Pro

465 470 475 480

Ser Tyr Pro Asn Leu Asp His Gln Arg Ser Val Leu Phe Ile Asn Lys

485 490 495

Lys Tyr Phe Leu Val Ile Asp Arg Ala Ile Gly Glu Ala Thr Gly Asn

500 505 510

Leu Gly Val His Trp Gln Leu Lys Glu Asp Ser Asn Pro Val Phe Asp

515 520 525

Lys Thr Lys Asn Arg Val Tyr Thr Thr Tyr Arg Asp Gly Asn Asn Leu

530 535 540

Met Ile Gln Ser Leu Asn Ala Asp Arg Thr Ser Leu Asn Glu Glu Glu

545 550 555 560

Gly Lys Val Ser Tyr Val Tyr Asn Lys Glu Leu Lys Arg Pro Ala Phe

565 570 575

Val Phe Glu Lys Pro Lys Lys Asn Ala Gly Thr Gln Asn Phe Val Ser

580 585 590

Ile Val Tyr Pro Tyr Asp Gly Gln Lys Ala Pro Glu Ile Ser Ile Arg

595 600 605

Glu Asn Lys Gly Asn Asp Phe Glu Lys Gly Lys Leu Asn Leu Thr Leu

610 615 620

Thr Ile Asn Gly Lys Gln Gln Leu Val Leu Val Pro Trp Ser His Pro

625 630 635 640

Gln Phe Glu Lys

<210> 2

<211> 644

<212> PRT

<213> Artificial sequence

<220>

<223> Artificial sequence description: artificially synthesized sequences

<400> 2

Met Gln Ser Ser Ser Ile Thr Arg Lys Asp Phe Asp His Ile Asn Leu

1 5 10 15

Glu Tyr Ser Gly Leu Glu Lys Val Asn Lys Ala Val Ala Ala Gly Asn

20 25 30

Tyr Asp Asp Ala Ala Lys Ala Leu Leu Ala Tyr Tyr Arg Glu Lys Ser

35 40 45

Lys Ala Arg Glu Pro Asp Phe Ser Asn Ala Glu Lys Pro Ala Asp Ile

50 55 60

Arg Gln Pro Ile Asp Lys Val Thr Arg Glu Met Ala Asp Lys Ala Leu

65 70 75 80

Val His Gln Phe Ala Pro His Lys Gly Tyr Gly Tyr Phe Asp Tyr Gly

85 90 95

Lys Asp Ile Asn Trp Gln Met Trp Pro Val Lys Asp Asn Glu Val Arg

100 105 110

Trp Ala Leu His Arg Val Lys Trp Trp Gln Ala Met Ala Leu Val Tyr

115 120 125

His Ala Thr Gly Asp Glu Lys Tyr Ala Arg Glu Trp Val Tyr Gln Tyr

130 135 140

Ser Asp Trp Ala Arg Lys Asn Pro Leu Gly Leu Ser Gln Asp Asn Asp

145 150 155 160

Lys Phe Val Trp Arg Pro Leu Glu Val Ser Asp Arg Val Gln Ser Leu

165 170 175

Pro Pro Thr Phe Ser Leu Phe Val Asn Ser Pro Ala Phe Thr Pro Ala

180 185 190

Phe Leu Met Glu Phe Leu Asn Ser Tyr His Gln Gln Ala Asp Tyr Leu

195 200 205

Ser Thr His Tyr Ala Glu Gln Gly Asn His Arg Leu Phe Glu Ala Gln

210 215 220

Arg Asn Leu Phe Ala Gly Val Ser Phe Pro Glu Phe Lys Asp Ser Pro

225 230 235 240

Arg Trp Arg Gln Thr Gly Ile Ser Val Leu Asn Thr Glu Ile Lys Lys

245 250 255

Gln Val Tyr Ala Asp Gly Met Gln Phe Glu Leu Ser Pro Ile Tyr His

260 265 270

Val Ala Ala Ile Asp Ile Phe Leu Lys Ala Tyr Gly Ser Ala Lys Arg

275 280 285

Val Asn Leu Glu Lys Glu Phe Pro Gln Ser Tyr Val Gln Thr Val Glu

290 295 300

Asn Met Ile Met Ala Leu Ile Ser Ile Ser Leu Pro Asp Tyr Asn Thr

305 310 315 320

Pro Met Phe Gly Asp Ser Trp Ile Thr Asp Lys Asn Phe Arg Met Ala

325 330 335

Gln Phe Ala Ser Trp Ala Arg Val Phe Pro Ala Asn Gln Ala Ile Lys

340 345 350

Tyr Phe Ala Thr Asp Gly Lys Gln Gly Lys Ala Pro Asn Phe Leu Ser

355 360 365

Lys Ala Leu Ser Asn Ala Gly Phe Tyr Thr Phe Arg Ser Gly Trp Asp

370 375 380

Lys Asn Ala Thr Val Met Val Leu Lys Ala Ser Pro Pro Gly Glu Phe

385 390 395 400

His Ala Ala Pro Asp Asn Gly Thr Phe Glu Leu Phe Ile Lys Gly Arg

405 410 415

Asn Phe Thr Pro Asp Ala Gly Val Phe Val Tyr Ser Gly Asp Glu Ala

420 425 430

Ile Met Lys Leu Arg Asn Trp Tyr Arg Gln Thr Arg Ile His Ser Thr

435 440 445

Leu Thr Leu Asp Asn Gln Asn Met Val Ile Thr Lys Ala Arg Gln Asn

450 455 460

Lys Trp Glu Thr Gly Asn Asn Leu Asp Val Leu Thr Tyr Thr Asn Pro

465 470 475 480

Ser Tyr Pro Asn Leu Asp His Gln Arg Ser Val Leu Phe Ile Asn Lys

485 490 495

Lys Tyr Phe Leu Val Ile Asp Arg Ala Ile Gly Glu Ala Thr Gly Asn

500 505 510

Leu Gly Val His Trp Gln Leu Lys Glu Asp Ser Asn Pro Val Phe Asp

515 520 525

Lys Thr Lys Asn Arg Val Tyr Thr Thr Tyr Arg Asp Gly Asn Asn Leu

530 535 540

Met Ile Ala Ser Leu Asn Ala Asp Arg Thr Ser Leu Asn Glu Glu Glu

545 550 555 560

Gly Lys Val Ser Tyr Val Tyr Asn Lys Glu Leu Lys Arg Pro Ala Phe

565 570 575

Val Phe Glu Lys Pro Lys Lys Asn Ala Gly Thr Gln Asn Phe Val Ser

580 585 590

Ile Val Tyr Pro Tyr Asp Gly Gln Lys Ala Pro Glu Ile Ser Ile Arg

595 600 605

Glu Asn Lys Gly Asn Asp Phe Glu Lys Gly Lys Leu Asn Leu Thr Leu

610 615 620

Thr Ile Asn Gly Lys Gln Gln Leu Val Leu Val Pro Trp Ser His Pro

625 630 635 640

Gln Phe Glu Lys

<210> 3

<211> 644

<212> PRT

<213> Artificial sequence

<220>

<223> Artificial sequence description: artificially synthesized sequences

<400> 3

Met Gln Ser Ser Ser Ile Thr Arg Lys Asp Phe Asp His Ile Asn Leu

1 5 10 15

Glu Tyr Ser Gly Leu Glu Lys Val Asn Lys Ala Val Ala Ala Gly Asn

20 25 30

Tyr Asp Asp Ala Ala Lys Ala Leu Leu Ala Tyr Tyr Arg Glu Lys Ser

35 40 45

Lys Ala Arg Glu Pro Asp Phe Ser Asn Ala Glu Lys Pro Ala Asp Ile

50 55 60

Arg Gln Pro Ile Asp Lys Val Thr Arg Glu Met Ala Asp Lys Ala Leu

65 70 75 80

Val His Gln Phe Ala Pro His Lys Gly Tyr Gly Tyr Phe Asp Tyr Gly

85 90 95

Lys Asp Ile Asn Trp Gln Met Trp Pro Val Lys Asp Asn Glu Val Arg

100 105 110

Trp Ala Leu His Arg Val Lys Trp Trp Gln Ala Met Ala Leu Val Tyr

115 120 125

His Ala Thr Gly Asp Glu Lys Tyr Ala Arg Glu Trp Val Tyr Gln Tyr

130 135 140

Ser Asp Trp Ala Arg Lys Asn Pro Leu Gly Leu Ser Gln Asp Asn Asp

145 150 155 160

Lys Phe Val Trp Arg Pro Leu Glu Val Ser Asp Arg Val Gln Ser Leu

165 170 175

Pro Pro Thr Phe Ser Leu Phe Val Asn Ser Pro Ala Phe Thr Pro Ala

180 185 190

Phe Leu Met Glu Phe Leu Asn Ser Tyr His Gln Gln Ala Asp Tyr Leu

195 200 205

Ser Thr His Tyr Ala Glu Gln Gly Asn His Arg Leu Phe Glu Ala Gln

210 215 220

Arg Asn Leu Phe Ala Gly Val Ser Phe Pro Glu Phe Lys Asp Ser Pro

225 230 235 240

Arg Trp Arg Gln Thr Gly Ile Ser Val Leu Asn Thr Glu Ile Lys Lys

245 250 255

Gln Val Tyr Ala Asp Gly Met Gln Phe Glu Leu Ser Pro Ile Tyr His

260 265 270

Val Ala Ala Ile Asp Ile Phe Leu Lys Ala Tyr Gly Ser Ala Lys Arg

275 280 285

Val Asn Leu Glu Lys Glu Phe Pro Gln Ser Tyr Val Gln Thr Val Glu

290 295 300

Asn Met Ile Met Ala Leu Ile Ser Ile Ser Leu Pro Asp Tyr Asn Thr

305 310 315 320

Pro Met Phe Gly Asp Ser Trp Ile Thr Asp Lys Asn Phe Arg Met Ala

325 330 335

Gln Phe Ala Ser Trp Ala Arg Val Phe Pro Ala Asn Gln Ala Ile Lys

340 345 350

Tyr Phe Ala Thr Asp Gly Lys Gln Gly Lys Ala Pro Asn Phe Leu Ser

355 360 365

Lys Ala Leu Ser Asn Ala Gly Phe Tyr Thr Phe Arg Ser Gly Trp Asp

370 375 380

Lys Asn Ala Thr Val Met Val Leu Lys Ala Ser Pro Pro Gly Glu Phe

385 390 395 400

His Ala Val Pro Asp Asn Gly Thr Phe Glu Leu Phe Ile Lys Gly Arg

405 410 415

Asn Phe Thr Pro Asp Ala Gly Val Phe Val Tyr Ser Gly Asp Glu Ala

420 425 430

Ile Met Lys Leu Arg Asn Trp Tyr Arg Gln Thr Arg Ile His Ser Thr

435 440 445

Leu Thr Leu Asp Asn Gln Asn Met Val Ile Thr Lys Ala Arg Gln Asn

450 455 460

Lys Trp Glu Thr Gly Asn Asn Leu Asp Val Leu Thr Tyr Thr Asn Pro

465 470 475 480

Ser Tyr Pro Asn Leu Asp His Gln Arg Ser Val Leu Phe Ile Asn Lys

485 490 495

Lys Tyr Phe Leu Val Ile Asp Arg Ala Ile Gly Glu Ala Thr Gly Asn

500 505 510

Leu Gly Val His Trp Gln Leu Lys Glu Asp Ser Asn Pro Val Phe Asp

515 520 525

Lys Thr Lys Asn Arg Val Tyr Thr Thr Tyr Arg Asp Gly Asn Asn Leu

530 535 540

Met Ile Ala Ser Leu Asn Ala Asp Arg Thr Ser Leu Asn Glu Glu Glu

545 550 555 560

Gly Lys Val Ser Tyr Val Tyr Asn Lys Glu Leu Lys Arg Pro Ala Phe

565 570 575

Val Phe Glu Lys Pro Lys Lys Asn Ala Gly Thr Gln Asn Phe Val Ser

580 585 590

Ile Val Tyr Pro Tyr Asp Gly Gln Lys Ala Pro Glu Ile Ser Ile Arg

595 600 605

Glu Asn Lys Gly Asn Asp Phe Glu Lys Gly Lys Leu Asn Leu Thr Leu

610 615 620

Thr Ile Asn Gly Lys Gln Gln Leu Val Leu Val Pro Trp Ser His Pro

625 630 635 640

Gln Phe Glu Lys

<210> 4

<211> 1935

<212> DNA

<213> Artificial sequence

<220>

<223> Artificial sequence description: artificially synthesized sequences

<400> 4

atgcaatctt cttctatcac tagaaaggat ttcgatcata tcaacttgga gtattctggt 60

ttggaaaagg ttaacaaagc tgttgctgct ggtaattacg atgatgctgc taaggctttg 120

ttggcttact atagagagaa gtctaaagct agagaaccag atttttctaa tgctgagaag 180

ccagctgata tcagacaacc tatcgataag gttactagag aaatggctga taaggctttg 240

gttcatcaat tcgctccaca caagggttac ggttacttcg attacggtaa agatatcaac 300

tggcaaatgt ggcctgttaa ggataacgaa gttagatggg ctttgcatcg tgttaagtgg 360

tggcaagcta tggctttggt ttatcatgct actggagatg agaagtatgc tagagaatgg 420

gtttaccaat attctgattg ggctagaaag aacccattgg gtttgtctca agataacgat 480

aagttcgttt ggagaccttt ggaggtttct gatagagttc aatctttgcc acctactttt 540

tctttgttcg ttaactctcc agcttttact cctgctttct tgatggaatt tttgaactct 600

taccatcaac aagctgatta cttgtctact cattatgctg agcaaggtaa ccacagattg 660

ttcgaagctc aaagaaattt gtttgctggt gtttcttttc cagagtttaa ggattctcct 720

agatggagac aaactggtat ctctgttttg aacactgaga ttaagaaaca agtttacgct 780

gatggtatgc aattcgaatt gtctccaatc taccacgttg ctgctattga tattttcttg 840

aaggcttacg gttctgctaa aagagttaac ttggagaagg aatttcctca atcttacgtt 900

caaactgttg aaaacatgat catggctttg atctctattt ctttgccaga ttacaacact 960

cctatgtttg gagattcttg gatcactgat aagaacttca gaatggctca atttgcttct 1020

tgggctagag ttttcccagc taaccaagct attaaatact tcgctactga tggtaaacag 1080

ggtaaagctc ctaacttctt gtctaaggct ttgtctaatg ctggttttta cactttcaga 1140

tctggttggg ataaaaatgc tactgttatg gttttgaagg cttctccacc tggagagttt 1200

catgctgctc cagataacgg tactttcgaa ttgttcatta agggtagaaa cttcactcct 1260

gatgctggtg tttttgttta ttctggagat gaggctatta tgaagttgag aaactggtac 1320

agacaaacta gaatccattc tactttgact ttggataacc aaaacatggt tattactaag 1380

gctagacaaa acaagtggga aactggtaac aatttggatg ttttgactta cactaaccca 1440

tcttatccta atttggatca tcaaagatct gttttgttca ttaacaagaa atactttttg 1500

gttattgata gagctatcgg agaggctact ggtaatttgg gtgttcactg gcaattgaag 1560

gaagattcta acccagtttt cgataagact aaaaatagag tttacactac ttacagagat 1620

ggtaacaatt tgatgattgc ttctttgaac gctgatagaa cttctttgaa tgaagaggag 1680

ggtaaagttt cttacgttta caataaggag ttgaaaagac cagcttttgt tttcgaaaag 1740

cctaagaaaa acgctggtac tcaaaacttc gtttctatcg tttacccata tgatggtcaa 1800

aaagctcctg agatctctat cagagaaaac aagggtaacg atttcgagaa gggtaaattg 1860

aacttgactt tgactattaa tggtaaacaa caattggttt tggttccatg gtctcaccct 1920

caatttgaaa agtaa 1935

<210> 5

<211> 1935

<212> DNA

<213> Artificial sequence

<220>

<223> Artificial sequence description: artificially synthesized sequences

<400> 5

atgcaatctt cttctatcac tagaaaggat ttcgatcata tcaacttgga gtattctggt 60

ttggaaaagg ttaacaaagc tgttgctgct ggtaattacg atgatgctgc taaggctttg 120

ttggcttact atagagagaa gtctaaagct agagaaccag atttttctaa tgctgagaag 180

ccagctgata tcagacaacc tatcgataag gttactagag aaatggctga taaggctttg 240

gttcatcaat tcgctccaca caagggttac ggttacttcg attacggtaa agatatcaac 300

tggcaaatgt ggcctgttaa ggataacgaa gttagatggg ctttgcatcg tgttaagtgg 360

tggcaagcta tggctttggt ttatcatgct actggagatg agaagtatgc tagagaatgg 420

gtttaccaat attctgattg ggctagaaag aacccattgg gtttgtctca agataacgat 480

aagttcgttt ggagaccttt ggaggtttct gatagagttc aatctttgcc acctactttt 540

tctttgttcg ttaactctcc agcttttact cctgctttct tgatggaatt tttgaactct 600

taccatcaac aagctgatta cttgtctact cattatgctg agcaaggtaa ccacagattg 660

ttcgaagctc aaagaaattt gtttgctggt gtttcttttc cagagtttaa ggattctcct 720

agatggagac aaactggtat ctctgttttg aacactgaga ttaagaaaca agtttacgct 780

gatggtatgc aattcgaatt gtctccaatc taccacgttg ctgctattga tattttcttg 840

aaggcttacg gttctgctaa aagagttaac ttggagaagg aatttcctca atcttacgtt 900

caaactgttg aaaacatgat catggctttg atctctattt ctttgccaga ttacaacact 960

cctatgtttg gagattcttg gatcactgat aagaacttca gaatggctca atttgcttct 1020

tgggctagag ttttcccagc taaccaagct attaaatact tcgctactga tggtaaacag 1080

ggtaaagctc ctaacttctt gtctaaggct ttgtctaatg ctggttttta cactttcaga 1140

tctggttggg ataaaaatgc tactgttatg gttttgaagg cttctccacc tggagagttt 1200

catgctgttc cagataacgg tactttcgaa ttgttcatta agggtagaaa cttcactcct 1260

gatgctggtg tttttgttta ttctggagat gaggctatta tgaagttgag aaactggtac 1320

agacaaacta gaatccattc tactttgact ttggataacc aaaacatggt tattactaag 1380

gctagacaaa acaagtggga aactggtaac aatttggatg ttttgactta cactaaccca 1440

tcttatccta atttggatca tcaaagatct gttttgttca ttaacaagaa atactttttg 1500

gttattgata gagctatcgg agaggctact ggtaatttgg gtgttcactg gcaattgaag 1560

gaagattcta acccagtttt cgataagact aaaaatagag tttacactac ttacagagat 1620

ggtaacaatt tgatgattgc ttctttgaac gctgatagaa cttctttgaa tgaagaggag 1680

ggtaaagttt cttacgttta caataaggag ttgaaaagac cagcttttgt tttcgaaaag 1740

cctaagaaaa acgctggtac tcaaaacttc gtttctatcg tttacccata tgatggtcaa 1800

aaagctcctg agatctctat cagagaaaac aagggtaacg atttcgagaa gggtaaattg 1860

aacttgactt tgactattaa tggtaaacaa caattggttt tggttccatg gtctcaccct 1920

caatttgaaa agtaa 1935

Claims

1. Heparinase III comprising the amino acid sequence shown as Seq ID No.2 or Seq ID No. 3.

2. A nucleotide sequence encoding the heparinase III of claim 1.

3. The nucleotide sequence of claim 2, wherein the nucleotide sequence is as set forth in Seq ID No.4 or Seq ID No. 5.

4. A recombinant vector comprising the nucleotide sequence of claim 2 or 3.

5. The recombinant vector according to claim 4, wherein the recombinant vector is a eukaryotic recombinant vector.

6. The recombinant vector according to claim 5, wherein the eukaryotic recombinant vector is any one of pPink-HC, pPICZαA and pPICZ A.

7. The recombinant vector according to claim 6, wherein the eukaryotic recombinant vector is pPink-HC.

8. A host cell comprising the recombinant vector of any one of claims 4 to 7.

9. The host cell of claim 8, wherein the host cell is Pichia pastoris; preferably, the Pichia pastoris is X33.

10. A process for the preparation of heparinase III according to claim 1, characterized in that it comprises the following steps:

(a) synthesizing a nucleotide sequence encoding the heparinase III of claim 1;

(c) introducing the recombinant vector in the step (b) into a host cell, then culturing, inducing and expressing, and purifying to obtain heparinase III;

preferably, in step (c), the purification is carried out on a Strep-Tactin column using Buffer W.