CN102533800A - A marine Streptomyces halogenase gene and its product, and a biosynthetic gene cluster of its modified product
- Publication number
- CN102533800A CN2011103330758A CN201110333075A
- Authority
- CN
- China
- Prior art keywords
- gene
- halogenase
- streptomyces
- sequence
- library
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 241000187747 Streptomyces Species 0.000 title claims abstract description 68
- 108090000623 proteins and genes Proteins 0.000 title claims abstract description 66
- 108091008053 gene clusters Proteins 0.000 title claims abstract description 47
- 230000001851 biosynthetic effect Effects 0.000 title description 9
- 229920001184 polypeptide Polymers 0.000 claims abstract description 18
- 108090000765 processed proteins & peptides Proteins 0.000 claims abstract description 18
- 102000004196 processed proteins & peptides Human genes 0.000 claims abstract description 18
- 210000003705 ribosome Anatomy 0.000 claims abstract description 17
- 108090000790 Enzymes Proteins 0.000 claims abstract description 12
- 108020004707 nucleic acids Proteins 0.000 claims abstract description 12
- 102000039446 nucleic acids Human genes 0.000 claims abstract description 12
- 150000007523 nucleic acids Chemical class 0.000 claims abstract description 12
- 102000004190 Enzymes Human genes 0.000 claims abstract description 11
- 108090000364 Ligases Proteins 0.000 claims abstract description 9
- 238000000034 method Methods 0.000 claims abstract description 8
- 125000003729 nucleotide group Chemical group 0.000 claims abstract description 7
- 239000002773 nucleotide Substances 0.000 claims abstract description 6
- 125000003275 alpha amino acid group Chemical group 0.000 claims abstract 3
- JJGZGELTZPACID-OTLJHNKQSA-N chloropeptin II Chemical class N([C@@H]1CC=2C3=CC=C(C=C3NC=2)C=2C=C3C=C(C=2O)OC2=CC=C(C=C2)C[C@H](N(C([C@@H](C=2C=C(Cl)C(O)=C(Cl)C=2)NC(=O)[C@@H]3NC(=O)[C@@H](C=2C=C(Cl)C(O)=C(Cl)C=2)NC1=O)=O)C)C(=O)N[C@@H](C(O)=O)C=1C=CC(O)=CC=1)C(=O)C(=O)C1=CC(Cl)=C(O)C(Cl)=C1 JJGZGELTZPACID-OTLJHNKQSA-N 0.000 claims description 22
- 238000004458 analytical method Methods 0.000 claims description 21
- 150000001413 amino acids Chemical class 0.000 claims description 19
- 102000004169 proteins and genes Human genes 0.000 claims description 19
- 239000012634 fragment Substances 0.000 claims description 15
- 238000012163 sequencing technique Methods 0.000 claims description 15
- 150000001875 compounds Chemical class 0.000 claims description 12
- 102000003960 Ligases Human genes 0.000 claims description 7
- 238000004519 manufacturing process Methods 0.000 claims description 5
- 239000013612 plasmid Substances 0.000 claims description 4
- 241000894006 Bacteria Species 0.000 claims description 3
- 239000011248 coating agent Substances 0.000 claims description 2
- 238000000576 coating method Methods 0.000 claims description 2
- 238000010276 construction Methods 0.000 claims description 2
- 238000001890 transfection Methods 0.000 claims description 2
- 125000001475 halogen functional group Chemical group 0.000 claims 5
- 235000018102 proteins Nutrition 0.000 claims 2
- 235000001014 amino acid Nutrition 0.000 claims 1
- 238000011156 evaluation Methods 0.000 claims 1
- 230000004853 protein function Effects 0.000 claims 1
- 238000012216 screening Methods 0.000 abstract description 7
- 239000000523 sample Substances 0.000 abstract description 6
- 238000010353 genetic engineering Methods 0.000 abstract description 2
- 108091008146 restriction endonucleases Proteins 0.000 abstract description 2
- 238000003752 polymerase chain reaction Methods 0.000 abstract 3
- 238000001243 protein synthesis Methods 0.000 abstract 2
- 230000014616 translation Effects 0.000 abstract 2
- 108020004414 DNA Proteins 0.000 description 27
- 239000000047 product Substances 0.000 description 25
- 108010029904 complestatin Proteins 0.000 description 17
- JJGZGELTZPACID-UHFFFAOYSA-N isocomplestatin Natural products O=C1NC(C=2C=C(Cl)C(O)=C(Cl)C=2)C(=O)NC2C(=O)NC(C=3C=C(Cl)C(O)=C(Cl)C=3)C(=O)N(C)C(C(=O)NC(C(O)=O)C=3C=CC(O)=CC=3)CC(C=C3)=CC=C3OC(C=3O)=CC2=CC=3C(C=C2NC=3)=CC=C2C=3CC1NC(=O)C(=O)C1=CC(Cl)=C(O)C(Cl)=C1 JJGZGELTZPACID-UHFFFAOYSA-N 0.000 description 16
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 6
- 241000187389 Streptomyces lavendulae Species 0.000 description 5
- 241001036087 Streptomyces xinghaiensis Species 0.000 description 5
- 239000013078 crystal Substances 0.000 description 5
- 230000000694 effects Effects 0.000 description 5
- 238000001962 electrophoresis Methods 0.000 description 5
- 108700026244 Open Reading Frames Proteins 0.000 description 4
- 238000012300 Sequence Analysis Methods 0.000 description 4
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 4
- 239000003242 anti bacterial agent Substances 0.000 description 4
- 229940088710 antibiotic agent Drugs 0.000 description 4
- 230000027455 binding Effects 0.000 description 4
- 230000015572 biosynthetic process Effects 0.000 description 4
- 229940088598 enzyme Drugs 0.000 description 4
- 239000000203 mixture Substances 0.000 description 4
- IAZDPXIOMUYVGZ-UHFFFAOYSA-N Dimethylsulphoxide Chemical compound CS(C)=O IAZDPXIOMUYVGZ-UHFFFAOYSA-N 0.000 description 3
- 241000588724 Escherichia coli Species 0.000 description 3
- YPZRHBJKEMOYQH-UYBVJOGSSA-N FADH2 Chemical compound C1=NC2=C(N)N=CN=C2N1[C@@H]([C@H](O)[C@@H]1O)O[C@@H]1COP(O)(=O)OP(O)(=O)OC[C@@H](O)[C@@H](O)[C@@H](O)CN1C(NC(=O)NC2=O)=C2NC2=C1C=C(C)C(C)=C2 YPZRHBJKEMOYQH-UYBVJOGSSA-N 0.000 description 3
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 3
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 3
- 238000006243 chemical reaction Methods 0.000 description 3
- 229960005091 chloramphenicol Drugs 0.000 description 3
- WIIZWVCIJKGZOK-RKDXNWHRSA-N chloramphenicol Chemical compound ClC(Cl)C(=O)N[C@H](CO)[C@H](O)C1=CC=C([N+]([O-])=O)C=C1 WIIZWVCIJKGZOK-RKDXNWHRSA-N 0.000 description 3
- 230000001419 dependent effect Effects 0.000 description 3
- 238000004806 packaging method and process Methods 0.000 description 3
- 238000011160 research Methods 0.000 description 3
- 238000003786 synthesis reaction Methods 0.000 description 3
- 238000007862 touchdown PCR Methods 0.000 description 3
- NTUPOKHATNSWCY-PMPSAXMXSA-N (2s)-2-[[(2s)-1-[(2r)-2-amino-3-phenylpropanoyl]pyrrolidine-2-carbonyl]amino]-5-(diaminomethylideneamino)pentanoic acid Chemical compound C([C@@H](N)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)C1=CC=CC=C1 NTUPOKHATNSWCY-PMPSAXMXSA-N 0.000 description 2
- -1 ANSAs Chemical class 0.000 description 2
- HEDRZPFGACZZDS-UHFFFAOYSA-N Chloroform Chemical compound ClC(Cl)Cl HEDRZPFGACZZDS-UHFFFAOYSA-N 0.000 description 2
- 108010015899 Glycopeptides Proteins 0.000 description 2
- 102000002068 Glycopeptides Human genes 0.000 description 2
- KFZMGEQAYNKOFK-UHFFFAOYSA-N Isopropanol Chemical compound CC(C)O KFZMGEQAYNKOFK-UHFFFAOYSA-N 0.000 description 2
- 108091028043 Nucleic acid sequence Proteins 0.000 description 2
- 238000012408 PCR amplification Methods 0.000 description 2
- JWQWPTLEOFNCGX-AVGNSLFASA-N Phe-Glu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 JWQWPTLEOFNCGX-AVGNSLFASA-N 0.000 description 2
- 230000002391 anti-complement effect Effects 0.000 description 2
- 108010008730 anticomplement Proteins 0.000 description 2
- 230000003115 biocidal effect Effects 0.000 description 2
- 230000004071 biological effect Effects 0.000 description 2
- 210000004027 cell Anatomy 0.000 description 2
- 230000004154 complement system Effects 0.000 description 2
- 230000009089 cytolysis Effects 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 201000010099 disease Diseases 0.000 description 2
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 2
- 229940079593 drug Drugs 0.000 description 2
- 239000003814 drug Substances 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 238000001976 enzyme digestion Methods 0.000 description 2
- 108010067216 glycyl-glycyl-glycine Proteins 0.000 description 2
- XKUKSGPZAADMRA-UHFFFAOYSA-N glycyl-glycyl-glycine Natural products NCC(=O)NCC(=O)NCC(O)=O XKUKSGPZAADMRA-UHFFFAOYSA-N 0.000 description 2
- 108010015792 glycyllysine Proteins 0.000 description 2
- 238000000338 in vitro Methods 0.000 description 2
- 230000002401 inhibitory effect Effects 0.000 description 2
- 239000002609 medium Substances 0.000 description 2
- 239000011780 sodium chloride Substances 0.000 description 2
- 125000000430 tryptophan group Chemical group [H]N([H])C(C(=O)O*)C([H])([H])C1=C([H])N([H])C2=C([H])C([H])=C([H])C([H])=C12 0.000 description 2
- XVZCXCTYGHPNEM-IHRRRGAJSA-N (2s)-1-[(2s)-2-[[(2s)-2-amino-4-methylpentanoyl]amino]-4-methylpentanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(O)=O XVZCXCTYGHPNEM-IHRRRGAJSA-N 0.000 description 1
- 108020004465 16S ribosomal RNA Proteins 0.000 description 1
- 241000588626 Acinetobacter baumannii Species 0.000 description 1
- 241000186361 Actinobacteria <class> Species 0.000 description 1
- UWQJHXKARZWDIJ-ZLUOBGJFSA-N Ala-Ala-Cys Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CS)C(O)=O UWQJHXKARZWDIJ-ZLUOBGJFSA-N 0.000 description 1
- RLMISHABBKUNFO-WHFBIAKZSA-N Ala-Ala-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O RLMISHABBKUNFO-WHFBIAKZSA-N 0.000 description 1
- XEXJJJRVTFGWIC-FXQIFTODSA-N Ala-Asn-Arg Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N XEXJJJRVTFGWIC-FXQIFTODSA-N 0.000 description 1
- 108010040956 Ala-Asp-Glu-Leu Proteins 0.000 description 1
- KIUYPHAMDKDICO-WHFBIAKZSA-N Ala-Asp-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O KIUYPHAMDKDICO-WHFBIAKZSA-N 0.000 description 1
- IFTVANMRTIHKML-WDSKDSINSA-N Ala-Gln-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O IFTVANMRTIHKML-WDSKDSINSA-N 0.000 description 1
- BLIMFWGRQKRCGT-YUMQZZPRSA-N Ala-Gly-Lys Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN BLIMFWGRQKRCGT-YUMQZZPRSA-N 0.000 description 1
- VCSABYLVNWQYQE-SRVKXCTJSA-N Ala-Lys-Lys Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCCN)C(O)=O VCSABYLVNWQYQE-SRVKXCTJSA-N 0.000 description 1
- VCSABYLVNWQYQE-UHFFFAOYSA-N Ala-Lys-Lys Natural products NCCCCC(NC(=O)C(N)C)C(=O)NC(CCCCN)C(O)=O VCSABYLVNWQYQE-UHFFFAOYSA-N 0.000 description 1
- ADSGHMXEAZJJNF-DCAQKATOSA-N Ala-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](C)N ADSGHMXEAZJJNF-DCAQKATOSA-N 0.000 description 1
- JJHBEVZAZXZREW-LFSVMHDDSA-N Ala-Thr-Phe Chemical compound C[C@@H](O)[C@H](NC(=O)[C@H](C)N)C(=O)N[C@@H](Cc1ccccc1)C(O)=O JJHBEVZAZXZREW-LFSVMHDDSA-N 0.000 description 1
- 235000010585 Ammi visnaga Nutrition 0.000 description 1
- 244000153158 Ammi visnaga Species 0.000 description 1
- VBFJESQBIWCWRL-DCAQKATOSA-N Arg-Ala-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCNC(N)=N VBFJESQBIWCWRL-DCAQKATOSA-N 0.000 description 1
- GIVATXIGCXFQQA-FXQIFTODSA-N Arg-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N GIVATXIGCXFQQA-FXQIFTODSA-N 0.000 description 1
- QAXCZGMLVICQKS-SRVKXCTJSA-N Arg-Glu-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCN=C(N)N)N QAXCZGMLVICQKS-SRVKXCTJSA-N 0.000 description 1
- DJAIOAKQIOGULM-DCAQKATOSA-N Arg-Glu-Met Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O DJAIOAKQIOGULM-DCAQKATOSA-N 0.000 description 1
- JAYIQMNQDMOBFY-KKUMJFAQSA-N Arg-Glu-Tyr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JAYIQMNQDMOBFY-KKUMJFAQSA-N 0.000 description 1
- GOWZVQXTHUCNSQ-NHCYSSNCSA-N Arg-Glu-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O GOWZVQXTHUCNSQ-NHCYSSNCSA-N 0.000 description 1
- CYXCAHZVPFREJD-LURJTMIESA-N Arg-Gly-Gly Chemical compound NC(=N)NCCC[C@H](N)C(=O)NCC(=O)NCC(O)=O CYXCAHZVPFREJD-LURJTMIESA-N 0.000 description 1
- JEOCWTUOMKEEMF-RHYQMDGZSA-N Arg-Leu-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JEOCWTUOMKEEMF-RHYQMDGZSA-N 0.000 description 1
- HGKHPCFTRQDHCU-IUCAKERBSA-N Arg-Pro-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O HGKHPCFTRQDHCU-IUCAKERBSA-N 0.000 description 1
- QMQZYILAWUOLPV-JYJNAYRXSA-N Arg-Tyr-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)CC1=CC=C(O)C=C1 QMQZYILAWUOLPV-JYJNAYRXSA-N 0.000 description 1
- QLSRIZIDQXDQHK-RCWTZXSCSA-N Arg-Val-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QLSRIZIDQXDQHK-RCWTZXSCSA-N 0.000 description 1
- GQRDIVQPSMPQME-ZPFDUUQYSA-N Asn-Ile-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O GQRDIVQPSMPQME-ZPFDUUQYSA-N 0.000 description 1
- YUOXLJYVSZYPBJ-CIUDSAMLSA-N Asn-Pro-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O YUOXLJYVSZYPBJ-CIUDSAMLSA-N 0.000 description 1
- HPBNLFLSSQDFQW-WHFBIAKZSA-N Asn-Ser-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O HPBNLFLSSQDFQW-WHFBIAKZSA-N 0.000 description 1
- MYTHOBCLNIOFBL-SRVKXCTJSA-N Asn-Ser-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MYTHOBCLNIOFBL-SRVKXCTJSA-N 0.000 description 1
- OERMIMJQPQUIPK-FXQIFTODSA-N Asp-Arg-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O OERMIMJQPQUIPK-FXQIFTODSA-N 0.000 description 1
- ZRUBWRCKIVDCFS-XPCJQDJLSA-N Asp-Leu-Thr-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O ZRUBWRCKIVDCFS-XPCJQDJLSA-N 0.000 description 1
- CZIVKMOEXPILDK-SRVKXCTJSA-N Asp-Tyr-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O CZIVKMOEXPILDK-SRVKXCTJSA-N 0.000 description 1
- GZYDPEJSZYZWEF-MXAVVETBSA-N Asp-Val-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(O)=O GZYDPEJSZYZWEF-MXAVVETBSA-N 0.000 description 1
- 206010003757 Atypical pneumonia Diseases 0.000 description 1
- 208000035404 Autolysis Diseases 0.000 description 1
- 241000222122 Candida albicans Species 0.000 description 1
- 108010078791 Carrier Proteins Proteins 0.000 description 1
- 206010057248 Cell death Diseases 0.000 description 1
- 229930189264 Chondrochloren Natural products 0.000 description 1
- 102000004594 DNA Polymerase I Human genes 0.000 description 1
- 108010017826 DNA Polymerase I Proteins 0.000 description 1
- 238000007900 DNA-DNA hybridization Methods 0.000 description 1
- 108010067770 Endopeptidase K Proteins 0.000 description 1
- KCJJFESQRXGTGC-BQBZGAKWSA-N Gln-Glu-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O KCJJFESQRXGTGC-BQBZGAKWSA-N 0.000 description 1
- LUGUNEGJNDEBLU-DCAQKATOSA-N Gln-Met-Arg Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N LUGUNEGJNDEBLU-DCAQKATOSA-N 0.000 description 1
- BBFCMGBMYIAGRS-AUTRQRHGSA-N Gln-Val-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O BBFCMGBMYIAGRS-AUTRQRHGSA-N 0.000 description 1
- RDPOETHPAQEGDP-ACZMJKKPSA-N Glu-Asp-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O RDPOETHPAQEGDP-ACZMJKKPSA-N 0.000 description 1
- YLJHCWNDBKKOEB-IHRRRGAJSA-N Glu-Glu-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O YLJHCWNDBKKOEB-IHRRRGAJSA-N 0.000 description 1
- IRXNJYPKBVERCW-DCAQKATOSA-N Glu-Leu-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IRXNJYPKBVERCW-DCAQKATOSA-N 0.000 description 1
- VGBSZQSKQRMLHD-MNXVOIDGSA-N Glu-Leu-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VGBSZQSKQRMLHD-MNXVOIDGSA-N 0.000 description 1
- QNJNPKSWAHPYGI-JYJNAYRXSA-N Glu-Phe-Leu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)CC1=CC=CC=C1 QNJNPKSWAHPYGI-JYJNAYRXSA-N 0.000 description 1
- DXVOKNVIKORTHQ-GUBZILKMSA-N Glu-Pro-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O DXVOKNVIKORTHQ-GUBZILKMSA-N 0.000 description 1
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 1
- QPDUVFSVVAOUHE-XVKPBYJWSA-N Gly-Gln-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)CN)C(O)=O QPDUVFSVVAOUHE-XVKPBYJWSA-N 0.000 description 1
- QSVCIFZPGLOZGH-WDSKDSINSA-N Gly-Glu-Ser Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O QSVCIFZPGLOZGH-WDSKDSINSA-N 0.000 description 1
- OLPPXYMMIARYAL-QMMMGPOBSA-N Gly-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)CN OLPPXYMMIARYAL-QMMMGPOBSA-N 0.000 description 1
- MIIVFRCYJABHTQ-ONGXEEELSA-N Gly-Leu-Val Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O MIIVFRCYJABHTQ-ONGXEEELSA-N 0.000 description 1
- DBJYVKDPGIFXFO-BQBZGAKWSA-N Gly-Met-Ala Chemical compound [H]NCC(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O DBJYVKDPGIFXFO-BQBZGAKWSA-N 0.000 description 1
- QGDOOCIPHSSADO-STQMWFEESA-N Gly-Met-Phe Chemical compound [H]NCC(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QGDOOCIPHSSADO-STQMWFEESA-N 0.000 description 1
- GGLIDLCEPDHEJO-BQBZGAKWSA-N Gly-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)CN GGLIDLCEPDHEJO-BQBZGAKWSA-N 0.000 description 1
- GLACUWHUYFBSPJ-FJXKBIBVSA-N Gly-Pro-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN GLACUWHUYFBSPJ-FJXKBIBVSA-N 0.000 description 1
- YABRDIBSPZONIY-BQBZGAKWSA-N Gly-Ser-Met Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O YABRDIBSPZONIY-BQBZGAKWSA-N 0.000 description 1
- LCRDMSSAKLTKBU-ZDLURKLDSA-N Gly-Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)CN LCRDMSSAKLTKBU-ZDLURKLDSA-N 0.000 description 1
- LLWQVJNHMYBLLK-CDMKHQONSA-N Gly-Thr-Phe Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LLWQVJNHMYBLLK-CDMKHQONSA-N 0.000 description 1
- UMRIXLHPZZIOML-OALUTQOASA-N Gly-Trp-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)NC(=O)CN UMRIXLHPZZIOML-OALUTQOASA-N 0.000 description 1
- GWCJMBNBFYBQCV-XPUUQOCRSA-N Gly-Val-Ala Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O GWCJMBNBFYBQCV-XPUUQOCRSA-N 0.000 description 1
- DKJWUIYLMLUBDX-XPUUQOCRSA-N Gly-Val-Cys Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CS)C(=O)O DKJWUIYLMLUBDX-XPUUQOCRSA-N 0.000 description 1
- 244000068988 Glycine max Species 0.000 description 1
- 235000010469 Glycine max Nutrition 0.000 description 1
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 1
- FYVHHKMHFPMBBG-GUBZILKMSA-N His-Gln-Asp Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N FYVHHKMHFPMBBG-GUBZILKMSA-N 0.000 description 1
- NNBWMLHQXBTIIT-HVTMNAMFSA-N His-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CN=CN1)N NNBWMLHQXBTIIT-HVTMNAMFSA-N 0.000 description 1
- VYUXYMRNGALHEA-DLOVCJGASA-N His-Leu-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O VYUXYMRNGALHEA-DLOVCJGASA-N 0.000 description 1
- 108700020129 Human immunodeficiency virus 1 p31 integrase Proteins 0.000 description 1
- 206010021143 Hypoxia Diseases 0.000 description 1
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 1
- 241000880493 Leptailurus serval Species 0.000 description 1
- WNGVUZWBXZKQES-YUMQZZPRSA-N Leu-Ala-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O WNGVUZWBXZKQES-YUMQZZPRSA-N 0.000 description 1
- HASRFYOMVPJRPU-SRVKXCTJSA-N Leu-Arg-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(O)=O)C(O)=O HASRFYOMVPJRPU-SRVKXCTJSA-N 0.000 description 1
- YOKVEHGYYQEQOP-QWRGUYRKSA-N Leu-Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YOKVEHGYYQEQOP-QWRGUYRKSA-N 0.000 description 1
- XVZCXCTYGHPNEM-UHFFFAOYSA-N Leu-Leu-Pro Natural products CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(O)=O XVZCXCTYGHPNEM-UHFFFAOYSA-N 0.000 description 1
- BMVFXOQHDQZAQU-DCAQKATOSA-N Leu-Pro-Asp Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)O)C(=O)O)N BMVFXOQHDQZAQU-DCAQKATOSA-N 0.000 description 1
- IWMJFLJQHIDZQW-KKUMJFAQSA-N Leu-Ser-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IWMJFLJQHIDZQW-KKUMJFAQSA-N 0.000 description 1
- SBANPBVRHYIMRR-UHFFFAOYSA-N Leu-Ser-Pro Natural products CC(C)CC(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O SBANPBVRHYIMRR-UHFFFAOYSA-N 0.000 description 1
- WFCKERTZVCQXKH-KBPBESRZSA-N Leu-Tyr-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O WFCKERTZVCQXKH-KBPBESRZSA-N 0.000 description 1
- GAOJCVKPIGHTGO-UWVGGRQHSA-N Lys-Arg-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O GAOJCVKPIGHTGO-UWVGGRQHSA-N 0.000 description 1
- YNNPKXBBRZVIRX-IHRRRGAJSA-N Lys-Arg-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O YNNPKXBBRZVIRX-IHRRRGAJSA-N 0.000 description 1
- DUTMKEAPLLUGNO-JYJNAYRXSA-N Lys-Glu-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O DUTMKEAPLLUGNO-JYJNAYRXSA-N 0.000 description 1
- OVAOHZIOUBEQCJ-IHRRRGAJSA-N Lys-Leu-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O OVAOHZIOUBEQCJ-IHRRRGAJSA-N 0.000 description 1
- CFOLERIRBUAYAD-HOCLYGCPSA-N Lys-Trp-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)NCC(O)=O CFOLERIRBUAYAD-HOCLYGCPSA-N 0.000 description 1
- WPTDJKDGICUFCP-XUXIUFHCSA-N Met-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CCSC)N WPTDJKDGICUFCP-XUXIUFHCSA-N 0.000 description 1
- KSIPKXNIQOWMIC-RCWTZXSCSA-N Met-Thr-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCCNC(N)=N KSIPKXNIQOWMIC-RCWTZXSCSA-N 0.000 description 1
- 102000016943 Muramidase Human genes 0.000 description 1
- 108010014251 Muramidase Proteins 0.000 description 1
- 108010062010 N-Acetylmuramoyl-L-alanine Amidase Proteins 0.000 description 1
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 1
- XZFYRXDAULDNFX-UHFFFAOYSA-N N-L-cysteinyl-L-phenylalanine Natural products SCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XZFYRXDAULDNFX-UHFFFAOYSA-N 0.000 description 1
- 108010079364 N-glycylalanine Proteins 0.000 description 1
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 1
- 239000001888 Peptone Substances 0.000 description 1
- 108010080698 Peptones Proteins 0.000 description 1
- BJEYSVHMGIJORT-NHCYSSNCSA-N Phe-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 BJEYSVHMGIJORT-NHCYSSNCSA-N 0.000 description 1
- DDYIRGBOZVKRFR-AVGNSLFASA-N Phe-Asp-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N DDYIRGBOZVKRFR-AVGNSLFASA-N 0.000 description 1
- FIRWJEJVFFGXSH-RYUDHWBXSA-N Phe-Glu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 FIRWJEJVFFGXSH-RYUDHWBXSA-N 0.000 description 1
- SCKXGHWQPPURGT-KKUMJFAQSA-N Phe-Lys-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O SCKXGHWQPPURGT-KKUMJFAQSA-N 0.000 description 1
- OXKJSGGTHFMGDT-UFYCRDLUSA-N Phe-Phe-Arg Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O)C1=CC=CC=C1 OXKJSGGTHFMGDT-UFYCRDLUSA-N 0.000 description 1
- GLJZDMZJHFXJQG-BZSNNMDCSA-N Phe-Ser-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GLJZDMZJHFXJQG-BZSNNMDCSA-N 0.000 description 1
- UMIHVJQSXFWWMW-JBACZVJFSA-N Phe-Trp-Gln Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N UMIHVJQSXFWWMW-JBACZVJFSA-N 0.000 description 1
- ISWSIDIOOBJBQZ-UHFFFAOYSA-N Phenol Chemical compound OC1=CC=CC=C1 ISWSIDIOOBJBQZ-UHFFFAOYSA-N 0.000 description 1
- 102000013566 Plasminogen Human genes 0.000 description 1
- 108010051456 Plasminogen Proteins 0.000 description 1
- XZGWNSIRZIUHHP-SRVKXCTJSA-N Pro-Arg-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H]1CCCN1 XZGWNSIRZIUHHP-SRVKXCTJSA-N 0.000 description 1
- LGSANCBHSMDFDY-GARJFASQSA-N Pro-Glu-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCC(=O)O)C(=O)N2CCC[C@@H]2C(=O)O LGSANCBHSMDFDY-GARJFASQSA-N 0.000 description 1
- FKYKZHOKDOPHSA-DCAQKATOSA-N Pro-Leu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O FKYKZHOKDOPHSA-DCAQKATOSA-N 0.000 description 1
- SNGZLPOXVRTNMB-LPEHRKFASA-N Pro-Ser-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CO)C(=O)N2CCC[C@@H]2C(=O)O SNGZLPOXVRTNMB-LPEHRKFASA-N 0.000 description 1
- VBZXFFYOBDLLFE-HSHDSVGOSA-N Pro-Trp-Thr Chemical compound N([C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H]([C@H](O)C)C(O)=O)C(=O)[C@@H]1CCCN1 VBZXFFYOBDLLFE-HSHDSVGOSA-N 0.000 description 1
- LZHHZYDPMZEMRX-STQMWFEESA-N Pro-Tyr-Gly Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O LZHHZYDPMZEMRX-STQMWFEESA-N 0.000 description 1
- 241000589517 Pseudomonas aeruginosa Species 0.000 description 1
- 108010003201 RGH 0205 Proteins 0.000 description 1
- LVVBAKCGXXUHFO-ZLUOBGJFSA-N Ser-Ala-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O LVVBAKCGXXUHFO-ZLUOBGJFSA-N 0.000 description 1
- SRTCFKGBYBZRHA-ACZMJKKPSA-N Ser-Ala-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SRTCFKGBYBZRHA-ACZMJKKPSA-N 0.000 description 1
- HJEBZBMOTCQYDN-ACZMJKKPSA-N Ser-Glu-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HJEBZBMOTCQYDN-ACZMJKKPSA-N 0.000 description 1
- XXXAXOWMBOKTRN-XPUUQOCRSA-N Ser-Gly-Val Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O XXXAXOWMBOKTRN-XPUUQOCRSA-N 0.000 description 1
- FUMGHWDRRFCKEP-CIUDSAMLSA-N Ser-Leu-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O FUMGHWDRRFCKEP-CIUDSAMLSA-N 0.000 description 1
- QMCDMHWAKMUGJE-IHRRRGAJSA-N Ser-Phe-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O QMCDMHWAKMUGJE-IHRRRGAJSA-N 0.000 description 1
- SRSPTFBENMJHMR-WHFBIAKZSA-N Ser-Ser-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SRSPTFBENMJHMR-WHFBIAKZSA-N 0.000 description 1
- PMTWIUBUQRGCSB-FXQIFTODSA-N Ser-Val-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O PMTWIUBUQRGCSB-FXQIFTODSA-N 0.000 description 1
- JGUWRQWULDWNCM-FXQIFTODSA-N Ser-Val-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O JGUWRQWULDWNCM-FXQIFTODSA-N 0.000 description 1
- HSWXBJCBYSWBPT-GUBZILKMSA-N Ser-Val-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CO)C(C)C)C(O)=O HSWXBJCBYSWBPT-GUBZILKMSA-N 0.000 description 1
- 206010041925 Staphylococcal infections Diseases 0.000 description 1
- 241000191967 Staphylococcus aureus Species 0.000 description 1
- 241000049036 Streptomyces albiaxialis Species 0.000 description 1
- 241000327287 Streptomyces flavofuscus Species 0.000 description 1
- 241000201081 Streptomyces maritimus Species 0.000 description 1
- 108010053950 Teicoplanin Proteins 0.000 description 1
- 239000004098 Tetracycline Substances 0.000 description 1
- UKBSDLHIKIXJKH-HJGDQZAQSA-N Thr-Arg-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O UKBSDLHIKIXJKH-HJGDQZAQSA-N 0.000 description 1
- AQAMPXBRJJWPNI-JHEQGTHGSA-N Thr-Gly-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AQAMPXBRJJWPNI-JHEQGTHGSA-N 0.000 description 1
- ABWNZPOIUJMNKT-IXOXFDKPSA-N Thr-Phe-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O ABWNZPOIUJMNKT-IXOXFDKPSA-N 0.000 description 1
- CYCGARJWIQWPQM-YJRXYDGGSA-N Thr-Tyr-Ser Chemical compound C[C@@H](O)[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CO)C([O-])=O)CC1=CC=C(O)C=C1 CYCGARJWIQWPQM-YJRXYDGGSA-N 0.000 description 1
- CURFABYITJVKEW-QTKMDUPCSA-N Thr-Val-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N)O CURFABYITJVKEW-QTKMDUPCSA-N 0.000 description 1
- GWBWCGITOYODER-YTQUADARSA-N Trp-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N GWBWCGITOYODER-YTQUADARSA-N 0.000 description 1
- ZPZNQAZHMCLTOA-PXDAIIFMSA-N Trp-Tyr-Ile Chemical compound C([C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(O)=O)NC(=O)[C@@H](N)CC=1C2=CC=CC=C2NC=1)C1=CC=C(O)C=C1 ZPZNQAZHMCLTOA-PXDAIIFMSA-N 0.000 description 1
- KSVMDJJCYKIXTK-IGNZVWTISA-N Tyr-Ala-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=C(O)C=C1 KSVMDJJCYKIXTK-IGNZVWTISA-N 0.000 description 1
- FMOSEWZYZPMJAL-KKUMJFAQSA-N Tyr-Glu-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N FMOSEWZYZPMJAL-KKUMJFAQSA-N 0.000 description 1
- LFCQXIXJQXWZJI-BZSNNMDCSA-N Tyr-His-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)N)O LFCQXIXJQXWZJI-BZSNNMDCSA-N 0.000 description 1
- QFXVAFIHVWXXBJ-AVGNSLFASA-N Tyr-Ser-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O QFXVAFIHVWXXBJ-AVGNSLFASA-N 0.000 description 1
- UUBKSZNKJUJQEJ-JRQIVUDYSA-N Tyr-Thr-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N)O UUBKSZNKJUJQEJ-JRQIVUDYSA-N 0.000 description 1
- PAPWZOJOLKZEFR-AVGNSLFASA-N Val-Arg-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N PAPWZOJOLKZEFR-AVGNSLFASA-N 0.000 description 1
- VLOYGOZDPGYWFO-LAEOZQHASA-N Val-Asp-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O VLOYGOZDPGYWFO-LAEOZQHASA-N 0.000 description 1
- DDNIHOWRDOXXPF-NGZCFLSTSA-N Val-Asp-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N DDNIHOWRDOXXPF-NGZCFLSTSA-N 0.000 description 1
- CELJCNRXKZPTCX-XPUUQOCRSA-N Val-Gly-Ala Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O CELJCNRXKZPTCX-XPUUQOCRSA-N 0.000 description 1
- DJEVQCWNMQOABE-RCOVLWMOSA-N Val-Gly-Asp Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)O)N DJEVQCWNMQOABE-RCOVLWMOSA-N 0.000 description 1
- PIFJAFRUVWZRKR-QMMMGPOBSA-N Val-Gly-Gly Chemical compound CC(C)[C@H]([NH3+])C(=O)NCC(=O)NCC([O-])=O PIFJAFRUVWZRKR-QMMMGPOBSA-N 0.000 description 1
- KISFXYYRKKNLOP-IHRRRGAJSA-N Val-Phe-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)O)N KISFXYYRKKNLOP-IHRRRGAJSA-N 0.000 description 1
- AIWLHFZYOUUJGB-UFYCRDLUSA-N Val-Phe-Tyr Chemical compound C([C@H](NC(=O)[C@@H](N)C(C)C)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 AIWLHFZYOUUJGB-UFYCRDLUSA-N 0.000 description 1
- NHXZRXLFOBFMDM-AVGNSLFASA-N Val-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)C(C)C NHXZRXLFOBFMDM-AVGNSLFASA-N 0.000 description 1
- GBIUHAYJGWVNLN-UHFFFAOYSA-N Val-Ser-Pro Natural products CC(C)C(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O GBIUHAYJGWVNLN-UHFFFAOYSA-N 0.000 description 1
- MNSSBIHFEUUXNW-RCWTZXSCSA-N Val-Thr-Arg Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N MNSSBIHFEUUXNW-RCWTZXSCSA-N 0.000 description 1
- RTJPAGFXOWEBAI-SRVKXCTJSA-N Val-Val-Arg Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N RTJPAGFXOWEBAI-SRVKXCTJSA-N 0.000 description 1
- 108010059993 Vancomycin Proteins 0.000 description 1
- 238000007792 addition Methods 0.000 description 1
- 108010024078 alanyl-glycyl-serine Proteins 0.000 description 1
- 108010069490 alanyl-glycyl-seryl-glutamic acid Proteins 0.000 description 1
- 238000003277 amino acid sequence analysis Methods 0.000 description 1
- 238000000137 annealing Methods 0.000 description 1
- 230000000844 anti-bacterial effect Effects 0.000 description 1
- 108010068380 arginylarginine Proteins 0.000 description 1
- 108010060035 arginylproline Proteins 0.000 description 1
- 108010038633 aspartylglutamate Proteins 0.000 description 1
- 108010047857 aspartylglycine Proteins 0.000 description 1
- 239000011324 bead Substances 0.000 description 1
- 230000002457 bidirectional effect Effects 0.000 description 1
- 239000011942 biocatalyst Substances 0.000 description 1
- 208000029028 brain injury Diseases 0.000 description 1
- 229940095731 candida albicans Drugs 0.000 description 1
- 238000007036 catalytic synthesis reaction Methods 0.000 description 1
- 230000030833 cell death Effects 0.000 description 1
- DDTDNCYHLGRFBM-YZEKDTGTSA-N chembl2367892 Chemical compound CC(=O)N[C@H]1[C@@H](O)[C@H](O)[C@H](CO)O[C@H]1O[C@@H]([C@H]1C(N[C@@H](C2=CC(O)=CC(O[C@@H]3[C@H]([C@H](O)[C@H](O)[C@@H](CO)O3)O)=C2C=2C(O)=CC=C(C=2)[C@@H](NC(=O)[C@@H]2NC(=O)[C@@H]3C=4C=C(O)C=C(C=4)OC=4C(O)=CC=C(C=4)[C@@H](N)C(=O)N[C@H](CC=4C=C(Cl)C(O5)=CC=4)C(=O)N3)C(=O)N1)C(O)=O)=O)C(C=C1Cl)=CC=C1OC1=C(O[C@H]3[C@H]([C@@H](O)[C@H](O)[C@H](CO)O3)NC(C)=O)C5=CC2=C1 DDTDNCYHLGRFBM-YZEKDTGTSA-N 0.000 description 1
- YTRQFSDWAXHJCC-UHFFFAOYSA-N chloroform;phenol Chemical compound ClC(Cl)Cl.OC1=CC=CC=C1 YTRQFSDWAXHJCC-UHFFFAOYSA-N 0.000 description 1
- 230000000295 complement effect Effects 0.000 description 1
- 230000008878 coupling Effects 0.000 description 1
- 238000010168 coupling process Methods 0.000 description 1
- 238000005859 coupling reaction Methods 0.000 description 1
- 230000007423 decrease Effects 0.000 description 1
- 238000012217 deletion Methods 0.000 description 1
- 230000037430 deletion Effects 0.000 description 1
- 238000006471 dimerization reaction Methods 0.000 description 1
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 1
- ZPWVASYFFYYZEW-UHFFFAOYSA-L dipotassium hydrogen phosphate Chemical compound [K+].[K+].OP([O-])([O-])=O ZPWVASYFFYYZEW-UHFFFAOYSA-L 0.000 description 1
- 108010054813 diprotin B Proteins 0.000 description 1
- 239000012154 double-distilled water Substances 0.000 description 1
- 238000009509 drug development Methods 0.000 description 1
- 238000012869 ethanol precipitation Methods 0.000 description 1
- 231100000318 excitotoxic Toxicity 0.000 description 1
- 230000003492 excitotoxic effect Effects 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 239000012467 final product Substances 0.000 description 1
- 238000013467 fragmentation Methods 0.000 description 1
- 238000006062 fragmentation reaction Methods 0.000 description 1
- 108010006664 gamma-glutamyl-glycyl-glycine Proteins 0.000 description 1
- 238000001502 gel electrophoresis Methods 0.000 description 1
- 238000012239 gene modification Methods 0.000 description 1
- 230000002068 genetic effect Effects 0.000 description 1
- 239000011521 glass Substances 0.000 description 1
- 239000008103 glucose Substances 0.000 description 1
- 108010080575 glutamyl-aspartyl-alanine Proteins 0.000 description 1
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 1
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 1
- 108010027668 glycyl-alanyl-valine Proteins 0.000 description 1
- 108010078326 glycyl-glycyl-valine Proteins 0.000 description 1
- 108010089804 glycyl-threonine Proteins 0.000 description 1
- 108010087823 glycyltyrosine Proteins 0.000 description 1
- PCHJSUWPFVWCPO-UHFFFAOYSA-N gold Chemical compound [Au] PCHJSUWPFVWCPO-UHFFFAOYSA-N 0.000 description 1
- 239000010931 gold Substances 0.000 description 1
- 229910052737 gold Inorganic materials 0.000 description 1
- 239000001963 growth medium Substances 0.000 description 1
- 238000005658 halogenation reaction Methods 0.000 description 1
- 238000012165 high-throughput sequencing Methods 0.000 description 1
- 108010036413 histidylglycine Proteins 0.000 description 1
- 230000007124 immune defense Effects 0.000 description 1
- 208000028867 ischemia Diseases 0.000 description 1
- 238000002955 isolation Methods 0.000 description 1
- 108010057821 leucylproline Proteins 0.000 description 1
- 231100000053 low toxicity Toxicity 0.000 description 1
- 229960000274 lysozyme Drugs 0.000 description 1
- 239000004325 lysozyme Substances 0.000 description 1
- 235000010335 lysozyme Nutrition 0.000 description 1
- 108010038320 lysylphenylalanine Proteins 0.000 description 1
- 208000015688 methicillin-resistant staphylococcus aureus infectious disease Diseases 0.000 description 1
- 230000011987 methylation Effects 0.000 description 1
- 238000007069 methylation reaction Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000000324 neuroprotective effect Effects 0.000 description 1
- 239000002547 new drug Substances 0.000 description 1
- 230000009871 nonspecific binding Effects 0.000 description 1
- 230000003647 oxidation Effects 0.000 description 1
- 238000007254 oxidation reaction Methods 0.000 description 1
- 230000037361 pathway Effects 0.000 description 1
- 235000019319 peptone Nutrition 0.000 description 1
- JTJMJGYZQZDUJJ-UHFFFAOYSA-N phencyclidine Chemical compound C1CCCCN1C1(C=2C=CC=CC=2)CCCCC1 JTJMJGYZQZDUJJ-UHFFFAOYSA-N 0.000 description 1
- 238000002205 phenol-chloroform extraction Methods 0.000 description 1
- 150000002989 phenols Chemical class 0.000 description 1
- 108010089198 phenylalanyl-prolyl-arginine Proteins 0.000 description 1
- 108010073025 phenylalanylphenylalanine Proteins 0.000 description 1
- 108010051242 phenylalanylserine Proteins 0.000 description 1
- 108010083476 phenylalanyltryptophan Proteins 0.000 description 1
- 238000007747 plating Methods 0.000 description 1
- 229930001118 polyketide hybrid Natural products 0.000 description 1
- 125000003308 polyketide hybrid group Chemical group 0.000 description 1
- 210000004896 polypeptide structure Anatomy 0.000 description 1
- 239000002244 precipitate Substances 0.000 description 1
- 230000008569 process Effects 0.000 description 1
- 108020001580 protein domains Proteins 0.000 description 1
- 230000005855 radiation Effects 0.000 description 1
- 206010039073 rheumatoid arthritis Diseases 0.000 description 1
- 230000028043 self proteolysis Effects 0.000 description 1
- 230000035945 sensitivity Effects 0.000 description 1
- 238000002864 sequence alignment Methods 0.000 description 1
- 241000894007 species Species 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 239000006228 supernatant Substances 0.000 description 1
- 230000002194 synthesizing effect Effects 0.000 description 1
- 229960001608 teicoplanin Drugs 0.000 description 1
- 235000019364 tetracycline Nutrition 0.000 description 1
- 150000003522 tetracyclines Chemical class 0.000 description 1
- 229940040944 tetracyclines Drugs 0.000 description 1
- 108010061238 threonyl-glycine Proteins 0.000 description 1
- 108700004896 tripeptide FEG Proteins 0.000 description 1
- 239000012137 tryptone Substances 0.000 description 1
- 241001446247 uncultured actinomycete Species 0.000 description 1
- IBIDRSSEHFLGSD-UHFFFAOYSA-N valinyl-arginine Natural products CC(C)C(N)C(=O)NC(C(O)=O)CCCN=C(N)N IBIDRSSEHFLGSD-UHFFFAOYSA-N 0.000 description 1
- 229960003165 vancomycin Drugs 0.000 description 1
- MYPYJXKWCTUITO-LYRMYLQWSA-N vancomycin Chemical compound O([C@@H]1[C@@H](O)[C@H](O)[C@@H](CO)O[C@H]1OC1=C2C=C3C=C1OC1=CC=C(C=C1Cl)[C@@H](O)[C@H](C(N[C@@H](CC(N)=O)C(=O)N[C@H]3C(=O)N[C@H]1C(=O)N[C@H](C(N[C@@H](C3=CC(O)=CC(O)=C3C=3C(O)=CC=C1C=3)C(O)=O)=O)[C@H](O)C1=CC=C(C(=C1)Cl)O2)=O)NC(=O)[C@@H](CC(C)C)NC)[C@H]1C[C@](C)(N)[C@H](O)[C@H](C)O1 MYPYJXKWCTUITO-LYRMYLQWSA-N 0.000 description 1
- MYPYJXKWCTUITO-UHFFFAOYSA-N vancomycin Natural products O1C(C(=C2)Cl)=CC=C2C(O)C(C(NC(C2=CC(O)=CC(O)=C2C=2C(O)=CC=C3C=2)C(O)=O)=O)NC(=O)C3NC(=O)C2NC(=O)C(CC(N)=O)NC(=O)C(NC(=O)C(CC(C)C)NC)C(O)C(C=C3Cl)=CC=C3OC3=CC2=CC1=C3OC1OC(CO)C(O)C(O)C1OC1CC(C)(N)C(O)C(C)O1 MYPYJXKWCTUITO-UHFFFAOYSA-N 0.000 description 1
- 150000003952 β-lactams Chemical class 0.000 description 1
Landscapes
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Enzymes And Modification Thereof (AREA)
- Preparation Of Compounds By Using Micro-Organisms (AREA)
Abstract
Description
Technical field
The invention relates to the field of genetic engineering, in particular to a halogenase gene of a marine Streptomyces and its product, and further to the biosynthetic gene cluster of the product modified by the halogenase.
Background
The marine Streptomyces (Streptomyces xinghaiensis) is a new marine Streptomyces species isolated and identified at Dalian University of Technology ("A marine Streptomyces strain S187 with broad-spectrum antibacterial activity", Chinese invention patent 200710158478.7). The strain shows good activity against drug-resistant Staphylococcus aureus (MRSA) strains and is also active against Pseudomonas aeruginosa, Escherichia coli, Acinetobacter baumannii, and Candida albicans; its activity against Pseudomonas aeruginosa stands out among all the strains isolated. Its 16S rDNA sequence shares 98.1% and 97.5% homology with those of two closely related type strains, S. flavofuscus NRRL B-8036T and S. albiaxialis DSM 41799T, respectively, but genomic DNA-DNA hybridization shows only 31.4% and 46.9% relatedness to these two type strains. In-depth study of the marine Streptomyces revealed a halogenase gene and the complete gene cluster of its modified product.
Halogenases are important tailoring enzymes that modify many antibiotics and play an important role in their activity. Because FADH2-dependent halogenases act as post-assembly modifying enzymes in many antibiotic biosynthetic gene clusters, the genes encoding them are linked to the biosynthetic gene clusters of a variety of antibiotics. During biosynthesis they halogenate intermediates of the synthetic pathway and thereby affect the biological activity of the final product. Used as novel biocatalysts, halogenases can generate new drug molecules for clinical treatment.
Halogenases can modify many classes of antibiotics, including tetracyclines, ansamycins, β-lactams, and glycopeptides. Among these, complestatin (chloropeptin II) is an unusual glycopeptide-type compound that carries halogen modifications. Unlike vancomycin and teicoplanin, it contains no sugar moiety; it is a rigid cross-linked structure formed when a linear non-ribosomal peptide undergoes phenolic oxidative coupling via the non-ribosomal peptide synthetase (NRPS). As early as 1980, complestatin was found to strongly inhibit the alternative pathway of human complement. The complement system is one of the body's important immune defense systems, and its over-activation is associated with many diseases such as rheumatoid arthritis; severe acute respiratory syndrome (SARS) is also linked to complement over-activation. Developing highly effective, low-toxicity anti-complement drugs is therefore of great significance for treating these diseases. Complestatin has also been found to strongly inhibit HIV-1 integrase. In addition, it promotes plasminogen autolysis and can act as a neuroprotective molecule that prevents the excitotoxic cell death caused by acute brain injury such as hypoxia-ischemia. Its many unique biological activities have attracted chemical synthesis efforts, but its total synthesis was not reported until 2010, and the route involves many steps and is difficult. Moreover, the poor solubility of complestatin may also limit its development as a drug, so the development of its analogues is of considerable research interest.
Summary of the invention
The object of the invention is to provide a halogenase gene of a marine Streptomyces, together with its product. The invention further provides the biosynthetic gene cluster of the product modified by the halogenase; its gene sequences share high homology with those of the complestatin biosynthetic gene cluster but have a distinct gene composition of their own.
The invention discloses a nucleotide sequence of a marine Streptomyces halogenase gene, characterized as:
(a) a nucleic acid containing the nucleotide sequence shown in SEQ ID NO. 1; and
(b) a nucleic acid that has at least 75% sequence identity with the nucleic acid of (a) and retains halogenase function.
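The 75% identity threshold in (b) can be illustrated with a minimal Python sketch. The text does not say how identity is computed, so the definition used here (identical residues divided by aligned columns in which neither sequence has a gap) is an assumption, as are the function names `percent_identity` and `meets_identity_threshold`. The inputs are expected to be already aligned by an external tool.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: this is one common definition of identity; the source text
    does not specify which definition applies.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 75.0) -> bool:
    """Return True if the aligned pair reaches the identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy aligned pair, not taken from SEQ ID NO. 1.
    print(percent_identity("ATGC", "ATGA"))
    print(meets_identity_threshold("ATGC", "ATGA"))
```

Identity alone does not establish the second condition in (b); retention of halogenase function has to be shown separately.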
The invention also provides a marine Streptomyces halogenase protein encoded by the gene sequence:
(a) a protein containing the amino acid sequence shown in SEQ ID NO. 2; and
(b) a protein derived from (a) by substitution, deletion, or addition of one or several amino acids in the amino acid sequence of (a) and having halogenase function.
The invention further discloses the gene cluster sequence for the halogenase-modified product of the marine Streptomyces, and a method for isolating that sequence, comprising the following steps:
1) constructing a marine Streptomyces genomic library with inserts of about 35-40 kb in a Fosmid vector;
2) transfecting bacteria with the library, plating, and, once the library has been verified as acceptable, picking single clones from the plates and culturing them in medium;
3) extracting DNA from the cultured single clones, amplifying by PCR, and checking the PCR products to obtain positive clones containing the halogenase gene; then sequencing a positive clone to obtain the full-length halogenase gene and the gene cluster sequence of the modified product;
4) predicting the function of the complestatin-analogue gene cluster of the marine Streptomyces and the structure of the halogenase-modified product, including analysis of the functional domains of the non-ribosomal peptide synthetase proteins and analysis of the sequencing results for the gene cluster encoding the complestatin analogue.
The method for analyzing the marine Streptomyces modified-product gene cluster and predicting the structure of its product comprises:
1) analysis of the sequencing results for the marine Streptomyces halogenase;
2) analysis of the sequences of the halogenase-positive plasmids from the marine Streptomyces Fosmid library;
3) analysis of the protein functional domains of the marine Streptomyces non-ribosomal peptide synthetase;
4) analysis of the sequencing results for the gene cluster encoding the halogenase-modified product in the marine Streptomyces.
The invention builds a Fosmid genomic library and screens it by PCR with conserved probes for the halogenase gene, yielding a new halogenase gene. Compared with the traditional SuperCos vector, Fosmid library construction does not rely on restriction digestion, which avoids restriction-site bias, and the low copy number of the vector makes the clones more stable. Rapid PCR screening of the Fosmid library successfully recovered the full-length halogenase gene and the gene cluster in which it resides. The cluster encodes 11 biosynthetic proteins, including non-ribosomal peptide synthetases, regulatory proteins, and transporter proteins, each showing high homology to known genes of the corresponding type.
Description of drawings:
Figure 1 shows a homology alignment between the marine Streptomyces amino acid sequence and those of proteins with known crystal structures;
Figure 2 shows a phylogenetic tree of the marine Streptomyces halogenase amino acid sequence;
Figure 3 compares the complestatin-like compound gene cluster in the marine Streptomyces with the complestatin gene cluster in S. lavendulae;
Figure 4 shows an alignment of the structural modules of the marine Streptomyces non-ribosomal peptide synthetase (NRPS) with those of S. lavendulae;
Figure 5 shows the predicted primary structure of the marine Streptomyces non-ribosomal peptide synthetase (NRPS);
Figure 6 is the functional annotation table of the biosynthetic gene cluster for the halogenase-modified product of the marine Streptomyces.
Detailed description
The invention is further described below with reference to specific examples.
In the invention, genomic DNA of the marine Streptomyces is sheared with a DNA shearing device (Hydro-Shear 0703, GeneMachine, USA) to give fragments whose main band lies at 35-40 kb. The fragment ends are filled in with Klenow fragment, and the DNA is purified by phenol-chloroform extraction and ethanol precipitation, confirmed by pulsed-field gel electrophoresis, and ligated into the Fosmid vector supplied with the Copycontrol Fosmid Library Production Kit (Epicentre, USA). The ligation product is packaged in vitro with phage packaging proteins (Copycontrol Fosmid Library Production Kit, Epicentre) and transfected into Escherichia coli EPI300. DNA is extracted by alkaline lysis and identified by NotI digestion, and the insert length is checked by pulsed-field gel electrophoresis to judge whether the library is acceptable, thereby constructing the marine Streptomyces genomic library.
The marine Streptomyces genomic library was screened with halogenase-specific primers, finally giving 13 positive clones. Both ends of the inserts of these 13 clones were sequenced, and one clone was selected for complete sequencing. Analysis of the sequence data yielded the marine Streptomyces halogenase gene and the complestatin-like gene cluster whose product it modifies.
Example 1: Isolation of the marine Streptomyces halogenase gene and recovery of the gene cluster for its modified product.
Extraction of marine Streptomyces genomic DNA:
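The strain is cultured in TBS medium (in g/L: tryptone 17, soy peptone 3, sodium chloride 3, glucose 2.5, dipotassium hydrogen phosphate 2.5; pH 7.5) at 28 °C and 150 rpm for 48 hours. More than 10 ml of actinomycete cells are collected and resuspended in 5 ml SET buffer, an appropriate amount of 0.1 mm glass beads is added, and the suspension is vortexed. The cells are digested with 1 ml of 20 mg/ml lysozyme at 37 °C for at least 2 h, then 500 μl of 15 mg/ml proteinase K is added and incubated at 37 °C for 30 min. Next, 1/10 volume of 20% SDS solution is added and incubated at 55 °C for 1 h. Then 1/3 volume of 5 M NaCl solution is added and left at room temperature for 30 min, after which saturated phenol/chloroform is added at a 1:1 volume ratio, mixed, and left at room temperature for 30 min. The mixture is centrifuged at 8000 r/min and 4 °C for 20 min, the protein precipitate is discarded, and an equal volume of isopropanol is added to the supernatant, which is left at room temperature for 30 min. The precipitated DNA is spooled out with a pipette tip or capillary and washed several times with 70% ethanol. Finally, the ethanol is allowed to evaporate and the DNA is dissolved in ddH2O.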
Obtaining the marine Streptomyces halogenase gene sequence:
Because the sequence of the halogenase-encoding gene cluster in the marine Streptomyces genome was unknown, degenerate primers were used to fish out the gene. Published amino acid sequences of actinomycete halogenases were retrieved from the U.S. National Center for Biotechnology Information (NCBI) and aligned with Clustal W. Based on the alignment, degenerate primers were designed at two conserved regions of this class of halogenases, namely the FAD-binding site (GGGXXG) and the tryptophan-residue binding site (GWTWXIP), using the online CODEHOP primer design software (http://bioinfo.weizmann.ac.il/blocks/codehop.html).
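As a rough illustration of locating the two conserved regions, the Python sketch below scans a protein string with regular expressions. The two test fragments are excerpts copied from SEQ ID NO. 2 in this document. The pattern `GGG..G` follows the GGGXXG motif as written; the relaxed pattern `GW.W.IP` is an assumption introduced here so that variants of GWTWXIP at the third position are also found, and the names `MOTIFS` and `find_motifs` are hypothetical.

```python
import re

# Regex forms of the conserved regions; "." stands for any residue (X).
# The relaxed tryptophan-site pattern is an assumption, not from the source.
MOTIFS = {
    "FAD-binding site (GGGXXG)": r"GGG..G",
    "tryptophan-binding site (relaxed GW.W.IP)": r"GW.W.IP",
}


def find_motifs(protein: str) -> list[tuple[str, int, str]]:
    """Return (motif name, 1-based start position, matched residues)."""
    hits = []
    for name, pattern in MOTIFS.items():
        for match in re.finditer(pattern, protein.upper()):
            hits.append((name, match.start() + 1, match.group()))
    return hits


if __name__ == "__main__":
    n_terminal_excerpt = "MTRRVTRGGGMALPDSEEFDVVVVGGGPAGSTLAAL"
    internal_excerpt = "ILSVAFESGWFWYIPLSPDLTS"
    for fragment in (n_terminal_excerpt, internal_excerpt):
        for hit in find_motifs(fragment):
            print(hit)
```

Positions are reported relative to the fragment passed in, so scanning the full-length sequence is needed to get coordinates within SEQ ID NO. 2.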
Table 1. CODEHOP degenerate primers used to amplify the marine Streptomyces halogenase gene
*Note: in the degenerate primers, y = C/T; w = A/T; s = C/G; r = A/G; n = A/C/G/T.
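To show what a degenerate primer stands for, the Python sketch below expands a sequence written with the standard IUPAC nucleotide ambiguity codes into every concrete primer it represents. The example sequences are hypothetical and are not the primers of Table 1; `IUPAC`, `degeneracy`, and `expand` are names chosen for this illustration.

```python
from itertools import product
from math import prod

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


def expand(primer: str) -> list[str]:
    """List every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    example = "ggnggy"  # hypothetical degenerate stretch, for illustration only
    print(degeneracy(example))
    print(expand(example))
```

Checking the degeneracy before ordering primers is a quick way to see how diluted each individual sequence will be in the primer pool.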
After the CODEHOP degenerate primers were synthesized, touchdown PCR was performed with marine Streptomyces genomic DNA as the template to amplify the halogenase gene. Sequencing the PCR product gave about 960 bp of the marine Streptomyces halogenase gene sequence.
The PCR reaction system is shown in Tables 2 and 3:
Table 2. PCR reaction system
*Note 1: Primer F* denotes Halo-F or sinH-F; Primer R* denotes Halo-R or sinH-R.
*Note 2: DMSO is added to reduce secondary structure in DNA with high G+C content so that the DNA denatures fully.
Table 3. Touchdown PCR program
*Note: the touchdown annealing temperature starts at 64 °C and is lowered by 2 °C every 6 cycles until it reaches 56 °C.
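The annealing rule in the note can be written out per cycle with a short Python sketch. The start temperature, step size, cycles per step, and floor come from the note; the total cycle count is not given in the text, so the value of 30 used in the demo is an assumption, and `touchdown_schedule` is a hypothetical helper name.

```python
def touchdown_schedule(total_cycles: int,
                       start: float = 64.0,
                       stop: float = 56.0,
                       step: float = 2.0,
                       cycles_per_step: int = 6) -> list[float]:
    """Annealing temperature (deg C) for each cycle of a touchdown PCR.

    The temperature drops by `step` after every `cycles_per_step` cycles
    and is held at `stop` once that floor is reached.
    """
    temperatures = []
    for cycle in range(total_cycles):
        drops = cycle // cycles_per_step
        temperatures.append(max(stop, start - step * drops))
    return temperatures


if __name__ == "__main__":
    # total_cycles=30 is an assumed value for illustration only.
    for number, temp in enumerate(touchdown_schedule(30), start=1):
        print(f"cycle {number:2d}: anneal at {temp:.0f} C")
```

With these parameters the floor of 56 °C is reached at cycle 25, and any further cycles stay at that temperature.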
Preparation of genomic library inserts:
An appropriate amount of marine Streptomyces genomic DNA is sheared with a DNA shearing device (Hydro-Shear 0703, GeneMachine, USA). The fragmented DNA is separated by pulsed-field gel electrophoresis, and the 35-40 kb fragments are recovered by gel excision while avoiding UV exposure.
End-filling of the DNA fragments and ligation into the Fosmid vector:
The ends of the recovered DNA fragments are filled in with Klenow fragment, and the DNA is purified by phenol-chloroform extraction and ethanol precipitation. The purified fragments are confirmed by pulsed-field gel electrophoresis and ligated into the Fosmid vector supplied with the Copycontrol Fosmid Library Production Kit (Epicentre, USA).
Packaging, transfection, plating, and verification of the library:
The ligation product is packaged in vitro with phage packaging proteins (Copycontrol Fosmid Library Production Kit, Epicentre), transfected into Escherichia coli EPI300, and plated on LB plates containing chloramphenicol. Twenty-four single clones are picked at random, inoculated, and cultured overnight. DNA is extracted by alkaline lysis and identified by NotI digestion, and the insert length is checked by pulsed-field gel electrophoresis. Whether the library is acceptable is judged from the plating results and the insert lengths.
Screening of the genomic library:
Because degenerate primers have some tendency to bind template DNA non-specifically and to form dimers, a pair of specific primers was redesigned for screening the genomic library. The PCR product is about 580 bp, which improves the sensitivity and accuracy of screening and also shortens the PCR amplification time.
Table 4. Specific primers used to screen the marine Streptomyces genomic library
Single clones are picked with sterile toothpicks into 1.5 mL of LB medium containing 12.5 μg/mL chloramphenicol and cultured overnight at 37 °C with shaking. An aliquot of each culture is mixed with glycerol to a final concentration of 20% and stored at -70 °C. A further 25 μL of each culture is taken for pooling: every 24 clones are combined into a small pool, and every 8 small pools are combined into a large pool. The pooled samples are boiled at 99 °C for 10 min to lyse the cells, an aliquot is used as the template for touchdown PCR with the reaction system and program of Tables 2 and 3, and the PCR products are checked by gel electrophoresis.
Of the 15 large pools, 10 gave a PCR product. The 8 small pools under each positive large pool were then tested by PCR, followed by the 24 individual clones in each positive small pool, finally giving 13 positive clones. Both ends of the inserts of these 13 clones were sequenced, and one clone was selected for complete sequencing.
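The three-round pooled screen can be sketched in Python as follows. The pool sizes (24 clones per small pool, 8 small pools per large pool) and the 15 large pools come from the text; the total of 15 × 8 × 24 = 2880 clones assumes every pool was full, which the text does not state. The positive clone indices in the demo are invented, and `pool_address` and `screen` are hypothetical names; the inner `pcr` function merely stands in for one PCR reaction on a pooled or single template.

```python
CLONES_PER_SMALL_POOL = 24
SMALL_POOLS_PER_LARGE_POOL = 8
CLONES_PER_LARGE_POOL = CLONES_PER_SMALL_POOL * SMALL_POOLS_PER_LARGE_POOL


def pool_address(clone_index: int) -> tuple[int, int, int]:
    """Map a 0-based clone index to (large pool, small pool, position)."""
    small_pool, position = divmod(clone_index, CLONES_PER_SMALL_POOL)
    large_pool, small_in_large = divmod(small_pool, SMALL_POOLS_PER_LARGE_POOL)
    return large_pool, small_in_large, position


def screen(positive_clones, n_large_pools: int) -> tuple[list[int], int]:
    """Simulate the hierarchical screen; return (clones found, PCR reactions)."""
    positive = set(positive_clones)

    def pcr(clones) -> bool:
        # Stand-in for one PCR: positive if any clone in the template is positive.
        return any(clone in positive for clone in clones)

    found, reactions = [], 0
    for large in range(n_large_pools):
        large_start = large * CLONES_PER_LARGE_POOL
        reactions += 1
        if not pcr(range(large_start, large_start + CLONES_PER_LARGE_POOL)):
            continue
        for small in range(SMALL_POOLS_PER_LARGE_POOL):
            small_start = large_start + small * CLONES_PER_SMALL_POOL
            reactions += 1
            if not pcr(range(small_start, small_start + CLONES_PER_SMALL_POOL)):
                continue
            for clone in range(small_start, small_start + CLONES_PER_SMALL_POOL):
                reactions += 1
                if pcr([clone]):
                    found.append(clone)
    return found, reactions


if __name__ == "__main__":
    hits, reactions = screen(positive_clones=[5, 200], n_large_pools=15)
    print(hits, reactions, "reactions instead of", 15 * CLONES_PER_LARGE_POOL)
```

Pooling pays off when positives are rare: only the pools that test positive are opened up, so the number of reactions stays far below one per clone.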
Example 2: Analysis of sequencing results for the gene cluster encoding the complestatin-like compound in the marine Streptomyces
Sequencing results and analysis of the marine Streptomyces halogenase
Halogenase gene sequence
>sinH SEQ ID NO. 1
ATGACCCGCCGGGTGACAAGGGGAGGAGGGATGGCCTTGCCGGATTCCGAGGAATTCGATGTGGTGGTCGTCGGTGGAGGGCCCGCCGGATCGACGCTGGCCGCGTTGACGGCCATGCAGGGACACCGGGTGCTGGTCCTGGAGAAGGAGTTCTTCCCCCGTCACCAGATCGGGGAGTCGCTCCTGCCGGCCACCGTGCACGGCGTGTGCCGGCTGACCGGCGTGGCGGACGAGCTCGCCGCCGCGGGCTTCCCGCGCAAGCGCGGCGGCACGTTCAAGTGGGGCGCCAACCCCGAGCCGTGGACCTTCTCCTTCTCCGTCTCCCCGCGCATGACCGGGCCGACGTCCTACGCCTACCAGGTCGAGCGGGCCAAGTTCGACGAGATCCTGCTCAACAACGCCCGCCGGGTGGGCGCCGAGGTGCGCGAGGGCTGTGCCGCCGTCGACGTCGTCGAGGACGGGGAGCGGGTCCGGGGCGTCCGGTACACCGACGCCGACGGCCGCGAGCACCGGGCGTCGGCCACGTTCGTCGTGGACGCCTCCGGCAACGGAAGCCGGCTGTACCGGCGGGTGGGCGGAACCCGGGAGTACTCGGAGTTCTTCCGCAGCCTGGCCCTGTACGGCTACTTCGAGGGCGGCAAGCGGCTGCCGGAACCGAACTCGGGCAACATCCTGTCGGTGGCGTTCGAGAGCGGCTGGTTCTGGTACATCCCGCTGAGTCCGGACCTCACCAGCGTCGGTGCCGTGGTCCGCCGGGAGATGGCCGGCAAGATCCGGGGCGACTCCGGCAAGGCGCTGGCGGCGCTCATCGCCGAGTGCCCCCTGATCTCCGAGTACCTGGCGGACGCGCGGCGGGTCACCGAGGGCCCGTACGGGAAGCTCCGGGTCCGCAAGGACTACTCGTACCACCACACGACCTTCTCGCGGCCCGGCATGATCCTGGTCGGCGACGCTGCCTGCTTCGTGGACCCGGTGTTCTCCTCCGGCGTCCACCTGGCCACCTACAGCGCCCTGCTGGCGGCCCGCTCCATCAACAGCGTGCTCGCCGGGCTGGTCGGCGAGGACCGGGCCCTGCGGGAGTTCGAGTCCCGTTACCGCCGCGAGTACGGCGTCTTCTACGAGTTCCTGCTCTCCTTCTACGAGATGCACCAGGACGAGAACTCCTACTTCTGGCAGGCCAAGAAGGTCACCCGGGCCAACCGCCCGGAGCTGGAGTCGTTCGTCGAGCTCATCGGCGGGGTCTCCTCCGGCGAGCGGGTCCTGACGGACGCCGAGGTGCTGGCGAAGCGCTTCAGCTCGGGCTCCGCGGAGTTCGCCGCGGCCGTCGACGAACTCGCGGGCAGCGAGGACGGCAGCATGGTGCCGCTGTTCAAGTCCTCGGTGGTGCGCGAGGTCATGCAGGAGGGCGGCCAGGTCCAGATGCGCGCCCTGCTCGGCGAGGACGCCGAACCCGAGGCCCCCCTGTCCGCGGACGGCCTGGTGCCGTCCCCCGACGGCATGTTCTGGCTGCCCGCCCAGGGCACCGGCGAGTAG 
Amino acid sequence encoded by the halogenase gene
>SinH SEQ ID NO. 2
MTRRVTRGGGMALPDSEEFDVVVVGGGPAGSTLAALTAMQGHRVLVLEKEFFPRHQIGESLLPATVHGVCRLTGVADELAAAGFPRKRGGTFKWGANPEPWTFSFSVSPRMTGPTYAYQVERAKFDEILLNNARRVGAEVREGCAAVDVVEDGERVRGVRYTDADGREHRASATFVVDASGNGSRLYRRVGGTREYSEFFRSLALYGYFEGGKRLPEPNSGNILSVAFESGWFWYIPLSPDLTSVGAVVRREMAGKIRGDSGKALAALIAECPLISEYLADARRVTEGPYGKLRVRKDYSYHHTTFSRPGMILVGDAACFVDPVFSSGVHLATYSALLAARSINSVLGLVGEDRALREFESRYRREYGVFYEFLLSFYEMHQDENSYFWQAKKVTRANRPELESFVELIGGVSSGERVLTDAEVLAKRFSSGSAEFAAAVDELAGSEDGSMVPLFKSSVVREVMQEGGQVQMRALLGEDAEPEAPLSADGLVPSPDGMFWLPAQGTGE
The sequencing results were analyzed. Open reading frames (ORFs) in the DNA sequence were identified with the ORF Finder tool provided by the U.S. National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/gorf/gorf.html). DNA homology was analyzed by comparing the sequence against the DNA databases with the NCBI BLAST engine (http://www.ncbi.nlm.nih.gov/BLAST/).
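The ORF identification step described above can be illustrated with a minimal sketch. This is not the NCBI ORF Finder implementation; it assumes ATG as the only start codon (Streptomyces genes may also use alternative start codons such as GTG) and the standard stop codons, and the names `find_orfs`, `reverse_complement`, and `min_codons` are hypothetical.

```python
# Minimal six-frame ORF scan (illustrative sketch, not the NCBI ORF Finder).
# Assumption: ORFs start at ATG and end at the first in-frame stop codon.
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """Return (strand, start, end, orf_sequence) tuples.

    start/end are 0-based, end-exclusive offsets on the scanned strand
    (for strand "-" they refer to the reverse-complemented sequence).
    min_codons counts the start and stop codons.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs
```

Calling `find_orfs` on SEQ ID NO. 1 with a suitable `min_codons` threshold would list candidate coding regions on both strands; the candidates still need to be checked by homology search as described in the text.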
For the amino acid sequence analysis, homology of the target protein was analyzed by comparing its amino acid sequence against the amino acid sequence databases with the NCBI BLAST engine (http://www.ncbi.nlm.nih.gov/BLAST/). Conserved domains of the target protein were analyzed with the NCBI Conserved Domain Database search (http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi). Amino acid sequence alignments between different proteins were performed with the BioEdit software.
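A pairwise alignment of this kind is usually summarized as a percent identity. The sketch below shows one common way to compute it from two already-aligned sequences. It is an assumption that identity is counted over ungapped columns only; the patent does not state which denominator was used, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two aligned sequences of equal length.

    Assumption: columns where either sequence has a gap ('-') are skipped,
    so the denominator is the number of ungapped columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so values computed this way are comparable only when the same convention is used throughout.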
The marine Streptomyces halogenase SinH was compared with proteins of known crystal structure (Figure 1). Its amino acid sequence is very similar to those of FADH2-dependent halogenases: amino acids 30-35 form the FAD binding site (GGGXXG), and amino acids 243-249 form the tryptophan residue binding site (GWTWXIP). This confirms that it is a FADH2-dependent halogenase gene fragment. Homology comparison with the amino acid sequences of halogenases of known crystal structure shows that the marine Streptomyces halogenase has the highest homology, 75%, with the halogenase CndH from the chondrochloren biosynthetic gene cluster. Its homology with the halogenase CmlS from the chloramphenicol biosynthetic gene cluster is lower, at 55%, and its homology with the more extensively studied PrnA subclass of halogenases is lower still, at only 37.6%. The halogenase sequence of marine Streptomyces S187 therefore differs from the halogenase sequences of known crystal structure, and it is presumed to be a new halogenase gene.
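Conserved motifs such as those named above can be located in a protein sequence with a simple pattern scan. The following is a minimal sketch in which `X` is treated as a wildcard for any residue; the function name `find_motif` and the toy input in the usage note are hypothetical, and the reported positions depend on the numbering scheme used (raw sequence versus alignment columns).

```python
import re


def find_motif(protein, motif):
    """Return 1-based (start, end) positions of every motif match.

    'X' in the motif matches any single residue. A lookahead is used so
    that overlapping matches are reported as well.
    """
    pattern = "".join("." if c == "X" else re.escape(c) for c in motif.upper())
    regex = re.compile(f"(?=({pattern}))")
    return [
        (m.start() + 1, m.start() + len(motif))
        for m in regex.finditer(protein.upper())
    ]
```

For example, `find_motif("AAGGGPAGAA", "GGGXXG")` on this made-up fragment returns `[(3, 8)]`.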
The phylogenetic tree of the marine Streptomyces halogenase amino acid sequence is shown in Figure 2. The halogenases selected are expression products of halogenase genes from recently reported antibiotic biosynthetic gene clusters; proteins with a crystal structure are shown in black. SinH is most closely related to the protein ComH from the complestatin gene cluster.
Example 3
The gene of Example 2 and its product were tested and analyzed as follows:
(1) Sequence analysis of the halogenase-positive plasmid from the marine Streptomyces fosmid library:
The sequencing results of the halogenase-positive fosmid plasmid were combined with the 454 and Solexa high-throughput sequencing of the marine Streptomyces described above, and the sequence was analyzed. Open reading frames (ORFs) in the DNA sequence were identified with the ORF Finder tool provided by the U.S. National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/gorf/gorf.html). DNA homology was analyzed by comparing the sequence against the DNA databases with the NCBI BLAST engine (http://www.ncbi.nlm.nih.gov/BLAST/).
The 42.81 kb of sequence was annotated (Table 2) and compared with the complestatin gene cluster of S. lavendulae (Figure 3). From this comparison it can be predicted that the marine Streptomyces contains a gene cluster highly homologous to the gene cluster for complestatin, a compound with anti-complement activity.
Gene sequence for the compound modified by the marine Streptomyces halogenase
>Complestatin-like gene cluster from S. xinghaiensis SEQ ID NO. 3
ATGGTCGTCGAAGTCGAGATCAGCTCGCTGTCGACGGCCGGCTCGCCGCGGACCTCCGGAGCGGATCCGGAGCACGTGGAAACGCTGGCGGAGGCGCAGACTCCGCTGCCGCCCATCACCGTGCACCGCTCGACGATGCGGGTCATCGACGGGCTGCACCGGGTACGGGCCGCCGAACTCCGGGGGCAACGCAAGATCGCGGCAAGGTTCTTCGAGGGGGACGAGGCCGATGCCTTCGTCCTCGCCGTGGAATCCAACGTCACCCACGGTCTGCCGCTGACGACGGCGGACCGGAAGCGCGCGGCCGGGCGGATCATCGCCTCGCATCCGCAATGGTCGGACCGGATGATCGCCTCGGTCACCGGCATCGCCCCCGGAACGGTCGCGGAGATCCGCAGGCGCCGGCCCGAGGCCCAGGTCGGCGAGGTGAGCCGTATCGGGCAGGACGGCCGGGTCCGCCCGGTCAACGGGACCGAGGGCCGCAAGTTGGCCAGCCGGCTCATCGCCCAGGACCCGAGTCTGTCGCTCCGTCAGGTGGCGCGGGCCGCGTCCATCTCCCCGGAGACGGTCCGCGACGTACGGAACCGGATGATGCGCGGCGAGGATCCGCTGCCGGGGCCGCGGCGCGGGAACAAGAAGCGGGCGAACGGCACGGAGGTCCCCGCCGCGGGCCGCGGCGGACTCCGCCCCGCCGCGGTGGCCTGCCGGAACCCGGTCCAGGACCGTGCGGCGGTCCTGGACCGGCTCAGGGCGGATCCCGCGCTGCGGTTCAGCGAGACCGGGCGCACCTTGCTCCGGCTGCTCACCGTCCACATGATGAGTCCCGAGGAGTGGGACGCGATCATCGACAAGGTTCCGCCGCACTGCGGTGGCGTCGTGGCCCGGCTGGCGGGCGAATGCGCCGAGATGTGGGCGGAGTTCGCCCAGCGGGTCGAGCGCAAGGTCGCCGAGACGGCATAGCCGTCCGGGTGTCCGAGTGAGAAATCTGCCCGGTTGGACGCCGGGGGGCGGGCATCCGGCCGGGCGCTCCGGCCGGGCGCTCCGCCCGTCCAGCGGGCGGGACTTGCACCACTTCCCGGGCTCTGAGAAAGTCCCTCAGCGCGGCGCCGTCCACCGCACACCACCGTTTCCCGGTGGCCGCGCCCGCGGAATTCCCTCCGCCCCGTGAAAAGCGCGACGCGCGGCATCCGGTCCCCCGGACGGATAATCCGGGGGCATTGCACCGCCGTTCTCACACTTCGGTCCGGCCGGCCCTCCATTCCTCTTTCCGTCGGCGTGGTAACGGCCCTGTTCATCCCGTCCAATTCTCCGGACCGTCCGTCATACCTGGCCGGCGGTAAATGCAGAACGGTATGGTCCTCGTGCGGGGGGAATCGATGGAAGCAATTGATCGCTGTGCTGTAACCGGCCCCGTTCCGGCCCGCCGTCACCGCAAACCGCAGGCCCGCGGACCATTCCGTGACATGGCAGCCAAGAGAGCAGAAGAGCCGACGCAGGAGGTTTCCGTCGCGTGTTGATAACCCAAGGCCCTGGTGGCCCCACTGTCGGGGGGCCGCATTCCGCCGCGGCGCAGAAACAGACGCTCCGGCCGGGGACCGTGCGCAGGGCGCTGGCTTATATGAGGCCGCACCGGAGTGCGCTGTTCCTGCTGCTGGCCGTCACCGCGGTGGATTCCCTCATCATCGTGTCCACGCCACTGCTGCTGAAGAAGATCATCGATGACGGCATTCTCAGGGACGACATGGCCGTGGTCACGACCATCGCCCTCGTCGTCGCCGGACTCGCCGTCGTCGACGCCTTCGCGCAGCTCGGGCAGAGCTACATCTCCGGGCGCATCGGACAGGGCGTCAGCCACGATCTGCGGGTCGAGACGTTCAAGCACGTCCAGCGGCAGCCGATGGCGTTCTTCACCCGGACCCAGATCGGTCTGCTCGCCAGCCGGCTGAACGGTGACGTCATGCTGGCGCAGCAGGCTCTCAGCACCCTGCTGACGTCGGTGA
CCAGCGTGCTGACCGTGGTCCTCGTACTCGCCGAGATGTTCTATCTGTCCTGGCTCATCAGCCTGATCGCCGTGTGCATGCTGCCGCTGTTCGTCATCCCGGCCATCTACGTCGGACGGCGCCTCCAGCGCTACTCCCGCGAGCAGATGCAGGCCAACGGCGAGCTGGGCGGCATCATCAGCGAACGCTTCAACGCGGGCGGCGCCATGCTCGCCAAGCTCTACGGCCGCCCGCGCGAGGAGGCGGACGCCTTCGAGACCCGGGCGCGCACCGTGCGCGAGGTCGGCGTGGTCTGGACGGTCTACAGCCGGCTGTCGTTCATCTTCATGGCGCTGCTCGCCTCGCTCACCACCGCGCTCGTCTACGGCGTCGGCGGCGGCCTGGTCCTGAACGACGTCTTCCAGATCGGCACGCTGGTGGCCCTGGCCGCTCTTCTCGGCCGGCTCTACGGGCCGATCACCCAGCTCTCCGCCATGCAGTCCAACGCGCTGACCGCGATGGTCAGTTTCGACCGGCTCTTCGAGATCCTCGATCTCAAGCCGCTCATCGAGGAGAGCCCGGACGCCGTCGCCCTGCCCGCGCCGGGCGGCGGGCCCGCGCCGGAGATCGAGTTCGAGGGGGTGTCGTTCCGGTATCCGCGCGCCGGCGACGTCTCCCTGGTCTCCCTGGAGTCCAGCGCGCTCTCCCCGGCCGAGCGGGAGAAGGAGACGGCGGAAGTCCTCCACGACCTCAGCTTCCGGGCGCCGGCCGGCAAACTCACGGCCCTCGTCGGCCCGTCGGGCGGGGGCAAGACGACCACCACCCACCTGGTGTCCCGGCTCTACGACCCGACCTCGGGCACCGTCCGCATCGACGGCCACGACCTGCGCGACGTCACCCTCGACTCCCTGCGCGACGTTGTGGGCGTGGTCAGCCAGGACGCCCACTTCTACCACGACACGATCCGGGCAAACCTCCTCTACGCCCGTCCCGAGGCCACCGAGCGGGAACTGCTCGACGCCTGCCGGTCCGCGCGGATCGGCGATCTCGTCGCCTCGCTCCCGAGCGGCCTCGACACCGTGGTCGGCGACCGCGGGTACCGGCTCTCGGGCGGGGAGAAGCAGCGGCTCGCCCTGGCCAGACTGCTGCTGAAGGCCCCCTCCGTCGTGGTGCTCGACGAGGCCACCGCGCACCTGGACTCCGAGTCCGAGGCGGCCATCCAGCGGGCGCTGACCACCGCGCTGCGCGACCGGACCTCGCTGGTCATCGCCCACCGGCTGTCCACCATCCGCGAGGCCGACCAGATCCTCGTCATCGACGGGGGCCGGGTGCGCGAGAGCGGGACGCACGAGGAACTGCTGGCGGGGGGCGGGCTCTACGCCGAGCTCTACCACACCCAGTTCAACCGGCCGGGCGCCAACGGCACCGGCTCCGACGGCGGGGGACACGCCCACGAGGCGCTGGTTCCCGCACCGGTTGCCGGGGACGGCGACCTCGCGGCCCGTCAGCCGGTCGCCGGAGGCTCCCTCGCCCTGGGCGGAACCGTGCCGCCGGGGGCCGGCGGCGACCTCGTCGTCCGCCGCCATTCCGAAGAGCACTAGCGCCTTCCGGCCGGCCGCGGGGCCGGCCGGGAGGGGCCGACGACCGGCGAAGACCGACGAAAAGCAGCGGGCACCCGCTGAAGGGGGAATGTGATGAACGGCCAGGAGGGTTCCACCCTGCCCCTGTCCGAATGCCAGGAGGGAATCTGGCTGGCACAGCGGATAGAGAGCTCTCGAGGGCTCTACCACATCGGCCAGTACATCGAAATCCTCGGCCCCGTGGACACCCGGGTCTTCGAACTCGCGTTGCGCCGGGCCGTCGAGGAAACCGACATCCTGCGGGTGCGTTTCGTCGAGGATTCCGCGGGCCGGGTCTCCCAGTGGATCGGCCCCCCGCCGGAATGGAGCATGCCGGTCGTCGACCTCAGCGGGGAAAAGGATCCGTGGCAGGCCGCCGGGACGTGGATGCGCGGCGAGCTG
GGCCGGGTGAACGATCCCACCGAGGGCGGACTTTTCGCTCATGCGCTGTTCACGCTCGCGGCCGACCGCCATATCTGGTACCAGCGCTACAACCATCTGCTGATGGACGGCTTCGCCTGTTCCCTGATGGGGCGGCGCGTCGCCGACCTCTACACGGCGATGCTGCGCGGAGAGCCGTTACGGGCACCCGGATACGCCCCGCTCGGCGAACTGCTGGCCCAGGAATCCGCCTACCGTGCGTCCGAGCAGTGCGCACGCGACCGGCAGTACTGGCACGACCGCTTCGCCGACCGCCCCGATCCGGCGGCCGTTCCCGGCCACCGTCCGTGCCACGGGCCATCTGCCGCCCCCCGAGGTGGCGGCCCTGCACGCGGCCGCGGCCGGCGCGAAGGTGAGCTGGCCCCGGTTCCTCGTCGCCGCGGTGGCGGCCTTCACCAGCCGGATGACCGGTGCGGACGAGGCCGTGCTCAGCATCCCCGTCGCCGGGCGCACGAGCGCCCAGGCGCGGCGGACCCCCTGCACGATGGCGAACATGCTGCCGTTCCGGCTGCCCGCCGGCCCCTCGGCGAACCCGGCCGAACTGGCGCGGGAGGCGGAGCGGGAGGCCACCGGGCTCCTCGGGCACCAGCGCTTCCGGGGCGAGCGGCTGCGCCAGGAGCTCGACTGGCCGAGTGGGGGCCGGTGGCATTTCGGCCCCTCCGTGAACATCCTGCCCCTCGGTGCCAACCTGCGTTTCGGAGAGTGCCGGGGGATCGTCCGCGACCTGTCGAGCCGGCGCGTCGAGGAAGTGGGCGTGGTGGTCAGCGGCTGGTCGGACGACCGGGGCATGGCGGTGGCCCTGGAGGCCAACGCGGCGCACTACGACGAGGACTGGGCGCGTGCGGGCCACCGGTCCCTTCTCTCGTTCATGGAGCGGGTGGTCGCTGACCCGTCGGTGCCGGTGGGCCGGATCGGTGTGCTGGACGCGGCCGGACACGGGCGGATCGTCGGGGGCTGGAGTGCGACGGCTGGTGAGCCGCCGGGGCTGTCCGTGCCGGAGCTGGTGGCGGGGCGGGTGCGGGAGGCGCCGGGCGCGGTGGCGGTGGTGGAGGGTGAACGGTCGCTGTCGTACGGGGAGTTGGACGGGGCAGCGGGTCGTCTGGCGGGGTTTCTGTCCTCGCTGGGTGTGGGGCGTGGTGAGCGGGTCGCGGTGGTGATGGAGCGGTCGGCGGATCTGCTGGTGGCGCTGCTCGGGGTGTGGAGGGCCGGGGCAGCGTATGTGCCGGTGGACGCCGGTTCTCCGGTGGAGCGGGTGGCGTTCGTGCTGGCCGATGCCGCTCCGGTGGTGGTGCTGTGTACGGAGGCGACGCGGGGTGTGATCCCGGAGGACAGCGCCGTGCGGGTGCTGGTATGGGACGACCCGGCCCTGGCGGTCGAACTGGCCGCCGTGGAGGTGCCGTTGTCCGTTCCGGTGGGTCCGCGGGACGTGGCGTATGTGATGTACACGTCCGGTTCGACGGGGGTGCCGAAGGGTGTGGCGGTGCCGCACGGCGGTGTGGCGGCGCTGGTCGGGGAGCGGGAGTGGTCGGTCGGGCGGGATGACGCGGTGCTGATGCACGCCCCGCACGCGTTCGACGTATCGCTGTTCGAGGTGTGGGTTCCGCTCGCCGCCGGCGCCCGGGTCGTCGTCGCCGAGCCGGGAGCGGTCGAGGCCGCACGGCTGCGGGAGGGCGTCGCCGGTGACGGGCTGACAGCGGTGCACCTGACCGCCGGTTCGTTCCGCGTCCTCGCGGCCGAGGCGCCGGAGTGTTTCCGAGGCCTGCGCCAGGTGCTGACCGGCGGTGACGTCGTGCCGCCGGAGGCGGTGGCCCGGGTGCGGGAGGCGTGCCCGGAGGTGTCCGTCCGTCATCTGTACGGGCCCACGGAGACCACGCTCTGCGCGACCTGGCACGAGCTGCGTCCCGGTGGCGTGCTGGGGGAGGTGCTGCCGATCGGCCGCCCGCTGCCCGGCCGGCGCACCT
TCGTCCTCGACGCCTTCCTCCAGCCGGTACCGCCCGGTGCGGTGGGGGAGCTGTACGTCGCAGGGCCCGGCCTCGCCCGGGGCTACTGGGACCGGCCGGGCCCGACGGCCGAGCGGTTCCTCGCCTGCCCGTTCCTCCCCGGCGAGCGGATGTACCGCACCGGCGACCTGGTCCGCTGGACCGGCACCGGCGAACTCCTCTTCGTGGGCCGCGCGGACGGCCAGGTCAAGCTGCGCGGGTTCCGGGTGGAGCCGGGCGAGGTGGAGGCGGCCCTGGCCACCCACCCGGCCGTCGCGCAGGCGGTGGTGGTGGCCCGTGAGGACCGTCCGGGCGAACGCCGCCTGGTCGGCTACGTCGTCCCGGACGGAACCGGCACCCCGGACGGAACCGGAGCCCCCGACCCGCGGGCCGTGCGCGAGCATGCCGCCGGCATCCTGCCGGAGTACATGGTCCCGGCCGCCGTCCTCGTCCTGGACGCGCTCCCGGTCACCGCCAACGGCAAGGTGGACCGCAGGGCGCTGCCCGCCCCGGACTTCGCCGAGCGGGTCTCCGGCCGTGCGCCCCGGACCGCCGTCGAGGAGACGCTGTGCCGGCTCTCCGCCGAGGTGCTGGGCCTGGAGCGGGTCGGCGCCGAGGACAGCTTCTTCTCCCTGGGCGGGGACTCGATCATGGCGATGCAGCTCGCCGCCCGCGCCCGCCGCGCCGGCCTGCTCTTCAAGCCGCAGGACGTGTTCGAGCACGAGACCCCCGCCGGCCTGGCGGCGGTGGCCGCCGCCGGGCCGCTCCCCGCGGAGTCCGGGCCCGCCGGCGGCCACCGTGCGGGCGACGGCTCCCTCCTGGCGTCCCTGCGCCCCGGCGAACTCGACGACTTGCGGGCCAGGGTGCCCGGTCTGGTGGACGTCTGGCCGCTGTCGCCACTGCAGGAGGGCATGCTCTTCCACGCCACCTCCCACGACCGGGGCCCGGACGTGTACACGAGCCGGCGCATGCTGGCCCTGGACGGGCCGCTGGACACGGACCGGCTCCGGGCGTCCTGGCAGACGCTGCTGGACCGGCACGAGGTGCTGCGCGCCGGTTTCCACCGGCGCGAGTCCGGAGAGACCGTGCAGGTCATCGCCCGGGACGTGGCACTGCCCTGGCGGGAGGCCGACCTGTCGCACCTCCCGGAGGACACCGCCCGGGAGGAACTCGCGCGGCTCGCCCGCGCCGAGCGGGCGGAACGTTTCGACCTCGCGGCCGCCCCGCTGCTGCGGCTCCTGCTGGTCCGTCTCGGCGCGGACCGGTACCGCCAGATCATCACCGCCCACCACACGCTCATCGACGGCTGGTCCATGTCCGTCCTCTTCGCCGAGCTGGCCGAGGTGTACGCGGCCGACGGCGACGGGCGGGCGCTGCCGGCGCCGGCCTCGTACCGGGAGTACCTGGCCTGGCTGGAACGGCAGGACCGGGACGCGGCACGGGAGGCCTGGCGGGCGGAGCTCGCGGACACGGCGGAGCCGACCCTGGTGGCCCCCGAGGACCGGGTGAGCACACCGGTGCTGCCGGAGCCGGTGTCCTTCGAGTTCACCGAGGAGCTCACCCGCGGCGTGACGGAACTGGCCCGCACCCACGGGGTGACCGTGAACACCGTCATACAGGCCGCCTGGGCCCTGGTGCTGGCACGTCTGGTGGGGCGCACCGACGTGGTGTTCGGCACCACGGTGGCCGGCCGTCCCGCGGACCTGCCCGGGGCCGAGTCGGCGGTCGGCCCTTTCATCAACACCCTGCCGGTGAGGGTCGGACTGGTGCCGGAACAGCCCGTCGCGGAGCTGCTGGCCGGTCTCCGGGACCGGCAGGTGGCGCAGATGGGCCGCCAGTTCACCGGGCTCCAGGAGATCCGCCGGCTCGCCGGCCCCGGCGCCGTCTTCGACACGCTCGTCGTGTACGAGAACCTGCCCCGGACGGCGCGGGACACCGCCTCCCCGGCCATCCGTCCCGTCGGGGAGCCCACGGACATGGGC
CACTTCCCGCTGGCGCTGATCGTGGTGCCCGACGAGCGGCTGCGCGGCCACCTCGTCCACCGCCCGGACGCGGTCGGGCGGAGCCGTGCCCGGGAACTGGTCTCCTGGCTGACGCGGGTGCTGGAACGGATGACGGCGGACCCGGCGTCCCCGGTGGGCCGGGTCGGCGTCCTGGACACGGCGGAGCGCGCCCTCGTCCTGGACACGTGGAACCCCCCGGCCACGGCGGAACCGGACGTGCCGGCACCGGAGCTGTTCGCCCGTGCGGCCGCGGCCGTGCCCGGCGCCGTGGCGGTCGAGGACGGCCGCCGCTCCCTGACCTACCGCGAACTCTCCGCGGAGACCCGGCGCCTGGCGCACCACCTGGCCGGCGCGGGCGTGGGCCCGGAGACCCGAGTCGGCGTCATCGCGGACCGTTCGGCGGAACTCGTCACGGCCCTGCTGGCGATCTCCCTGGCGGGCGGCGTCTACGTGCCCATGGACCCGGCCCACCCCCCGGCCCGGCTGCGCCTGATGCTGGACGACGTCGCCCCGCCGGTACTGCTGTGCACCCGGGACACCCGCGCCGTGGTGCCGGCGGCGTTCCCCGGCCGCATCGTGGTCCTCGGCGAGGCCGGCGCCGGCAGCGCGGAGGCCGGGCGTACCGGCGGTGACGGCCCGGACGCCTGGCGCCCGCCACGGCTGAGCCCGGCGAACGCGGCCTACGTGATCCACACCTCCGGGTCGACCGGAACCCCCAAGGGCGTGGTCACCTCCCACCGCGGACTGGCCAACCTCGTGGCCGTCCACATCGATCGGTACGCCCTCGGCACCGGCAGCCGGGTGCTGCAACTGCTCTCGCCCGGCTTCGACGTCTCGATGGCCGACATCTGGCCGGTGCTGTGCGCGGGCGGGCGGCTGGTGCTGGCCCCGCCGGGACGGCTCCACGCGACCGGTGAGGAACTGGTCGGGCTGATGCGCGACCGGCGGATCACCCGCGTCGCGATGACCCCGACCCTGCTGGCGCAGCTCCCCCCGGAGGACCTGCCCGACCTGCGCACGCTGGTCCTCGGCGGCGAGCCCGCGCCCGAGGACGTGCGCCGGCGCTGGTCGGCCGGCCGGGAGATGTACAACGAGTACGGGGTCACCGAGGCGACCGTCACCTCCACCCTGTACCGGACCCCGGACGGCCCCGGCACGCCGCCGATCGGCCGCCCCGTCGGCAACACCCGGGCCTATGTGCTCGACGGCTTCCTGCAGCCCGTACCGCCGGGCACGGCGGGCGAGCTGTACCTCGCGGGCGCGGGCCTGGCCCGCGGCTACCTGGGGCGGGCGCGGCTGACGGCCGAGCGGTTCGTCGCCTGCCCGTTCGCCCCCGGGGAGCGGATGTACCGCACCGGCGACCTCGCGCACTGGACCGCCGACGGCCGGCTCGTCTACGCGGGCCGCGCCGACGCCCAGGTGAAGGTCCGCGGCTTCCGCGTGGAACTCGGCGAGATCGAGGCGGCCCTCTCCGCTCACCCGGCCGTCGAACGGGCCGTGGTCGTCGCCCGCGAGGACCGCCCGGGCGAGCGCCGCCTGGTCGGCTACGCGGTCCCGTACGGAGGTGCGGTGGACGGGCGGTCCCTGCGCGAGCACCTCGCCGGGACGCTTCCGGAGTACATGGTGCCCGCGGCCGTGGTGACCCTGGACGCGCTGCCGGTCACCGGCCACGGAAAGATCGACACGAAGGCCCTGCCCGCCCCGGACCTCACCGGCAACGCCTCCGGACGGGCACCGGAAAGCCGGGCCGAGACGATCCTGTGCGCCCTGTTCGCCGAGGTGCTGGGAGCCGGGCGGGTCGGACCCGACGACAATTTCTTCGGGCTCGGCGGGGACTCGATCACCTCGATGCAGCTGGTGAGCCGCGCCCGGAGCGAGGACGTGGTCTTCACCTCCCAGGACGTGTTCGAGCACGAGACCCCCGCGGGACTCGCGGCGATCGCCCGGTTCGGCGACCGCGCCGGGGCCGGCCCGGACCA
CGGCGTCGGCGAGGTGGAGTGGACCCCCGTCATGCGGCAGCTCGGTGAGCGGGTGACCGGCGGCGCGTTCGCCCAGTGGGTGGTGCTCGGGTCCCCGGCCGGGCTGCGGCGGGATGCCCTGGTGGCCGCCGTGGCCGCGGTCCTCGACACCCACGCCATGCTGCGCCTCCGCGTGCTCCCGGGCGAGAACGGGCCGCGTCTGCTGACCGGCGAACCCGGATCGGCCGACGCCGCGGGCCTGGTCACCCGCGTGGACGCCGCTGGGATCGCGGCCTCCGGCCTGGACGCGCTCGCCGGACGGGTGGCGCGCGACGCGGTGGCGCGCCTGGACCCGGGCGCCGGCGCGGTGTTCCGGGTGGTGTGGGTGGACGCCGGGCCGGAGCGGACCGGCCGGCTCGTGCTGGTGGCGCACCACCTCTCCGTGGACGGCGTGTCCTGGCGCATCCTCGCCCCCGACCTGCGGGCCGCGTACGAGGCGGCCGAGGCCGGCCGGAAACCGGGACTCGAACCCGTCGCCACCTCCTTCCGGCAGTGGGCCGGCCTGCTGGCCGCCCAGGCCGCCCAACCGGCCCGGACCGCCGAACTGGCGTCCTGGACCGCTCTCCTCGACGGCGTCCGGCCGCCCCGCGGCATCGGTGCGCCGGACCCCGTGCGGGACACCGCCGCGACGGTGCGCCGCCGGACCCTGGTGGTGCCCGCGCGGCAGGCGCGGACACTGGTGAGCCGCGCCCCGGCGGTCTTCCACTGCGGTGTGCACGACATCCTGCTCGCCGCCCTCGCCGCCGCCGTCGCCCACTGGTGGCAGGACAGCGGCACCGCGCTGCTCGTCGACGTCGAGGGCCACGGGCGCGAGCCGCTCGACGGCACCGACGTGCTGCGCACGGTGGGCTGGTTCACCGGCGTCCACCCCGTGCGGCTGGACACCTCCGGCACCGACCCGTCCGAAGTGGCGGCCGGCGGCCCGGCGGCCGGCGCGCTGCTGAAGGCGGTCAAGGAGCAGGCCCGCGCGGTGCCCGGGGACGGACTCGGCTACGGGCTGCTGCGCCACCTCAACCCCGCCACCGGACCGGTGCTGGCCGGACTGCCGAGCCCCCGGATCGGCTTCAACTACCTGGGCCGCTTCCCGGCCGGCGCACGGTCCGACGCGGTGAAACCGTGGCAGATGGCCGGCGAGACGGCGATCGGCGGCTCCGCGGACCCCGGCATGCCGGCGGTGCACGCCCTCGAAGCCGGAGCGGCCGTCCGGGACACCGCCGACGGCCCCGAACTCGTCATCACGCTGAGCCGGCCCGCGGCCCTGCTGGACGACGCGTCGGCGGACCGGCTGGGCCGGCTCTGGCTGGACATGCTCGCCGGGCTGGCCGCTCACGCCGTCGACCCCGGAGCGGGCGGGCACACCCCCTCCGACTTCCCGCTCCTCGACCTCGCGCAGGACGAGGTCGAGCAATTCGAAGCGATAGCAGCCCAGCTCGAAGGAGGTCTGTCGCTGTGAACTCACCGGCCGCGGCCAGAGGGTCCGCACTCGCGGAGGTCTGGCCCCTCTCGCCCCTGCAGGAAGGGCTGCTGTTCCACGCGGACTTCGACGCCCAGGGGCCCGACGTCTACACCGTGCAGACGGTCCTGGAGATCGGCGGAACCCTGGACGCCCGCCGCTTCCGCGCCGCCTGGGAGACGGTGGTGGCCCGGCACGCCGCGCTCCGCGCGAGCTTCCACCGGCGCAGGACCGGCGAAGCGGTGCAGATCATCCCCCGGGAGGTGACCCTGCCCTGGCAGGAGGCCGACCTGTCGGACCTGACCGCCGCCGACGCGGAGGCCCGCGTCCGGCGGCTGGCCGGGAGCGAGCGTGACCGCCGGATCGACCCGGCGGTGGCACCGCTGCTGCGGCTGCTCCTCGTCCGGCTCGGCGAGGACCGGCACAGCCTGGTGATGACCAGCCACCACCTGCTGATGGACGGCTGGTCCATGCCTCTGCTGCTCAATGAGCTCACCGCCGCCTACGC
GGCCGGTGACCAGGCAGCCCCGCCGGCGCACCCGGCCTCGTACCGCGAATACCTGGCGTGGCTCGGCCGGCAGGACAAGGAGACGGCCCGGGACGCCTGGAAGGCGGAACTGGCCGGGGCCGACGAGCCGACGCTGGTCGCCGGGCCGGGGAACACGGCCCGGGCCGGGGCGCCCCCCAGAAGAAACGCGTCCTGGATCCCGGAGAAGACGGCCGGTGCCATCGGGGACCTGGCCCGCCGGCACGGGCTGACGGTGAACACCGTGCTCCAGGGCGCCTGGGCGCTGGTACTCGCCCGGCTCACCGGCCGGACCGACGTGGTGTTCGGCGCCACCGTCGCCGGGCGGCCCCCGGAACTGCCGCGCGTCGAATCGATGATCGGGCTGTTCATCAACACCCTTCCCGTCCGGGTCCGGCTCGACGGCTCCCGGTCCCTGCTGGAACTCCTCACCCGGGTGCAGGAGCACCAGTCCGCGCTCATGCCCCACCAGCACCTCGGGCTCGCGGAGATCCAGGGCCTCGCCGGGCCCGGCGCCGTCTTCGACACGCTCATGGTCTACGAGAACTACCCGCGCCCGCCCGCCGCGGAGTCCGCCACGGCCGAGACCCTCACGCTCACCGTGGCCGAAGCCCGCCAGGCCACCCACTACCCGCTGACGGTCGGCGTCCTGCCCGGTGAGCGCTTCCGCGTGGACGTGACCTACCGGCCGGACCTCGTCGGTGAGGAGATCGGCGAGGCGGTGGGCGGATGGCTCGTGCGGATCCTGGAGCAGATGGCCGCGGACCCGTCGGTACCGGTGGCACGGCTGGACCTGCTGCCCGCGGACGCACGCGGGCTGGTGCTGGAGGGCTGGAGTGCGACGGCGGGCGAGCCGCCGGGGCTGTCCGTGCCGGAGCTGGTGGCGGGGCGGGTGCGGGAGGCGCCGGGCGCGGTGGCGGTGGTGGAGGGTGAACGGTCGCTGTCGTACGGGGAGTTGGACGACGGGGCGGGGCGTTTGGCGGGGTTTTTGTCCTCGCTGGGTGTGGGGCGTGGTGAGCGGGTCGCGGTGGTGATGGAGCGGTCGGCGGATCTGCTGGTGGCGCTGCTCGGGGTGTGGAGGGCCGGGGCGGCGTATGTGCCGGTGGATGCGGGTTCTCCGGTGGAGCGGGTGGCGCTGGTGCTGGAGGACGCGGCTCCGGTGGTGGTGTTGTGTACGGAGGCGACGCGGGGTGCGGTGCCGGAGGATGCGGCCGTGCGGGTGCTGGTGCTGGACGACCCGGCCGTGGCGGTCGAACTGGCCACCGTGGAGGTGCCGTTGTCCGTCCCGGTGGGTCCGCGGGACGTGGCGTATGTGATGTACACGTCCGGTTCGACGGGGGTGCCGAAGGGTGTGGCGGTGCCGCACGGCGGTGTGGCGGCGCTGGTCGGGGAGCGGGAGTGGTCGGTCGGGTCGGGTGACGCGGTGCTGATGCACGCCCCGCACGCGTTCGACGCCTCGCTGTTCGAGGTATGGGTGCCCCTCGTCGCCGGTGCGCGGGTGGTGGTCGCCGAACCGGGCGCGGTCGAGGCCCAGCAGGTGCGGCAGCACATAGCCGGCGGTGTGACCGCGTTGCATGTGACGGCTGGTTCGTTCCGGGTGCTGGCGGAGGAGTCCCCGGAGTGTTTCCGGGGTCTGCGTCAGGTGCTGACGGGCGGTGACGTGGTGCCGGTCGCCTCGGTGGCGCGGGTGCGGGAGGCGTGTCCGGACGTCTTGGTGCGGCATCTGTACGGGCCGACGGAGACGACGCTGTGCGCGACCTGGCACGAGTTGCGTCCCGGTGACGTGCTGGGGGAGGTGCTGCCGATCGGCCGCCCGCTGCCCGGCCGGCGCACCTTCGTCCTCGACGCCTTCCTGCAGCCGGTGCCTCCGGGTGTGACCGGTGAACTGCACGTGGCCGGTGCGGGGTTGGCGCGCGGTTACTGGGGCGGGCCGGGCCCGACGGGCGAGCGGTTCGTGGCCTGCCCGTTCCTCCCGGGCGAGC
GGATGTACCGCACCGGCGACCTGGTGCGCTGGACCCGGGACGGTGAGCTTCTCTTCGCCGGCCGCGCCGATACACAGGTCAAGATCCGGGGCTACCGTGTCGAACTCGGCGAGGTGGAGGCCGCCCTGGCCGCGTCCCCGGGCGTCGCCCAAGCGGTGGTGGTGGCACGGGAGGACCGGCCGGGTGAACGCCGGCTGGTCGGCTACGTCGTCCCGGACGGAACCGGCGGCCCCGACCCGCAGGCCGTGCGCGAGCGGGCCGCCGCGGTGCTGCCGGAGTACATGGTCCCCGCCGCCGTCCTCGTCCTGGACACCCTGCCCGTGACCCGTAACGGAAAAGTCGACCGAGCGGCCCTGCCCGCCCCGGACTTCACCGAACGGGTCGCCGGCCGTGAGCCGCGGACGGCGGCGGAGGAGACGCTGTGCCGGCTCTTCGCCGAGGTGCTGGGTGTGGAGCGGGTCGGCGTCGAGGACAGTTTCTTCTCGCTCGGCGGGGACTCGATCATGTCGATGCAGCTCGCCGCCCGTGCCCGCCGCGCGGACCTGCTCTTCAAGGCCCAGGACGTCTTCGAGCGCGAGACCCCCGCCGGACTGGCCGCCGTCGCCCACAGCGCGGCCCGGGAAACCTCCGGGCCGGACACGGGCGCGGGCGAGGTGCCCTGGACACCGGTGATGCGTGAGCTGGGCGAGCACGCCGTCCGGCCGAAGCTGGCGCAGTGGATGACCGTCGGAGCACCGGCGGACCTCGAACAGGACGTTCTGGTGAGCGCCCTGAACGCGGTCGCCGACACCCACGCCATGCTCCGCGCCGTGGTCCTGCCGGGCGAGACGGGGCCACGCCTGGTCGTCGGGGAACGCGGTTCGGTGGACGCGGCCGAGCGGATCGGCCGACTGGACGCCACCGGGGCGGCGGACGGCGATCTGGACGGCATCGCCGGCCGCGCGGCGCGGGAGGCCGCGGAGGGGCTGGACCCCTCCGCGGGCGTGTTGTTCCGGGTGGTGTGGGTGGACGCCGGACCGGACCGCCCGGGCCGGCTCGCGCTGGCGGCGCACCACCTCGCGGTCGACGGCGTCTCCTGGCGCATCCTGCTGCCCGACCTGGAAGCCGCGTACGAGGCGGTCGCGGCCGGGCGGAAGCCGGAACTCGATCCGGTGCCGACCTCGTTCCGGCGGTGGGCCCGCCTGCTGGCCGAGCAGGCGGTACGCGAGGAGCGGACCGCCGAACTGGAGGAGTGGGCCGCACTTCTCGGTGAGCCCGAGGCCCCGCTGGGAGACTCCCCGCTGGACCCGGCGCGGGACACCGCCGCGACCGTGCGCCGCCGTTCCTGGACGGTGCCCGCGGGACCGGCGCGGACGCTGGTGAACAGGACACCGGCGGTGTTCCACTGCGGAGTGCGCGAGGTGCTGCTGGCCGCCCTGGCCGGAGCGGTCGGGCACTGGCGGGGCGGCAACGCCCCCGGGCTGCTCATCGACATCGAGGGGCACGGCCGCGAACCGGTCGGAGACGCCGATCCCGCCCGCACGGTGGGCTGGTTCACCGGCGTGCACCCCGTCCGGCTGGAGCTGTCCGGGGTCCGGACGGCCGAGGTGCCGGCGGGCGGCCCGGCGGCCGGTGAGCTGCTGAAGGCGGTCAAGGAGCAGGCGCGGGCCGTACCGGGCGACGGGCTCGGGCACGGTCTGCTGCGCCACCTCAACCCCGGCACCGGTCCCGTCCTGGCCGCCCTGCCCCGCCCGCAGGTCGGCTTCAACTACCTCGGCCGGTTCGCCGGCGCCGGGACCGGTGGGACGGCCGCCTGGCAGCCGGCCGGTGACGTGGCCCTCGGCGGTTCCGTGGACCCGGACATGCCCGTCCTGCACGCTGTCGAGGCGGGCGCCGTCGTCCGGGACACCCCCGACGGGCCGGAGCTGACCCTCACCCTGAGCCGGCCCGCCGCGCTGCTGGACGACGCCTCGGCCGACCGGCTGGGCCGGCTCTGGCTGGACATGCTCGCCGGTCTGGCC
GCGCACACCGCCGACCCGGCCGCGGGCGGGCACACCGCCTCCGATTTCCCCCTTCTCGACCTGGCCCAGGACGAGGTGGACGAACTCGAGGCCGGGTTCGCCGACGACCTTTCGTAACCCGCTCGCAGCAAGCCGCTCTGAGGAGAGAAGCGATGACCCGATCCCTTGTCGAGGACGTGTGGCCGCTGTCGCCGCTGCAGGAGGGGCTGCTGTTCCACGCCGCCTTCGACGACCGGGGGCCCGACGTCTACCAGGGGCAGCGGATGCTCGATCTGACCGGCCCGGTGGAGGCGGGCCGGCTGCGGGTGTCGTGGGAGGCGCTGCTGGCCCGGCACGCGGCGCTGCGGGCGGGCTTCCGCCGGCGCAGGTCCGGGGAGGCCGTGCAGGTCATCGCGCGGGAGGTGGAACTGCCCTGGCGCGAGGCCGATGTCTCCGGACTGGCCGGGGACGAGGCGCGGGCCGAGATGGAGCGGCTGGCCGCGGAGGAGCGGGCCGAGCGGTTCGACCCCGCGGTGGCGCCCCTGCTGCGGCTGCTGCTGATCCGGACCGGCGAGGAGCGGCACCGGCTGGTCGTCACCAGCCACCACGTCCTCATGGACGGCTGGTCCATGCCGGTGCTGCTCGGGGAACTGTCCACGGTGTACGCGGCGGGCGGGACCGCGGACGGCCTCCCTCCGGTCGCGTCCTACCGGGACTACCTGGCGTGGCTGGGACGCCAGGACAAGCAGGCGGCGCGGACGGCGTGGCGGGCCGCGCTGGCCGGTGCCGACGAACCGACGCTCGTGGCACCGGCCGATCCCGGCAGGATGCCCGTGATCCCGGAGAGTCTGATCACCGACTTCCCCGAGGACCTGAGCCGGCGGCTGGTGGAGTTCGCCCGCGCCCGGGGACTCACGGTCAACACGGTGATGCAGGGCGCCTGGGCGCTGGTGCTGGCGCGGCTGGCGGGCCGCACGGACGTGGTCTTCGGCGGCACCGTGGCCGGGCGGCCCGCGGAGTTACCGGGTGTCGAGTCGATGGTCGGCCTGTTCATCAACACCCTTCCCGTGCGGGTGCCGCTCGACGCGGAGCAGCCGGTGGCGGAGATGCTGGCCGCACTGCAGGAGCGCCAGTCCGCGCTGATGGCCCACCAGCACCTCGGGTTGCCGGAGGTCCAGCAGCTCGCGGGTGCCGGAGCGGTGTTCGACACGCTCGTCGTGTACGAGAACTATCCGCGTCCCCCCGCCGGCCCGCCCGCCCCGGACACCTTCACCCTGGGCTTCGCCGAGGGGCGGGAGACCGCGCACTACCCGTTCACGCTGGTCGTCGCGCCCGGCGACCGCATGCGCTGCAAGCTCGACTACCGGCCCGACCTCTTCGACCGGGACACCGCCGGGTCGGTCTTCCGGCGGCTGGAGCTGGTGCTGGAGCAGATGGCCGCGGACCCGTCGGTCCCGGTGGGCCGGATCGGCGTGGTGGACGGCCTGGAGCGCGGGCTGGTGCTGGAGGGCTGGAGTGCGACGGCGGGTGAGCCGCCGGAGGTGCCGGTGCCGGAGCTGGTGGCGGTGCGGGTGCGGGAGGCGCCGGGCGCGGTGGCGGTGGTGGACGGTGAACGGTCGCTGTCGTACGGGGAGTTGGGTGAGGCGGCGGGTCGTCTGGCGGGGTGTCTCCACGGGCTGGGTGTGCGGCGTGGTGACCGGGTCGCGGTGGTGATGGAGCGGTCGGCGGATCTGCTGGTGGCGCTGCTGGCGGTGTGGAAGGCCGGGGCGGCCTATGTGCCGGTGGACGCCGGTTCTCCGGTGGAGCGGGTGGCGTTCGTGCTGGAGGACGCGGCTCCGGTGGTGGTGCTGTGCACGGAGGAGACGCGCGGTGTGATCCCGGAGGACAGCGCCGCGCGGGTGCTGGTATCGGACGACCCGGCCCTGGCGGTCGAACTGGCCGCCGTGGAGGTGCCGTTGTCCGTTCCGGTGGGTCCGCGGGACGTGGCGTACGTGATGTACACGTCCGGTTCGACGGGGGTG
CCGAAGGGTGTGGCGGTGCCGCACGGCAGTGTCGCGGCGCTGGTCGGCGAAGCAGGCTGGTCGGTCGGGCCGGACGACTCGGTGCTGATGCACGCCCCGCACGCGTTCGACGTATCGCTGTTCGAGGTGTGGGTTCCGCTCGCCGCCGGAGGCCGTGTGGTCGTGGCGGAGCCCGGCGTGGTGGACGCCCAGCGGGTGCGCGCGGAGATCACGGACAGGGGTGTCACGGCGGTGCACCTGACCGCCGGTTCGTTCCGCGTCCTCGCGGCCGAGACGCCGGACTGCTTCCGCGGCCTGCGGGAGGTGCTGACCGGCGGCGACGTGGTGCCGGTCGCCTCGGTGGCCCGGGTGCGGGAGGCGTGCCCGGAGGTCTCCGTGCGCCATCTGTACGGGCCCACGGAGACCACACTGTGCGCGACCTGGCGGGTGTGGAGGCCGGGCGCGGAGCGCGTCGGTCCGGTGCTGCCCATCGGGCGTCCGCTGCCCAGCCGCCAGGTGTTCGTGCTGGACGCCTTCCTCCAGCCGGTCCCGCCCGGTGTGACCGGCGAACTCTATGTCTCCGGCGCCGGGTTGGGGCAGGGCTACTGGGACCGGCCGGCTCCGAGCGCCGAGCGGTTCGTCGCCTGCCCGTTCGTCCCCGGGGAGCGGATGTACCGCACCGGCGACCTGGTGCGCTGGACCGACGACGGGGAACTCCTCTTCGCCGGACGGGTCGACGCGCAGGTCAAGATCCGCGGATTCCGGGTGGAGCCGGGCGAGGTGGAGGCGGCCCTGGCCACACACCCCGCCGTCGCCCAGGCCGTGGTGGTGGCACGGGAGGACCGGCCGGGCGAGCGCCGCCTGGTCGGCTACGTCGTCCCGGACGGAGACGCGGAGAAGCCGGCCGGGGAAGCCGTGCGCGAGCACGCCGTCGAACTGCTGCCCGAGTACATGGTGCCCGCGGTGGTGCTGGTGCTGGACGCGCTGCCCGTGACCCGTAACGGCAAGGTCGACCGCGCGGCCCTGCCCGCCCCCGACTTCGCCGGACGCGTGTCCGGACGGGAACCCCGGACGGAGAACGAGACCGTGCTGTGCGGGCTCTTCGCGGACGTCCTGGGTCTGGAGCGGGTCGGTGCCGAGGACGGCTTCTTCACGCTGGGCGGTGACTCGATCTCCTCCATGCAGCTCGTCGCCCGGGCCCGCCGCGCGGGACTGGTGCTCACACCGCGGCAGGTGTTCGACGAGAAGACCCCCGAGCGGCTGGCGCTCGTGGCGCGGAGGCCCGGGACCGCGGGCCGCGGAGCGGGGAACGCGCGGGACACGGGCGTAGGCGAGGCGCCCTGGACACCGGTGATGCGGGAACTGGGCGAACGGGCCGCCCGTCCGCGGTTCGCGCAGTGGGCGGTGGTCGGCGCCCCCGCCGGGCTCGACCCCGGGACGCTGGCGGCCGGGCTGGCCGCCCTCCTCGACCATCACGCCATGCTGCGGGCCCGTGCGGTACTCGACGGCGGGGAACCCCGCTTCACGGTCGGCCGGCCCGGATCCGCCGATGCCGCCGCCCTGGTCGGCCGGGTGGACGCCACCGGAGCGGAACCCGGCGCGCTGGACGGGATCGCCGCCCGGGCGGCGCGGGAGGCCGCCGAGCGGCTGGACCCCTCGGCGGGCGCGATGGTCCGGGCGGTGTGGGTGGACGCCGGGCCGGAGCGGACCGGCCGGCTCGCGCTGGTGATCCACCATCTGGTGGTCGACGGCGTCTCCTGGCGCATCCTGCTGCCGGATCTGCGGGCCGCGTACGAGGCCGCGGCGGCCGGCCGGAAGCCGGAACTGGACCCGGTGGGCACCTCGTTCCGGCGGTGGGCCGGCCTGCTGGCCGGACAGGCGGCCTCCGGAGACCGGCTCGCCGAACTGGGCGACTGGGCGGCCCTGCTCGGCGGCGAACGGCTCCCCGTCGGGCGGCGCGCGCCGGACCCGGCGCGGGACACCGCCGCGACCATGCGCCACCGCTCATGGGTGGTGCCGCA
GCACGAGGCGGCCGTCCTCACCGGCAGGACACCGGAGGCGTTCCACTGCGGGGTGCAGGACGTCCTGCTGGCGGGACTCGCGGGCGCGGTCGCGCACCGGTACGACGGCGGCACCGGCCACGGCGGTGACGGCAGCGGCACTGGCGCCGCGCTCGTGGTGGACGTGGAGGGCCACGGCCGCGAACCGCTGGAGGGAGCCGACCTGTCCCGCACGGTGGGCTGGTTCACCCGGTCCCACCCGGTCCGGCTGGACCTGTCCGGCATCGGAACCGGCGAGGTCCCGGCCGGGGGCGCCGCGGCCGGGGAGCTGCTGAAGACGGTCAAGGAGCAGGCACGCGCGGTGCCGGGCGACGGGCTCGGCCACGGGCTGCTGCGCCACCTCGACCCCACTGCCGGCCCCGCCCTCGCCGCCCTGCCGGGCGCGCAGATCGGCTTCAACTATCTCGGCCGGTTCGCCGCCGGACCGCGGGAGGAACCGGTCGCCGCCTGGCAGTTGGCCGGTGAGACGGCGATCGGCGGCTCGGCCGACCCGGGCCTGCCCGCCCCGCACGCCGTCGAGGCGAACGCGGCCGTCCGCGACACCCCCGGAGGTCCCGAGCTGACCCTCACCCTGAGCTGGCCCGGCGGGATTCTGGACGAGTCCGAGGCGGAGGAACTGGGGCGCTCCTGGCGGCGGATGCTGAGCGGCCTGGCCGCCCACTCCGCCCGCCCCGGGGCCGGCGGGCACACCCCGTCCGACTTCCCGCTGGCCGGCCTGACGCGGGACGGCCTGGCGGAACTGGAGGCGTGCGTCCCGGAACCGGCGGACGTGTGGCCGCTGTCGCCGCTCCAGGAGGGCATGCTCTTCCACGCCACCTTCAACCAGGAGGGGCCGGACGTCTACCAGAGCCAGCGCCTGCTGGGGCTCGACGGGCCGCTGGACACCGCCCGGCTCCGGGCGGCCTGGGAGGAGCTGCCGGCCCGGCACGCGGTCCTGCGGGCGGGCTTCCACCGCCTGACCTCCGGCGAGGCCGTGCAGGTCGTCGCCCGGCGGGTGGAACTGCCCTGGCGCGAGGCCGATCTGTCCGGCCTGCCGGAGACCGAGGCCCTGGCGGAGGCGGAGCGGCTGGCCGCGAACGAACTGGCGGAGCGCTTCGACCTGGCGAAGCCGCCGCTGCTGCGGCTGCTGCTCGTCCGGCTCGCCCGAAACCGGTACCGGCTGGCCATCACCAGCCACCACATCCTGATCGACGGCTGGTCCATGCCGGTCGTCCTCAACGAGGTGTCCGCGCTGTACGCGGCGGGCCGAGGCACGGACGCCGCCCTGCGGCCGGCCGCCTCCTACCGGAACCACCTGGCCTGGCTGGCGGGGCAGGACAAGGAGGCGGCCCGGGCCGCGTGGCGCGCGGAACTGTCCGGGGTCGCCGAACCGGTACTGGTGGCCCCGGCGGACCCGGGCCGGGCCCCCGTGACACCGGTCGTGAGCTCCGCGGAGCTGTCCGCGGAGAGCACCCGGGCACTGACCGGGCGGGCGCGCGCCCACGGACTGACGGTGAACACCCTGGTGCAGGGCGCCTGGGCGCTGGTGCTCGCGCGGCTGACGGGCCGTACGGACGTGGTGTTCGGCGGCACCGTCGCCGGGCGGCCGCCCGAACTGCCCGACGTCGAGTCGATGGTCGGCCTGTTCATCAACACGCTGCCCGTGCGGGTACGGCTGGACGGGGCCCAGCCGGTGCGGGACATGCTCCGGGAACTGCAGGAGCACCAGTCGGCGCTGATCGCGCACCAGCACCTCGGGCTGCCGGAGATCCAGCAACTCGCGGGGCCGGGCGCGGTCTTCGACACGATGCTGATGTTCGAGAACTACCCGCGGAACGCGCCCGAGCTCTCCGGCCCCGCGGGGACGGACGGCGGAGTGGCGATCAGGCAGCTGAAGACCCTGGCCGGCACCCACTACCCCCTGGCGGTGGGGGCCGTTCCCGGAGAGCACCTCCGGGTCCATGTCACCTACCGGCCGGATC
TGTTCGGCCACGAGAGCGCCGCCAGGATCGCGCGGGGCGTCGTGCGGGTGCTGGAGCAGATCGCGGCGGACCCGTCGGTGCCGCTGGGACGGCTGGACGTGGTGGACCCGGTCGAGCGCGGCCGGACCGTCGAGGGCTGGAGCGCGTCGGCCGACCGGCCGCCGGAGCCGGCGGTGCCGGAGCTGGTCGCGGCGCGGGCGCGGACGACGCCGGACGCGGTGGCCGTCGTCGATGGCGAACGGCCGCTGTCCTACGGGGAGTTGGACGACGGGGCGGGGCGTCTGGCGGCGTATCTCTCCTCGTTGGGTGTGGGGCGTGGTGAGCGGGTCGCGGTGGTGATGGAGCGGTCGGCGGATCTGCTGGTGGCGCTGCTGGGGGTGTGGAAGGCCGGTGCGGCGTATGTGCCGGTGGAGGCCAGTACTCCGGTGGAGCGGGTGGCGTTCGTGCTGGCCGATGCCGCTCCGGTGGTGGTGCTGTGCACGGAGACGACGCGGGGTGTGATCCCGGAGGACACCGCCGCCCCGGTGGTGGTGCTGGACAGTGTGTCGGTGGCGGCCGAGGTCACCGCCCGGGAGCCCTGGCCGGGCGCTCCCGTGAGTGCCGGGGACGTGGCGTATGTGATGTACACGTCCGGCTCGACGGGGGTGCCGAAGGGTGTGGCGGTGCCGCACGGCGGTGTGGCGGCGCTGGTCGGGGAGCGGGAGTGGTCGGTCGGGTCGGGTGACGCGGTGCTGATGCACGCCCCGCACGCGTTCGACGCCTCGCTGTTCGAGGTGTGGGTGCCCCTCGTCGCCGGTGCGCGGGTGGTGGTCGCCGAACCGGGCGCGGTCGAGGCCCAGCAGGTGCGGCAGCACATAGCCGGCGGTGTGACCGCGTTGCATGTGACGGCTGGTTCGTTCCGGGTGCTGGCGGAGGAGTCCCCGGAGTGTTTCCGGGGTCTGCGTCAGGTGCTGACGGGCGGTGACGTGGTGCCGGTCGCCTCGGTGGCGCGGGTGCGGGAGGCGTGCCCGGAGGTGTCGGTGCGGCATCTGTACGGGCCGACGGAGACGACGTTGTGCGCGACCTGGCACGAGTTGCGTCCCGGGGAGGTGCTGGGGGAGGTGCTGCCGATCGGCCGTCCGCTGCCGGGGCGGCGGGTGTTCGTGCTGGACGCCTTCCTGCAGCCGGTGCCGCCGGGTGTGACCGGTGAACTGTACGTATCCGGTGCGGGGTTGGCGCGCGGTTACTGGGGCGGGCCGGGCCCGACGGGCGAGCGGTTCGTGGCCTGCCCGTTCCTCCCGGGCGAGCGGATGTACCGCACCGGCGACCTGGTCCGCTGGACCGAGGACGGTGAGCTTCTCTTCGCCGGCCGCGCGGACGAGCAGGTGAAGATCCGGGGCTACCGTGTCGAACCCGGCGAGGTGGAGGCGGTGTTGGCCGCCCACCCGGCTGTCGCGCAGGCGGTGGTGGTGGCACGGGAGGACCGGCCGGGTGAACGCCGGCTGGTCGGCTACGTGGTGCCCGAGGGGCCCGAGGGCGTGGACCCGCAGGCCGTGCGCGAGCGGGCCGCCGCGGTGCTGCCGGAGTACATGGTCCCGGCCGCCGTCCTCGTCCTGGACGCGCTTCCGGTCACGGAGAACGGCAAGGTGGACCGCAAGGCGCTCCCCGCCCCGGAGTTCGAGGGCGCCGCCGCCGGCCGTGAGCCGCGGACGGTGGCGGAGGAGACGCTGTGCCGGCTCTTCGCCGAGGTGCTGGGTGTGGAGCGGGTCGGCGTCGAGGACAGTTTCTTCTCGCTCGGCGGGGACTCGATCATGTCGATGCAGCTGGCCGCCCGTGCCCGCCGCGCGGACCTGCTGTTCAAGGCCCAGGACGTCTTCGAGCACGAGACCCCCGCCGGACTGGCTGCCGTCGCCCGCAGCGCGGCCCGGGAAACCTCCGCGCCCGGCGCGGACACGGGCACGGGCGCGGGCGAGGTGCCCTGGACACCGGTGATGCGGGAGCTCGGTGACCGGGCCGCCCGG
CCCGGCCTCGCGCAGTGGGCGATCGTCGGGGCACCGGCCGGACTGCGGCGCGACGCCCTCGTGGCCGCTGTCGGCGCCGTGCTGGACACGCACGACATGCTGCGGGCCCGGGTGAGCGGGGACGGGACGGAGCGCGTGCTGGCCGTCGGCGAGCGCGGCTCGGTGGACGCGGCCGGGCGGATCGGCCGGCTGGACGCCACCGGGGCGGCGGACGGCGATCTGGACGGCATCGCGGGCCGCGCGGCGCGGGAGGCCGCGGAAGGGCTCGACCCCACGCTCGGAGCGGTCTTCCGGGTGGTGTGGGTGGACGCCGGCCCCGACCGCCCCGGCCGGCTCGTCCTGGCGGCACATCACCTCGTCGTGGACGGCGTCTCCTGGCGCATCCTGCTGCCCGACCTGGAAGCCGCGTACGAGGCCGTGGCGGCCGGGCGCGAGCCGTCGCTCGACCCGGTGGAAACACCGTTCCGCCGATGGGCCGGCCTCCTCGCCGAGCAGGCGGTCTCGGAGGAGCGGACCGCCGAACTCGATGCCTGGACCGCCCTGCTCGGAGAGGAGACGGCCCCCGTCGGGCGGCGCGCCCCGGACCCGGCGCGGGACACCGCCGCGACCGTGCGCCGCCGCTCGTGGACGGTGCCCGCGGAACAGGCCGGGGTGCTGGCGGGCCGGATGCCGGCGGCCTTCCACTGCGGGGTGCGCGAGGTGCTGCTGGCCGCCCTGGCCGGAGCGGTCGCCCGCTGGCGCCCGGACACCGGCTCCGCGACGCTGATCGACATCGAGGGGCACGGCCGCGAACCGCTGGAGGGAGCCGACCTGTCCCGCACCGTGGGCTGGTTCACCAGTACCCACCCGGCCCGTCTGGACCCGGCGGGCGTCGACCTGGACGCCGTGCTCACCGGCGGTCCCGCGGCCGGCGAACTGCTGAAGACCGTCAAGGAGCAGCTCCGGGCGGTGCCCGGCGACGGACTCGGCCACGGACTGCTGCGCCACCTCAACCCCGGCACCGGGCCCGCCCTGGCCGCCCTGCCCGGCCCGCAGATCGCGTTCAACTATCTGGGGCGGTTCGCCGCCGGACCGCGGGCCGGCGAGGACACCGCGGTCTCCGCCTGGCAGATGGCCGGTGACGCGGCGATCGGCGGCTCCGTCGACCCGGACATGCCCGCCCGGCACGCACTGGAAGCGGGCGCGGCCGTCCTGGACACCGCCGGGGGACCGGAGCTGACACTGACGCTGAGCTGGCCGGAACAGGTGCTCGACGAGACGGAGGCGGACCGGCTGGGCCGGCTCTGGCTGGACCTGCTCGCCGGCCTGGCCGCGCACACCGCCGACCCCGCCGCGGGCGGGCACACCCCCTCCGACTTCCCCCTCGTGGATCTCGCCCGGGAGAGCGTGGAGCGGCTGGAGGCCGCGGTGCCGGGCCTGGTGGACATCTGGCCGCTCTCCCCGCTGCAGGAAGGGCTGCTCTTCCACGCCGGCTTCGACGACCGCGGCCCGGACCTCTACGAGGGGCAGCGCGTCCTGGCCCTGGACGGGCCGCTCGACGCGGACCGGCTGCGGTCCGCGTGGCGGACGCTGACGGACCGGCACCCGGTGCTGCGGGCGAGCTTCCACCGCCTGGAGTCCGGCGAGGCCGTGCAGGTCATCGCGGGGGAGGTGGAACTGCCCTGGCACGAGTCCGATCTCTCCGGGCTGCCGGAGGACGAGGCGCGGGCGGGCCTGGACCGGCTGGTCCGGGAGGAGCGGGCCCGGCGGCTCGACGTCACCCGGGCACCGCTGCTGCGCCTGCTGCTGGTCCGGCTCGGCCGGGACCGGCACGTCCAGGTCGTCACCAGCCATCACATCGTCACGGACGGCTGGTCCCTGCCGGTGATCATCGGTGAACTGTCCGTGCTGTACGAGGCGGGCGGCGACGACGCCCGGGCCCTGCCGCCGGCGACCTCGTACCGGGAGTACCTCGCCTGGCTGGAACGGCAGGACAAGGAGGCGGCCCGGGAGGCGTGGCG
CGCGGAGTTCGCCGGCCTGGACGAGCCGACGCTGGCCGTGCCCGGGGACGCGGCCCTGGCCCCGGCGGTGCCGGACCGCGTCCCGTTCGCGTTCCCGGAGGACCTCACCCGCGCCGTGGACGCCCTGACGCGCGCCACGGGCTGACCGTCAACACGGTGGTCCAGGGCGCGTGGGCGCTGCTGCTGGCGCGGCTGGCGGGCCGGACGGACGTGGTGTTCGGCGCGACGGTGGCGGGGCGGCCCGAGGAACTGCCGCGCGTGGAGTCCATGGTCGGCCTGTTCCTCAACACCCTGCCGGTGCGGGTCGACCCGGCGGGGGAGGAGTCCGTCGCCGCGATGCTCACCGGCCTGCGGGACCGGCAGGTCGCGCTGATGTCCCATCAGCACGTCGGTCTCCCCGAGATCCGCCGGCTCGCCGGTCCGGGTGCCGTCTTCGACACGCTCGTGGTGTACGAGAACTACCCCCGCCCCGCTCTCCGGGAGCCCTCGCCCGGCACGCTGACCATCCGGCCCGGCGGCAAGCCGGAGGACACCGGCCACTACCCGCTGACGCTGATCGCGGTGCCCGGCGAGCGGATGCGCGGCGAACTCGTCTACCGGCCCGACGTGTTCCCGCGCGCCTGGGCCGAGGACCTGGTGGCCTCGCTCGCCCGGGTCCTGGAGCAGATGGCCGCGGACCCCTCCGCGCCCGTGGCGCGGGTGGGCGTACTGGGGCCGGAGCAGCGCACGCTCACCCTGGACACCTGGAACCGGACCGCGGCGCCGTCCGCCGCCGCTCCGCTGCCGGAGTTGTTCCGCCGGCAGGCGGAGCGGTCACGGGACGCGGTGGCCGTGGCGGACGGCGAGCGGACGCTGACCTACGGCGAGCTGGAGGCCGGGACGAACCGGCTGGCCCGTCATCTGACCCGTGCGGGCGTCGGCCCGGAGGACGCGGTGGCCGTCATGGTGCCGAGGTCCGCGGCGCTGGTGACGTCCGTGCTGGCGGTGTCCGCGGCCGGCGGGGCCTTCGTACCGGTGGACCCGGCCCACCCCGCCGAGCGCATCGCGTTCGTGTTCCGCGACACGGAGCCGGCGGTGGTGGTGTGCACCCGGGAGACCCGGGAGGCGGTACCGCCGGACTTCCCGGGCCGGCTGGTCGTCCTGGACGACCCGGAGACCGCCGGGGCCGTCGCCGCCCGCCCGCCCGGCCCGCTGTCGGACGGGGAGCGCCGCGCACCGCTGGACGTCCGCAACGCCGCCTACGTCATCCACACCTCCGGATCGACCGGTGTGCCGAAGGGTGTGGTGGTGTCCCACCGGGGACTGGGCAACCTGGCCCGGGCACAGATCGAGCGGTTCGCGGTGGAACCCGGCTCCCGGGTGCTCCAGTTCGCCTCGCTCAGCTTCGACGCGGCCGTCTCCGAACTCTGCATGGCGCTGCTGTCCGGCGCCGCACTGGTGCTGACCGGCCCGGAGGGTCTGCCGCCGCAGGTGCCGCTGGGCGAGGCGCTGCGCGCGACGGGTGCCACCCATGTGACGGTGCCGCCGAGCGTGCTGGCCACGGAGGAGGAGCTGCCCGGCGGCCTGGAGACCCTGGTGGTCGCCGGCGAGGCCTGCCCGGCGGCCCTGGCGGACCGCTGGTCCGCCGGGCGGCGGATGGTGAACGCGTACGGCCCCACCGAGGTGACGGTGTGCGCGGCGATGAGCGCGCCCCTGTCACCCGGCGGTGCCGAGGTGCCGGTCGGACGCCCGATGGCGAACACGCGGGCCTACGTCCTCGACGGCTTCCTGCAACCCGTACCGCCGGGGGCGGTCGGCGAGCTGTACGTCACCGGCCCCGGACTGGCCCGCGGCTACCGGGGGCGCCCGGACCTGACGGCGGAGCGGTTCGTCGCCTGCCCCTTCGTCCCCGGGGAGCGGATGTACCGCACCGGCGACCTGGCGCGCTGGACCGGGGACGGCGAACTCGTCTTCACCGGCCGCGCCGACACCCAGGTCAAGGTGCGCGGCCACCGGATCGA
ACCGGGAGAGGTCGAGGCGGTGCTGTCCGCGCATCCCGGGGTCGCCGGGGCCGTGGTCGTGGCGCGCCGGGACGGCCCCGGCGGCGACCGCCTCGTCGGCTATGTCGTCCCCGCCCCGCCCCGGCCGGGCGACGGCCCCGCCGAGGCGCGGCCCGTCGAGGAACTGCTCGGCGCGCTGCGCGAGTTCACCGCTGAGCGCCTTCCGGACCCCATGGTGCCGTCGGTGTTCGTACCGCTGGACCGGCTGCCGCTCACCGCGAACGGCAAGGTGGACCGCCGGGCCCTGCCGGCCCCCGACTACGACGGGAAGGTTTCCGGGCGGGAGCCGCGGACGGCGGCCGAGACGGTGTTCTGCGACCTGTTCGCCGAGGTGCTGGGGCTGGAGCGGGCCGGGGCCGACGACAGCTTCTTCGAACTGGGCGGCGACTCCATCTCCTCGATGCAGCTCGCGTCCCGCGCCCGGCGCGCCGGCTATGCGGTGACGCCACGGCAGGTCTTCGAGGAGAAGACGCCCGAACGGCTGGCCGCCGTGGCCGGACCGGCCGGCGCCGCCGCGGAGGACGTGGACGACATCGGCACCGGCGAGGTGCCCCGGACCCCCGTCATGCTGGCACTCGGCGAACGCGCCCTCCGGCCCCGGTTCGCGCAGTGGGCGGTGGTCGGCGCACCGGGCGGCCTGGGACGCGAGGTGCTGACGGCCGGCCTGGTCGCCGTCCTCGACCGCCACGACATGCTGCGGGCCCGGGTGGAGACCGGCGACGACGGGGAGCCGCTCCTCGTGGCCGGCGAGCCCGGCACGGCCGACCCCGCGGACCTCGTCACGCGGGTGGACGCCGGCGGCGCGGACGACGGCTCGCTGGACGCCGTCGCCGGCCGGGCCGCGCGGGAGGCCGTGGAACGGCTCGACCCCCGCGCGGGCGTAATGCTGCAAGCCGTCTGGGTCGACGCCGGGCCCGGACGGACCGGGCGGCTGGTCCTCGTCGTCCACCACCTGGCCGTCGACGGGGTGTCCTGGCGGGTGCTGGTACCGGACCTGGCCATGGCCTGCGAGGCCGCGGCGGCGGGACGCGAACCGGTGCTCGACCCCGTCGGAACCTCCTTCCGGCGGTGGGCGAACCTGCTGGCCGCCCAGGCCGGCGACCCCGGCCGGGTCGCCGAACTCCCGGCCTGGAAAGCCCTGCTGGGCGATCCTGACCCGCTGCCGTTCACCCGCGCGCTGGACCCGGCCCGGGACACCGCGGAGACCCTGCGCCGCCGGTCCTGGACGGTTCCCGCGCGAGAGGCCGCCATCCTGGCCGGCCGCACCCCGGCCGCGTTCCACTGCGGGGTGCACGAGGTACTGCTGGCCGGGCTGGCGGGCGCCGTGGCACGGACCCGGCAGGACGGCCGCACCGCCGTCCTGGTGGAGGTGGAGGGCCACGGCCGGGAACCGGTCGAGGGCACCGATCTGTCCCGCACGGTGGGCTGGTTCACCAGTACCCACCCGGTCCGGCTCGACGCCGCCGGTGTGGACCTCGCCGGCGCGGCCGCGGGCGGCCCCGCGGCCGGCGCGCTGCTGAAGGCCGCCAAGGAACAGGTGCGGGCGGTGCCGGGTGACGGGCTCGGCCACGGGCTGCTGCGCCATCTCAACGCCGGGACCGGACCGGTGCTGGCGGCCCTCCCCGGCCCGCAGATCGGCTTCAACTACCTGGGGCGCTTCACCGCCGGCAGCCGCCGGGGCCCGGTCGGCCCCTGGCAGATGGCCGGTGACACGGCCATCGGCGGCTCGGCCGACCCCGGCATGCCCTGCGAGCACGCTCTCGAAGCGGCCGCGGCCATCGTGGACACTCCCGTGGGCCCGGAGCTGACGCTTACGCTGAGCTGGCCCGCCGCCGTGCTGGACGAAGCCGCGGCGGAGCGACTGGGCCGGGCATGGCTGGACCTGCTGGGCGGCTTGGCCGCCCACACCGCCGATCCCGCCGCCGGCGGCCACACGCCCTCCGACTTCCCCCTCCTCGACC
TCGCGCAGAACCAGATCGAGGAGCTCGAAGCCGGGCTCGCCGATGAGAAGGCACAGCCGACGCACCGCAAGCTCTGGTGAGGAGAGACACGATGACCCGATCCCCTGTCGAGGACGTGTGGCCGCTGTCGCCGCTGCAGGAGGGGCTGCTGTTCCACGCCGCCTTCGACGACCGGGGCCCCGACGTCTACACCGTCCAGTCCGCCCTCGCGCTGGAAGGCCCGCTGGACCCGGGGAGGCTGCGGAGGTCGTGGGAGGCGCTGCTGGACCGGCACGCCGCGCTGCGCGCCTGCTTCCGCCAGGTGAGCGGGGCACAGATGGTGCAGGTCATCGCACGGGACGTGGCGCTGCCCTGGCGCGAGGAGGACGTGTCGGGGCTCCCCGCGGCCGACGCGCTCGCCGCCGCGGACCGGCTGGCGGAGAGCGAACGGGCGGAGCGCTTCGACCCGGCGGTGGCGCCCCTGCTGCGGCTGCTGCTCGTCCGCCTCGGCGAGAACCGTCACCGCCTGGTGATGACCAGTCACCACATCCTCATGGACGGCTGGTCGGCCCCCGTCCTGATCGGAGAGCTCTCCGCGGTCTACGCGGCGGGCGGCGACGCCTCCGTACTCCCCGGCACCACCTCCTACCGCGAGTACCTGGCCTGGCTCAACCGGCAGGACAAAGAGGCCGCGCACGCCGCCTGGAAGGCGGAGCTCGCCGGGGCCGGCGAGCCGACGCTGGTCGCCCCCGCCGTCCCCGACCGGCTCCCCGTCTTCCCCGGGAGCGTCAGCGGCGACCTACCGGAAGCGCTGACCCGCGGCCTGGCGGAGCTGGCCCGCACGGCGGGCGTGACCGTCAACACCGTGGTGCAGGGCGCCTGGGCGCTGGTGCTGGCGCGCCTCGCGGGCCGTACGGACGTGGTCTTCGGCGCCACCATGGCGGGGCGTCCCCCGGAACTGCCCGGGGTCGAATCGATGGTGGGGCTGCTCATCAACACCCTGCCGGTACGGGTACCGCTCGACGGCGCCCAGCCGGTGCGGGAGATGCTCCAGCGGCTCCAGGACCGGCAGTCCGCGCTGATGGCCCACCAGCACCTGGGCATCCTGGAGATCCAGAAGACCGCGGGGCCGGGAGCGGTGTTCGACACGCTCCTGGTGTACGAGAGCTTCCCCCGTCCGCCCGCCGCACCGGCACCGGACCGGGACGCCCTGGTCATCAGGCCCGACGGGTTCTCCCGCGAGGCGGCGCACTACCCGTTCACCCTGGTCGTCGCGCCCGGCGACCGGATGCACCTCAAGCTCGAACACCGGCCGGACCTCTTCGACCGCGCCACGGCCGAGTCCGTCCTCCGCGCACTGACGCGGGTGCTGGGACGGATGGTGGCGGAACCCTCCGCGCCGGTCGGACGGATCGGAGTGCTCGACGGACCCGTGCGGGGCACCGCCCGGGAGGAGCGCGGCGGGGCGCCGGTGGCGCCGGGGCCGTCGGTGCCGGAGCTGGTGGCGGGGCGGGTGCGGGAGGCGCCGGGCGCGGTGGCGGTGGTGGAGGTTGAACGGTCGCTGACGTACGGGGAGTTGGACGGGGCGGCGGGGCGCCTCGCGGGGTATCTGTCCTCGCTGGGTGTGGGGCGTGGTGACCGGGTCGCGGTGGTGATGGAGCGGTCGGCGGATCTGCTGGTGACCCTGCTCGGGGTGTGGAGGGCTGGGGCGGCCTACGTGCCGGTGGATACGGGTTCTCCGGTGGAGCGGGTGGCGTTCGTGCTGGCCGATGCCGCTCCGGTGGTGGTGCTGTGCACGGAGGCGACGCGGGGCGCGGTGCCGAAGGATGCCGCCGCGCGGACGGTGGTCCTCGACGATCCCGGGTCGCTGTCCGAACTCGCTGCGCACAAAGGTGAGGTGGCGGCCGAGGTGAACCCCGGGGACGTGGCGTATGTGATGTACACGTCGGGTTCGACGGGGGTGCCGAAGGGTGTGGCGGTGCCGCACGGCGGTGTGGCGGCGCTGGTCGGCGAAGCCGCCT
GGTCGGTCGGGCCGGACGATGCGGTGCTGATGCACGCCCCGCACGCGTTCGACGCCTCGCTGTTCGAGGTGTGGGTGCCCCTCGTCGCCGGTGCGCGGGTGGTGGTCGCCGAACCGGGCGTCGTGGACGCCGGGCAGGTACGCCGCCATGTGACCGGCGGTGTGACCGCGTTGCATGTGACGGCTGGTTCGTTCCGGGTGCTGGCGGAGGAGTCCCCGGAGTGTTTCCGGGGTCTGCGTCAGGTGCTGACCGGTGGTGATGTCGTGCCGCCGGGGGCGGTGGCGCGGGTGCGGGAGGCGTGCCCGGAGGTGTCGGTGCGGCATCTGTACGGGCCGACGGAGACGACGTTGTGCGCGACCTGGCACGAGTTGCGTCCCGGGGAGGTGCTGGGGGAGGTGCTGCCGATCGGCCGTCCGCTGCCGGGGCGGCGGGTGTTCGTGCTGGACGCCTTCCTCCACCCGGTGCCGCCGGGCGTGACCGGCGAACTGTACGTATCCGGTGCGGGGTTGGCGCGCGGTTACTGGGACCGGCCGGGCCCGACGGCCGAGCGGTTCGTGGCCTGCCCGTTCCTCCCGGGCGAGCGGATGTACCGCACCGGCGACCTGGTGCGCTGGACCCGGGACGGTGAGCTTCTCTTCGCCGGCCGCGCGGACGAGCAGGTCAAGATCCGCGGGTTCCGCGTGGAGCCCGGCGAGGTGGAGGCGGCCCTGGCCGCGTACCCGGGCGTCGCCCAGGCTGTGGTGGTGGCCCGTGACGACGGCCCGGGTGAGCGCCGGCTGGTCGGCTACGTGGTGCCCGAGGGGCCCGAGGGCGTGGACCCGCAGGCCGTGCGCGAGCGGGCCGCCGCGGTGCTGCCGGAGTACATGGTCCCGGCCGCCGTGCTGGCGATGGCCGCGCTCCCCGTGACCGCCAACGGCAAGGTGGACCGCAGGGCGCTGCCCGCCCCCGACTTCGCCGAACGGGTCTCCGGCCGTGCGCCCCGGACCGCCGTCGAGGAGACGCTGTGCCGGCTCTTCGCCGAGGTGCTCGACCTCGAACGGGTGGGCCCCGACGACAACTTCTTCGACCTGGGCGGCGACTCGGGGCTGGCCATGCGGCTCGCCGGCCGGGTCCGCGAGGAGTTCGGCGCCGAGCCGGCCGTCCGCCAGTTCTTCGGCTCCCCGACCCCGGTCGGTGTGGCCCGGCTGCTGGCCACGAAGGCCCGCCCCGTGCTCGAAGCGGCCGCCCGGCGGGAGGACGTCCCCGTCACCGCGGGCCAGTTGCGCACCTGGCTGATGTCCCGGCTCGGTGACGAGGCGGGCGTGCACCGGATCCCCGTCGCGCTGCGCCTCGGCGGCGATCTGGACCACCGGGCGCTGTGGGCCGCGCTGGGGGACGTCGCGGCGCGGCACGAGATCCTGCGGACGACCTTCGACGGAACCCGGGGCGGTGACCTGCGCCAGCGCGTCCTGGACGCCGACGCCGCGCGCCCCGCCCCGGCCGTCACGGCGGCGACCGAGGAGGAGCTGCCGGACCTGCTGTCCGCCCACGCCGCGCACGCGTTCGATCTCAGCCGTGAGACACCGTGGACCCAGCACCTCTTCGCGCTGTCGGACACCGAGCACGTCCTGCTCCTGGTGGTGCACCGGATCGCCGCCGACGACGCGTCCGTGGACGTCCTCGTCCGCGACCTGGCCACCGCCTACGGCGCGCGCCGCGAGGGCCGGATGCCCGAACGGGCCCCGCTGCCCGTGCAGTTCTCCGACTACGCGCTCTGGGAGCGGGAGCTGCTCCGGGGCGAGCGGGAACCGGAGAGCCTGGTCAACGACCAGCTCGGGTACTGGAAGGACACCCTGGCGGGTCTGGACGCCGAGCTGCCCCTGCCGGCCGACCGGCCGCGGCCCTCCGTGGCCTCCCACCGGGCCGGCTCCGTACCGCTGCGCATCGGCGCGGATCTGCACACCCGCCTGGCCGACCTGGCCGACGACGCCGGCACGACGACCTTCACGGTGGTGCAGGCC
GCGCTCGTGACGCTGCTCGCCCGGCTCGGCGCCGGCACCGACGTCACCGTCGGAACGGTGATCCCGCGCCGCGACGAGGCCGGTCTGGAAGGGCTGGTGGGACCCTTCGCCGGGCCCCTGGCGCTGCGTACGGACGCCTCCGGCGATCCCGCCTTCCGCGACCTGCTCGGCCGGGCGCAGGCGGGCGGCCAGGAAGCGCGCGAGCACCGGGACGTGCCGTTCGAGCGCGTGGCGGACGCGCTGCGGCTGCCTCCCTCGCTGGCGCGCCACCCGGTGTTCCAGGTCGTCCTGGAACTGGACGACAGTGTCGAGGAGGCGTGGGACCCCTGGGAACTGCCCGGGCTGCGCACCAGCCGCCTGGACGTGGACCCCGAGTCCACCGAACTCGACCTGTCGGTCGTCCTCACCGAGCTGTACCGGCCCGACGGGGATCTCGGCGGCATCGAGGGCCGGCTCCGCTACGCCGCGGAACTCTTCGACCGGGCCACCGCGGAGGAGCTGGCGCGGCGGCTGACGACGGTCCTGGAGCAGGTGGCGGCCGACCCGGACCGGCGGCTGAGCGCGGTGGACGTCCTGCTCGGCGCGGACGAGCACCGGCGGCTCCTGGAGACGGGGCACGGTGCGGCGGCGGACGTTCCGTGGCCCACGGTCGTGGCGGCTGTGGCCGCGCAGGCCGCACGGACCCCCGGCGCCGTCGCCGTCAGCGGACCGGACGGCTCACTGACGTACCGTGAGCTGCGCTCCGCGACGGATCTGCTCGCCCGGCGGCTGACCGCCCTGGGCGCCGGTCCGGACACCGCCGTCCTGGTGGCGCAGCCCCCCGCCACCGCGCTGGTGGTCGCGCTCCTCGCGGCGTGGGAGTCCGGGGCCGCCTGCCGCCTGGCCGATCCGACGCGGCCCCTGGACGGCGTGGACCCGGGACCCGGACGGATGCCGATCGCGGCCCTGGTGTGCGACGCGGCGCGGGGCGGGCGGACACCGGACGGCCCCGGCGTGCCCGTCGTGGCGACCGGCGGTCCGGCCCCGGCCGGACCGGACGCCGCGCCCGGCAGCGGCACACCCGGCGCAGCCGCGGCCGGCGGTGGCGCGCCCGGCGCCGCGGCGGCCGGGCCCCTCGCCGCCGTGCCGGACCGCGCGCCGCCGCTGCCCGGTCACCCGGCGCTCCTCCTTCCCGGCCCGGCCGGCGCGGACCTGGTCGTCGAACACCACACCCTGGCCGGCCACGCGGCGCACCGGGCGCGGGCATCGGCGGTGGCCGGCTCGGAGACGGTGCTCGACACCCGCGCTCCGCTCCCCCTGCTGCTCGTCCCGCTGCTCGCGGCGCTGTGCGCGGGGGGCAGCGTCCGCCTGGGCCTGCCGGACGGGGACCGGCAGCAGCCAGCCGCCGGAACCCTCACGAGCCCGGAACGCGCACGGCGGCTGCTGGTCACCACCCGTGCGCTGCTGCCGAGCGCGCTGCCGGAGCCGCCGGACGGCGGACCGTCCGCGTCCGTACCCGGCGCGGGCGGCCCGGGAGCGGGTGAACGGCCCGTCCCCGCAGCGGAGTTCGCGGAGGCCCTGGTCATGGACGCGGGCGGGCCGGCGGAGGCGGACGGCGCCCCGGACGCGCCCGGGCGCCCCCACGGGGCCGTCACGGTGTCCTGCCACGGCGCCGCGGAGACGGGCGGCGCCTGGCTGGAGAGCCGCACCGGCCCGGGCGAGGCCGCCCCGGCGGACCTCCGGGCCGGCCGGCCGGTGGCGAACACCCGAGCCTACGTGCTCGACGACCGCCTCCGGCCCGTACCGCCGGGCGCCACGGGCGACCTCTATCTGGCGGGAGTCCCGGTGGCGCGCGGCTATGCGGACCGTCCGGGCCTCACCGCCGGACGGTTCACCGCCTGCCCCTTCGGCCCGCCGGGGGAACGGATGTTCCGCACCGGCGAGCGGGCCCGGCGCACCCGCACCGGGCTGCTCGCGGTGAGCTCCGCGGACGCCGGCCGGGAGCGTGCGGCCGGAGCCCG
CCGTGCCGGCGGCAGCCGCGGCGACCTGGGTGTGCTGCTGCCGCTGCGGCCCGGGGGCAGCCGCCCGCCGCTGTTCTGCGTCCATCCCGGCATGGGCCTGAGCTGGGGATACGGCGCTCTGCTGCCGTATCTCCCGGCCGACCTGCCGGTGTACGGGGTGCAGGCGCGGGGACTCGCGCGGCCGGAGCCGCTGCCGGGCAGTGTCGAGGAGATGGCCCGCGACTACGCGGACGAGATCCGCTCCGTGCAGCCGTCCGGCCCCTACCACCTCCTGGGCTGGTCCATCGGCGGCGTCATCGCCCAGGCCGTCGCCGTCCGGCTGGAGGAACTGGGCGAGGAGGTGGCGCTGCTGGCGCTGCTCGACGCCTATCCCGGCAGCGCCGCCACGTCCCGCTTCCGGAACGGGGACGGGCGGCGGGAGGACGGCTACTCCGTGCTCCGGGACGGCGGGGAGGGCGCCATGGCCGACCTCTACCGCTCCACGGGCCTGAGCGACCGGGCCCGGGCGAACCTGGAGAAGGTGCTGCGCAACATGTCCGGTTTCGCGCCGGACCACACCCCGCGCCGCTTCGGCGGGGATCTGCTGCTCTTCGTCGCCACCGCCGACCGGCCCGGTGAACCGCCGGTACGGCAGGCGGTGGAGAGCTGGCGTCCCCACATCGGGGGCGGAATCGAGCCGCACGAGGTGCGGGCCGGCCATTACGACCTGCTGCGGCCCGCGCATCTGCCGGGCATCGGACACGTCGTCACGGAAAAGCTCCGGGCGGCTGAGGAAAAGACATCGGAAAGGACCGAGTCATGACCAATCCGTTCGACAACGAGAACGGCACCTTCCTGGTTCTCGTCAACGACGAGGGGCAGTACTCGCTGTGGCCCGCGTTCGCCGAGAAGCCCGAGGGATGGACGGTCGTTCACGAGGAGGGCAGCCGCCGGGAATGCCTCGAATTCATCGAGGAGACCTGGACGGACATGCGCCCCAAGAGCCTGGTCGAGGAAATGGACCGGCAGGAGACCGCGGCACCCTGACAACAGCCTCTGACGCAACCCGGACAACGCCCCGGCGGACACCACAGGGCGGCCGCACCGGTGGCCGTCTCCTCCGGAAAGGGAACAGGTGGACACCGAAGCACTGGTGACCGTGGCCCTCGGCGACGTCGCGCTGATCGTCATCGCCTCGCGGCTGCTCGGGGCGGCGGCGCGCCGGTGCGGCCAGCCGGCCGTCGTCGGCCAGATCGTGGCCGGCATCGCCCTGGGCCCCACCCTGCTCGGCCGGCTGCCCGGCGATCCGACCGCCCGGCTCTTTCCCCCGGACGTGCTGCCGTTCCTCACCGTGCTGTCCCAGATCGCCATCGTCCTCTTCATGTTCGTGGTCGGCTACGAGACCGACCGGCGGCAGCTCCGCCGGGGCGGCGGGGCCGCGGCGGCCGTGGCGCTCGCGGCGCTGCTGGTCCCGGCGGCACTGGGGGCGGGCGTGGTCGGGCTGTTCCCCGGAGCGTTCTCCGCGGTGCGGCCCCAGCACGCGGACGGCCGGGTGTTCTGGCTCCTCATGGCCGTGGTCATGTCGGTGACCGCTCTTCCCGTCCTCGCCGCCATCGTCCGGGAGCGCGGCCTGGCGGGCACCCCCGCGGGCACCGTGGCCACGAGCGCCGCCGGGCTCATGGACGTCGCCGCCTGGCTCGTCCTGGCCGCGGCGCTGGCCGGCACCGGGCATGCCACCGCCCGGTCGTGGCCGGTGACCCTGCTCCTGCTGTCCCTCTTCACGGCCGCGCTGTTCCTGCTGGTCCGCCCGCTGCTCGCCCGGTGGCTCGAACGGTCCGGCGCCCTGGCGGCGCATCAGCTGACCATCGCGCTCGGTCTCGCGCTGGGCAGCGCCTGGGCCACCGCCGAGCTGGGGCTGCACCCGGTGTTCGGCGGTCTGCTCGCGGGCCTCGCCATGCCCCGTCCGGGCGGGGTGCCGGACGCCGGTGTGCTGCGGCCGATGGAGCAGACGGCCGGACTGC
TGCTGCCGTTGTTCTTCGTGACGACCGGGCTGTCGTTCGACATCGGCTCGCTGGACGCCGACGGCGGGATCCTGCTGGCGCTGATCCTGGCGGTGGCCGTCTCGGGGAAGCTCCTCCCGGGGTACGCGGCCGCCCGGATCAGCGGCATGGACCCGCCCCAGTCGGCCGTGGTCGCCGTCCTGGTGAACACCCGCGGGCTCACGGAGCTCATCGTGCTCGACGTGGCGCTCGACGCCGGGGTCATCGGGCCCGGGCTCTTCACCGTGCTCGTGCTCATGGCCCTGACCACCACCTTCATGACGGGCCCGCTGCTGGCCCTGGCCGGCCGCCGGTGGGGGTTCCCGCCGCCACCGGCGGAACATCCCGGTAAACATCCGCGGAGGAATTCCGTGAAGCGCCGGGAGAGTCTTCACGGGAAACGCCGGGAGGGCGCCGCGGAAAGCTCCTGACGGCCCGTTCCGGCGGCGCGGACGGAAGATTTCTCCGGAATCCGGAGACGTGGTCTGCGGCGCCGTGACGCGGAATATCGGAATCCGCCCGGTGACCCGGTTGTCCGGTGATTCCGTGATCCCGTGCTTCCGTGATTCCGCCTCGCGCCGGCCGGTTGGACACCGCTTGACTCCGGCGCGGGGCGAAATCAGACTCGATCCGGCGACCGGAAATGCGGGAACCAGAAATACGGGGGCGGGAATGACCCGCCGGGTGACAAGGGGAGGAGGGATGGCCTTGCCGGATTCCGAGGAATTCGATGTGGTGGTCGTCGGTGGAGGGCCCGCCGGATCGACGCTGGCCGCGTTGACGGCCATGCAGGGACACCGGGTGCTGGTCCTGGAGAAGGAGTTCTTCCCCCGTCACCAGATCGGGGAGTCGCTCCTGCCGGCCACCGTGCACGGCGTGTGCCGGCTGACCGGCGTGGCGGACGAGCTCGCCGCCGCGGGCTTCCCGCGCAAGCGCGGCGGCACGTTCAAGTGGGGCGCCAACCCCGAGCCGTGGACCTTCTCCTTCTCCGTCTCCCCGCGCATGACCGGGCCGACGTCCTACGCCTACCAGGTCGAGCGGGCCAAGTTCGACGAGATCCTGCTCAACAACGCCCGCCGGGTGGGCGCCGAGGTGCGCGAGGGCTGTGCCGCCGTCGACGTCGTCGAGGACGGGGAGCGGGTCCGGGGCGTCCGGTACACCGACGCCGACGGCCGCGAGCACCGGGCGTCGGCCACGTTCGTCGTGGACGCCTCCGGCAACGGAAGCCGGCTGTACCGGCGGGTGGGCGGAACCCGGGAGTACTCGGAGTTCTTCCGCAGCCTGGCCCTGTACGGCTACTTCGAGGGCGGCAAGCGGCTGCCGGAACCGAACTCGGGCAACATCCTGTCGGTGGCGTTCGAGAGCGGCTGGTTCTGGTACATCCCGCTGAGTCCGGACCTCACCAGCGTCGGTGCCGTGGTCCGCCGGGAGATGGCCGGCAAGATCCGGGGCGACTCCGGCAAGGCGCTGGCGGCGCTCATCGCCGAGTGCCCCCTGATCTCCGAGTACCTGGCGGACGCGCGGCGGGTCACCGAGGGCCCGTACGGGAAGCTCCGGGTCCGCAAGGACTACTCGTACCACCACACGACCTTCTCGCGGCCCGGCATGATCCTGGTCGGCGACGCTGCCTGCTTCGTGGACCCGGTGTTCTCCTCCGGCGTCCACCTGGCCACCTACAGCGCCCTGCTGGCGGCCCGCTCCATCAACAGCGTGCTCGCCGGGCTGGTCGGCGAGGACCGGGCCCTGCGGGAGTTCGAGTCCCGTTACCGCCGCGAGTACGGCGTCTTCTACGAGTTCCTGCTCTCCTTCTACGAGATGCACCAGGACGAGAACTCCTACTTCTGGCAGGCCAAGAAGGTCACCCGGGCCAACCGCCCGGAGCTGGAGTCGTTCGTCGAGCTCATCGGCGGGGTCTCCTCCGGCGAGCGGGTCCTGACGGACGCCGAGGTGCTGGCGAAGCGCTTCAGCTCGGGCTCCGCGGA
GTTCGCCGCGGCCGTCGACGAACTCGCGGGCAGCGAGGACGGCAGCATGGTGCCGCTGTTCAAGTCCTCGGTGGTGCGCGAGGTCATGCAGGAGGGCGGCCAGGTCCAGATGCGCGCCCTGCTCGGCGAGGACGCCGAACCCGAGGCCCCCCTGTCCGCGGACGGCCTGGTGCCGTCCCCCGACGGCATGTTCTGGCTGCCCGCCCAGGGCACCGGCGAGTAGAGGGGGACCCGTGGCACGGCCGTCGGACGTCTCCCCGCACAACCGGCGCGACCGGTTGGACCCGCTGCCCGAACTCAGCCGGCTGAGCATCCGCGCACCGGTCTCCGAGGCCGTCCTCACCGAGGAGCCCGCCACCACCGGCTGGCTGGTCACCGGCCCCGAGGAGGTACGGGCGGTCCTCGGCGACGCGGACCGGTTCAGCACGGCCCTGGCCGCGGGCGGCGGACCCGGCGCCCGGCGGCCGGCCCAGCCGGGCAACCTCATCCAGTACGACCCGCCCGACCACTCCCGGCTCCGGCAGATGCTCACACCCGAGTTCACGGTCCGCCGGATGCGCGCCCTGGAACCGGCCGTCGAGGCCATCGTCGAGGACGCCCTGGACTCCCTGGAGAAGGACGGCCGGCCCGCGGACTTCATGCGGCACGTCGCCTGGACCGTGCCGGGCCTGGTGATGTGCGAGCTCTTCGGCGTGCCCCGCGACGACCGGGCCGAACTGGCCCGGGTCCTCAAGGTCAGCCGGCCGGCCTTCCGCGGACGGCGGCTGCAGGTCACCGCGGGCGCCAACTACCTCGCGTACATGGCCCGGCTCGTGGAGCGCAAGCGCCGCGAGCCCGGTGACGACCTGCTCGGCCGGGTGGTGCGCGAGCACGGCGCGGACACCGATGACGAGGAACTCGTCGGGCTCAGCGCCTTCGTGATGGGCTCCGGCGTCGAGAACATGGCCAGCATGCTGGGCCTCGGCATCCTCGCCCTGCTGGAGCACCCGGCCCAGCTCGCCCTCCTCCGTGAACGGCCCGGACTGATCGACGGCGCCGTGGAGGAACTCGTCCGCCACCTCTCGGTCATCCCGACCGCCTCGCCCCGCGTCGCCCGCGAGGACGTGAACCTCGGCGGCCGGACGGTCAAGGCGGGCGACCGCGTGGCCTGCTCCCTGCTCGCGGCCAACCGCGCACGCCGTCCGGGACAGCCGCCCGACCGCCTCGACATCACGCGCGAACCCACCGCCCATGTGGCGCTCGGCCACGGCGTCCACTACTGCGTCGGCGCGTCGCTGGTCAGGATGGAGCTCAGAGCCGCCTACCCGGCGGTACTGCGCCGCTTCCCCGCACTGCGGCTCGCGGTGCCCGCCGAGGAGATCCGCTTCCGTCCGCAGGCGCCCTACGGCCTGGAAACACTGCCCATCGCCTGGTAGGGGAGCCATGCCACCCACACCCACACCCACGCCCACGCCCACCACGCCCACCACGCCCACCCCGCCCGCCTACGCCCGCCGCGACCGGTTCGACCCCGCCGCGGAACTCCGCCGGCTGACCGCCGGGAGAACCGTCACCGCGATCGACGTCGGCCCGGGAACGGACGGGGTGCCCGTCTGGCTCGTGACCGGCCACGCCGAGGTGCGCCAAGTCCTCGGCGACCACCGCCGGTTCTCCACCCGCCGCCGCTTCGGGCCGCGCTCACCGTCCGGCCGCGCCGACGGCCCCCGGCCCGACGAGATGGCCGGGCAGCTCATGGACTACGACCCGCCCGAGCACACCCGGCTCCGGCAGATCCTCACGCCCGAGTTCACCGTGCGCCGGATGCGGCGGCTGGAGCCGCTGGTCGAGGGCATCGTCACGGAACGCCTCGACGCCATGGAACGCGGCGGGCCCCCGGCCGACCTGGTGCGGTCCTTCTGCTCGCCCGTGCCCGGCGCGGTGCTGTGCGAACTGATCGGGGTGCCCCGGGACGACCGCGGCGGCTTCCTGCGCCGCTGCCACGCGTTCCTCGCCCCCGGACG
CGGCCGGCAGCGGCGGGCGGCGGCCGGCGACGCGCTGTCGCGCTACCTCGCCGAGATGGTGCGGCGCGCCCGCAGGGACCCCGGCGACGGCTTCCTCGGCGCACTGGTCCGCGACCACGGCGACGAGATCACCGATCAGGAACTGCGCGGCGTCTGCGTCCTGCTGGTCCTCGCCGGCCTCGACAACGTCTCGGGCATGCTCGGCCTGGGCACCCTGCTGCTCCTGGACCACCCCGCCCAGCTCGCCGCCGTGCGCGACGACCCCGGAGCGGTGGACGGCGCGGTCGACGAACTGCTCCGCTACCTGACGGTGCCGCACGCCCCCACGCCGCGCACCGCCCTGGAGGACGTCACCGTCGGAGACCGGCTCGTCCGGGCCGGGGAGCACGTCATCTGCTCGCTGCCGATGGCCAACCGCGACCCCGCCCTGCTCCCCGAGCCCGACCGGTTCGACATCACCCGGGAGCCCACCGCCCACGTGGCCTTCGGCCACGGCGTCCACCACTGCCTGGGCGCGGCCCTGGCCCGGATGGAACTGCGGACCGCGTACCCGGCGCTGCTGCGCCGCTTCCCCCGGCTCGCCGTGGCGGTGCCCGGCGAGGAGGTTCCGTTCCGCGTCCACGCCCTCGCGCACGGCGTGGACCGGCTGCCGGTGACCTGGTGA TAG
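As a quick check on how the nucleotide sequence above relates to the halogenase protein given in the sequence listing (SEQ ID NO: 1), the following minimal Python sketch translates a 48-nt excerpt that appears in the cluster sequence. It assumes the standard genetic code and a reading frame that starts at the first `ATG` of the excerpt; the helper name `translate` is illustrative and is not part of any tool named in this document.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 0; a trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Excerpt copied from the cluster sequence above (assumed here to be an ORF start).
excerpt = "ATGACCCGCCGGGTGACAAGGGGAGGAGGGATGGCCTTGCCGGATTCC"
print(translate(excerpt))  # MTRRVTRGGGMALPDS
```

The translated excerpt corresponds to the first 16 residues of SEQ ID NO: 1 (Met Thr Arg Arg Val Thr Arg Gly Gly Gly Met Ala Leu Pro Asp Ser).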
(2) Protein functional domain analysis of the marine Streptomyces non-ribosomal polypeptide synthetase:
The sequences encoding the non-ribosomal polypeptide synthetase (NRPS) in the Complestatin-like Compound gene cluster of the marine Streptomyces were submitted to the online analysis system of the Pfam protein family database at the Sanger Institute, UK (http://pfam.sanger.ac.uk/search), and the protein functional domains of the NRPS were analyzed. Comparison with the NRPS of the Complestatin gene cluster of Streptomyces lavendulae (Figure 4) showed that the SinC gene of the marine Streptomyces NRPS lacks one methylation domain relative to the ComC gene of the Complestatin gene cluster. It is therefore preliminarily predicted that the marine Streptomyces NRPS differs from the NRPS of the Complestatin gene cluster of S. lavendulae and may synthesize a Complestatin-like Compound whose primary structure differs from that of Complestatin.
(3) Structure prediction of the product synthesized by the marine Streptomyces Complestatin-like Compound gene cluster
To further determine the structure of the Complestatin-like Compound encoded by the gene cluster of the marine Streptomyces, we submitted the NRPS-encoding sequences of the Complestatin-like Compound gene cluster to the online NRPS & PKS product structure prediction website http://dna.sherman.lsi.umich.edu/ and predicted the primary structure of the product that the cluster may synthesize. The prediction results were opened and viewed in ChemDraw (Figure 5).
The prediction shows that the amino acid order of the polypeptide synthesized by the marine Streptomyces Complestatin-like Compound gene cluster is Hpg→Trp→Ala→Hpg→Leu→Tyr→Hpg, which differs from the primary polypeptide structure of Complestatin (Figure 6), Hpg→Trp→Hpg→Hpg→Hpg→Tyr→Hpg. This suggests that the gene cluster of the marine Streptomyces may encode the synthesis of a compound whose primary structure differs from that of Complestatin.
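The difference between the two predicted residue orders can be made explicit with a position-by-position comparison. The sketch below is only an illustration in Python using the two residue orders quoted in the text; the function name `differing_positions` is hypothetical and is not part of the prediction website or ChemDraw.

```python
# Predicted residue orders as quoted in the text.
sin_product = "Hpg Trp Ala Hpg Leu Tyr Hpg".split()
complestatin = "Hpg Trp Hpg Hpg Hpg Tyr Hpg".split()


def differing_positions(a, b):
    """Return (1-based position, residue in a, residue in b) wherever two
    equal-length residue lists differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


diffs = differing_positions(sin_product, complestatin)
print(diffs)  # [(3, 'Ala', 'Hpg'), (5, 'Leu', 'Hpg')]
```

Under this comparison the two heptapeptides differ at positions 3 and 5, where the predicted product carries Ala and Leu in place of Hpg.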
SEQUENCE LISTING

<110> Dalian University of Technology

<120> A marine Streptomyces halogenase gene and its product, and a biosynthetic gene cluster of its modified product

<130> 201110333075.8

<140> 201110333075.8
<141> 2011-10-28

<160> 1

<170> PatentIn version 3.3

<210> 1
<211> 508
<212> PRT
<213> Streptomyces xinghaiensis

<220>
<221> SITE
<222> (30)..(35)
<223> FAD binding site (GGGXXG)

<220>
<221> SITE
<222> (243)..(249)
<223> Tryptophan residue binding site (GWTWXIP)

<400> 1

Met Thr Arg Arg Val Thr Arg Gly Gly Gly Met Ala Leu Pro Asp Ser
1 5 10 15

Glu Glu Phe Asp Val Val Val Val Gly Gly Gly Pro Ala Gly Ser Thr
20 25 30

Leu Ala Ala Leu Thr Ala Met Gln Gly His Arg Val Leu Val Leu Glu
35 40 45

Lys Glu Phe Phe Pro Arg His Gln Ile Gly Glu Ser Leu Leu Pro Ala
50 55 60

Thr Val His Gly Val Cys Arg Leu Thr Gly Val Ala Asp Glu Leu Ala
65 70 75 80

Ala Ala Gly Phe Pro Arg Lys Arg Gly Gly Thr Phe Lys Trp Gly Ala
85 90 95

Asn Pro Glu Pro Trp Thr Phe Ser Phe Ser Val Ser Pro Arg Met Thr
100 105 110

Gly Pro Thr Tyr Ala Tyr Gln Val Glu Arg Ala Lys Phe Asp Glu Ile
115 120 125

Leu Leu Asn Asn Ala Arg Arg Val Gly Ala Glu Val Arg Glu Gly Cys
130 135 140

Ala Ala Val Asp Val Val Glu Asp Gly Glu Arg Val Arg Gly Val Arg
145 150 155 160

Tyr Thr Asp Ala Asp Gly Arg Glu His Arg Ala Ser Ala Thr Phe Val
165 170 175

Val Asp Ala Ser Gly Asn Gly Ser Arg Leu Tyr Arg Arg Val Gly Gly
180 185 190

Thr Arg Glu Tyr Ser Glu Phe Phe Arg Ser Leu Ala Leu Tyr Gly Tyr
195 200 205

Phe Glu Gly Gly Lys Arg Leu Pro Glu Pro Asn Ser Gly Asn Ile Leu
210 215 220

Ser Val Ala Phe Glu Ser Gly Trp Phe Trp Tyr Ile Pro Leu Ser Pro
225 230 235 240

Asp Leu Thr Ser Val Gly Ala Val Val Arg Arg Glu Met Ala Gly Lys
245 250 255

Ile Arg Gly Asp Ser Gly Lys Ala Leu Ala Ala Leu Ile Ala Glu Cys
260 265 270

Pro Leu Ile Ser Glu Tyr Leu Ala Asp Ala Arg Arg Val Thr Glu Gly
275 280 285

Pro Tyr Gly Lys Leu Arg Val Arg Lys Asp Tyr Ser Tyr His His Thr
290 295 300

Thr Phe Ser Arg Pro Gly Met Ile Leu Val Gly Asp Ala Ala Cys Phe
305 310 315 320

Val Asp Pro Val Phe Ser Ser Gly Val His Leu Ala Thr Tyr Ser Ala
325 330 335

Leu Leu Ala Ala Arg Ser Ile Asn Ser Val Leu Gly Leu Val Gly Glu
340 345 350

Asp Arg Ala Leu Arg Glu Phe Glu Ser Arg Tyr Arg Arg Glu Tyr Gly
355 360 365

Val Phe Tyr Glu Phe Leu Leu Ser Phe Tyr Glu Met His Gln Asp Glu
370 375 380

Asn Ser Tyr Phe Trp Gln Ala Lys Lys Val Thr Arg Ala Asn Arg Pro
385 390 395 400

Glu Leu Glu Ser Phe Val Glu Leu Ile Gly Gly Val Ser Ser Gly Glu
405 410 415

Arg Val Leu Thr Asp Ala Glu Val Leu Ala Lys Arg Phe Ser Ser Gly
420 425 430

Ser Ala Glu Phe Ala Ala Ala Val Asp Glu Leu Ala Gly Ser Glu Asp
435 440 445

Gly Ser Met Val Pro Leu Phe Lys Ser Ser Val Val Arg Glu Val Met
450 455 460

Gln Glu Gly Gly Gln Val Gln Met Arg Ala Leu Leu Gly Glu Asp Ala
465 470 475 480

Glu Pro Glu Ala Pro Leu Ser Ala Asp Gly Leu Val Pro Ser Pro Asp
485 490 495

Gly Met Phe Trp Leu Pro Ala Gln Gly Thr Gly Glu
500 505
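Feature annotations written as motifs, such as `GGGXXG`, can be checked against a listing by converting three-letter residues to one-letter codes and treating X as a wildcard. The sketch below, in Python, does this for a single line copied from the listing above; the helper names `to_one_letter` and `find_motif` are illustrative, and the reported offset is relative to the excerpt, not to the full sequence.

```python
import re

# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter: str) -> str:
    """Convert a whitespace-separated three-letter sequence to one-letter codes."""
    return "".join(THREE_TO_ONE[res] for res in three_letter.split())


def find_motif(seq: str, motif: str):
    """Find a motif in which X stands for any residue.

    Returns a list of (0-based start offset, matched text)."""
    pattern = re.compile(motif.replace("X", "."))
    return [(m.start(), m.group()) for m in pattern.finditer(seq)]


# One line of the listing, copied verbatim.
excerpt = to_one_letter(
    "Glu Glu Phe Asp Val Val Val Val Gly Gly Gly Pro Ala Gly Ser Thr"
)
print(excerpt)                        # EEFDVVVVGGGPAGST
print(find_motif(excerpt, "GGGXXG"))  # [(8, 'GGGPAG')]
```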
Claims (5)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN2011103330758A CN102533800A (en) | 2011-10-28 | 2011-10-28 | A marine Streptomyces halogenase gene and its product, and a biosynthetic gene cluster of its modified product |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN102533800A true CN102533800A (en) | 2012-07-04 |
Family
ID=46341832
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN2011103330758A Pending CN102533800A (en) | 2011-10-28 | 2011-10-28 | A marine Streptomyces halogenase gene and its product, and a biosynthetic gene cluster of its modified product |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN102533800A (en) |
Cited By (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103695384A (en) * | 2013-12-20 | 2014-04-02 | Wuhan University | Halogenase for catalyzing formation of C-F and C-Cl bonds |
| CN108130292A (en) * | 2018-01-04 | 2018-06-08 | Shanghai Jiao Tong University | Marine streptomyces S063 and its anti-complement activity application |
| CN108130292B (en) * | 2018-01-04 | 2021-03-12 | Shanghai Jiao Tong University | Marine Streptomyces S063 and its application in anti-complement activity |
| US20210335453A1 (en) * | 2018-03-05 | 2021-10-28 | University Court Of The University Of St Andrews | Novel enzymes |
| US12456543B2 (en) * | 2018-03-05 | 2025-10-28 | University Court Of The University Of St Andrews | Enzymes |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Murakami et al. | Epidermal LysM receptor ensures robust symbiotic signalling in Lotus japonicus | |
| KR20190059966A (en) | S. The Piogenes CAS9 mutant gene and the polypeptide encoded thereby | |
| Müller et al. | Global transcriptome analysis of spore formation in Myxococcus xanthus reveals a locus necessary for cell differentiation | |
| JP6001648B2 (en) | Detection of saxitoxin-producing dinoflagellates | |
| CN110777155B (en) | Minimal mycin biosynthesis gene cluster, recombinant bacterium and application thereof | |
| CN114350687A (en) | Rice bacterial leaf blight resistant gene, protein and application thereof | |
| CN114277046B (en) | Three-gene tandem expression vector for synthesizing tetrahydropyrimidine and application thereof | |
| Berry et al. | Cross-species transcriptomics identifies core regulatory changes differentiating the asymptomatic asexual and virulent sexual life cycles of grass-symbiotic Epichloë fungi | |
| Chang et al. | A widespread family of viral sponge proteins reveals specific inhibition of nucleotide signals in anti-phage defense | |
| CN102533800A (en) | A marine Streptomyces halogenase gene and its product, and a biosynthetic gene cluster of its modified product | |
| CN109402092B (en) | A marine environment-derived chitinase and its gene | |
| WO2023275306A1 (en) | Method for identifying target-binding peptides | |
| US7811790B2 (en) | Polymyxin synthetase and gene cluster thereof | |
| CN101979572B (en) | Preparation and application of striatoxin S4.3 from the South China Sea | |
| CN115896134A (en) | Brevibacillus brevis engineering strain for improving synthesis of ivermectin and construction method and application thereof | |
| Wright et al. | A Broad Spectrum Lasso Peptide Antibiotic Targeting the Bacterial Ribosome | |
| US10414796B2 (en) | Genetic system for producing a proteases inhibitor of a small peptide aldehyde type | |
| Cho et al. | Structural insight of the role of the Hahella chejuensis HapK protein in prodigiosin biosynthesis | |
| CN105087554B (en) | DNA phosphorothioate modifier clusters | |
| US11535834B2 (en) | Recombinant nucleoside-specific ribonuclease and method of producing and using same | |
| KR100861771B1 (en) | Balineon synthase for biosynthesis of validamycin and preparation method thereof | |
| Lean | Genome Mining for novel lasso peptides from Actinobacteria isolated from diverse Australian environments | |
| Waschulin | Investigating the biosynthetic potential of an Antarctic soil through metagenomics, cultivation, and heterologous expression | |
| Su et al. | Genome mining and UHPLC-MS/MS illuminate the specificity of secondary metabolite synthetic gene clusters in Bacillus subtilis NCD-2 | |
| Figiel et al. | Structures and enzymatic mechanisms of DRT7/UG10 antiphage reverse transcriptases |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| C06 | Publication | ||
| PB01 | Publication | ||
| C10 | Entry into substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| C02 | Deemed withdrawal of patent application after publication (patent law 2001) | ||
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20120704 |