CN111471665B - DNA cyclization molecule and application thereof - Google Patents
- Publication number
- CN111471665B (application number CN201910063623.6A)
- Authority
- CN
- China
- Prior art keywords
- leu
- lys
- glu
- asp
- ile
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Concepts
- 238000007363 ring formation reaction Methods 0.000 title abstract description 5
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 52
- 108020001507 fusion proteins Proteins 0.000 claims abstract description 47
- 102000037865 fusion proteins Human genes 0.000 claims abstract description 47
- 108091027544 Subgenomic mRNA Proteins 0.000 claims description 78
- 230000014509 gene expression Effects 0.000 claims description 73
- 230000008685 targeting Effects 0.000 claims description 73
- 210000004027 cell Anatomy 0.000 claims description 52
- 239000003623 enhancer Substances 0.000 claims description 34
- 238000000034 method Methods 0.000 claims description 25
- 102100021519 Hemoglobin subunit beta Human genes 0.000 claims description 20
- 230000008301 DNA looping mechanism Effects 0.000 claims description 18
- 108091033319 polynucleotide Proteins 0.000 claims description 17
- 102000040430 polynucleotide Human genes 0.000 claims description 17
- 239000002157 polynucleotide Substances 0.000 claims description 17
- 101150013707 HBB gene Proteins 0.000 claims description 15
- 108700039691 Genetic Promoter Regions Proteins 0.000 claims description 8
- 108091005904 Hemoglobin subunit beta Proteins 0.000 claims description 8
- 241001465754 Metazoa Species 0.000 claims description 7
- 230000001105 regulatory effect Effects 0.000 claims description 6
- 238000012258 culturing Methods 0.000 claims description 5
- 238000011144 upstream manufacturing Methods 0.000 claims description 5
- 241000244206 Nematoda Species 0.000 claims description 3
- 210000000601 blood cell Anatomy 0.000 claims description 3
- 201000010099 disease Diseases 0.000 claims description 3
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 claims description 3
- 210000003527 eukaryotic cell Anatomy 0.000 claims description 3
- 238000000338 in vitro Methods 0.000 claims description 3
- 230000001225 therapeutic effect Effects 0.000 claims 2
- 241000255581 Drosophila <fruit fly, genus> Species 0.000 claims 1
- 241000206602 Eukaryota Species 0.000 claims 1
- 125000003275 alpha amino acid group Chemical group 0.000 claims 1
- 239000012634 fragment Substances 0.000 abstract description 67
- 108020004414 DNA Proteins 0.000 abstract description 47
- 101001022957 Homo sapiens LIM domain-binding protein 1 Proteins 0.000 abstract description 26
- 239000000710 homodimer Substances 0.000 abstract description 26
- 101001022948 Homo sapiens LIM domain-binding protein 2 Proteins 0.000 abstract description 25
- 102100035113 LIM domain-binding protein 2 Human genes 0.000 abstract description 25
- 102000004169 proteins and genes Human genes 0.000 abstract description 18
- 239000000539 dimer Substances 0.000 abstract description 10
- 108010077544 Chromatin Proteins 0.000 abstract description 8
- 210000003483 chromatin Anatomy 0.000 abstract description 8
- 238000010356 CRISPR-Cas9 genome editing Methods 0.000 abstract description 3
- 238000002360 preparation method Methods 0.000 abstract description 3
- 230000015572 biosynthetic process Effects 0.000 abstract description 2
- 230000004049 epigenetic modification Effects 0.000 abstract description 2
- 239000000178 monomer Substances 0.000 abstract description 2
- 150000003384 small molecules Chemical class 0.000 abstract description 2
- 150000001413 amino acids Chemical group 0.000 description 46
- 108010092854 aspartyllysine Proteins 0.000 description 23
- 108010013835 arginine glutamate Proteins 0.000 description 19
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 18
- 108010008355 arginyl-glutamine Proteins 0.000 description 18
- 239000013598 vector Substances 0.000 description 16
- 108010034529 leucyl-lysine Proteins 0.000 description 15
- PMGDADKJMCOXHX-UHFFFAOYSA-N L-Arginyl-L-glutamin-acetat Natural products NC(=N)NCCCC(N)C(=O)NC(CCC(N)=O)C(O)=O PMGDADKJMCOXHX-UHFFFAOYSA-N 0.000 description 14
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 14
- 239000013612 plasmid Substances 0.000 description 14
- 101000899111 Homo sapiens Hemoglobin subunit beta Proteins 0.000 description 13
- 108010062796 arginyllysine Proteins 0.000 description 13
- IZSMEUDYADKZTJ-KJEVXHAQSA-N Arg-Tyr-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IZSMEUDYADKZTJ-KJEVXHAQSA-N 0.000 description 12
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 12
- 108010054155 lysyllysine Proteins 0.000 description 12
- 108010012581 phenylalanylglutamate Proteins 0.000 description 12
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 11
- VCHVSKNMTXWIIP-SRVKXCTJSA-N Leu-Lys-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O VCHVSKNMTXWIIP-SRVKXCTJSA-N 0.000 description 11
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 11
- 108010038633 aspartylglutamate Proteins 0.000 description 11
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 11
- 108010005233 alanylglutamic acid Proteins 0.000 description 10
- 108091008053 gene clusters Proteins 0.000 description 10
- 108010057821 leucylproline Proteins 0.000 description 10
- QXHVOUSPVAWEMX-ZLUOBGJFSA-N Asp-Asp-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O QXHVOUSPVAWEMX-ZLUOBGJFSA-N 0.000 description 9
- 108010047562 NGR peptide Proteins 0.000 description 9
- 108010093581 aspartyl-proline Proteins 0.000 description 9
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 9
- 108010050848 glycylleucine Proteins 0.000 description 9
- 108010012058 leucyltyrosine Proteins 0.000 description 9
- 108010071207 serylmethionine Proteins 0.000 description 9
- 108010073969 valyllysine Proteins 0.000 description 9
- COXMUHNBYCVVRG-DCAQKATOSA-N Arg-Leu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O COXMUHNBYCVVRG-DCAQKATOSA-N 0.000 description 8
- OKZOABJQOMAYEC-NUMRIWBASA-N Asn-Gln-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OKZOABJQOMAYEC-NUMRIWBASA-N 0.000 description 8
- OOXUBGLNDRGOKT-FXQIFTODSA-N Asn-Ser-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O OOXUBGLNDRGOKT-FXQIFTODSA-N 0.000 description 8
- LJPIRKICOISLKN-WHFBIAKZSA-N Gly-Ala-Ser Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O LJPIRKICOISLKN-WHFBIAKZSA-N 0.000 description 8
- CMNMPCTVCWWYHY-MXAVVETBSA-N Ile-His-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC(C)C)C(=O)O)N CMNMPCTVCWWYHY-MXAVVETBSA-N 0.000 description 8
- VJGQRELPQWNURN-JYJNAYRXSA-N Leu-Tyr-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O VJGQRELPQWNURN-JYJNAYRXSA-N 0.000 description 8
- YQFZRHYZLARWDY-IHRRRGAJSA-N Leu-Val-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN YQFZRHYZLARWDY-IHRRRGAJSA-N 0.000 description 8
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 8
- KAAPNMOKUUPKOE-SRVKXCTJSA-N Ser-Asn-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KAAPNMOKUUPKOE-SRVKXCTJSA-N 0.000 description 8
- VLOYGOZDPGYWFO-LAEOZQHASA-N Val-Asp-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O VLOYGOZDPGYWFO-LAEOZQHASA-N 0.000 description 8
- 108010044940 alanylglutamine Proteins 0.000 description 8
- 239000002299 complementary DNA Substances 0.000 description 8
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 8
- 239000013604 expression vector Substances 0.000 description 8
- 108010003700 lysyl aspartic acid Proteins 0.000 description 8
- 108010017391 lysylvaline Proteins 0.000 description 8
- 108010073025 phenylalanylphenylalanine Proteins 0.000 description 8
- 108010020532 tyrosyl-proline Proteins 0.000 description 8
- TZDNWXDLYFIFPT-BJDJZHNGSA-N Ala-Ile-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O TZDNWXDLYFIFPT-BJDJZHNGSA-N 0.000 description 7
- MDNAVFBZPROEHO-UHFFFAOYSA-N Ala-Lys-Val Natural products CC(C)C(C(O)=O)NC(=O)C(NC(=O)C(C)N)CCCCN MDNAVFBZPROEHO-UHFFFAOYSA-N 0.000 description 7
- UZGFHWIJWPUPOH-IHRRRGAJSA-N Arg-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UZGFHWIJWPUPOH-IHRRRGAJSA-N 0.000 description 7
- BTJVOUQWFXABOI-IHRRRGAJSA-N Arg-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCNC(N)=N BTJVOUQWFXABOI-IHRRRGAJSA-N 0.000 description 7
- VWADICJNCPFKJS-ZLUOBGJFSA-N Asn-Ser-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O VWADICJNCPFKJS-ZLUOBGJFSA-N 0.000 description 7
- NVFSJIXJZCDICF-SRVKXCTJSA-N Asp-Lys-Lys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N NVFSJIXJZCDICF-SRVKXCTJSA-N 0.000 description 7
- 108010075254 C-Peptide Proteins 0.000 description 7
- IOFDDSNZJDIGPB-GVXVVHGQSA-N Gln-Leu-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O IOFDDSNZJDIGPB-GVXVVHGQSA-N 0.000 description 7
- AFODTOLGSZQDSL-PEFMBERDSA-N Glu-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N AFODTOLGSZQDSL-PEFMBERDSA-N 0.000 description 7
- PJBVXVBTTFZPHJ-GUBZILKMSA-N Glu-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)O)N PJBVXVBTTFZPHJ-GUBZILKMSA-N 0.000 description 7
- CSQNHSGHAPRGPQ-YTFOTSKYSA-N Ile-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCCN)C(=O)O)N CSQNHSGHAPRGPQ-YTFOTSKYSA-N 0.000 description 7
- HUORUFRRJHELPD-MNXVOIDGSA-N Ile-Leu-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N HUORUFRRJHELPD-MNXVOIDGSA-N 0.000 description 7
- LHSGPCFBGJHPCY-UHFFFAOYSA-N L-leucine-L-tyrosine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-UHFFFAOYSA-N 0.000 description 7
- HFBCHNRFRYLZNV-GUBZILKMSA-N Leu-Glu-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HFBCHNRFRYLZNV-GUBZILKMSA-N 0.000 description 7
- LXKNSJLSGPNHSK-KKUMJFAQSA-N Leu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N LXKNSJLSGPNHSK-KKUMJFAQSA-N 0.000 description 7
- IZPVWNSAVUQBGP-CIUDSAMLSA-N Leu-Ser-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O IZPVWNSAVUQBGP-CIUDSAMLSA-N 0.000 description 7
- AMSSKPUHBUQBOQ-SRVKXCTJSA-N Leu-Ser-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N AMSSKPUHBUQBOQ-SRVKXCTJSA-N 0.000 description 7
- JGKHAFUAPZCCDU-BZSNNMDCSA-N Leu-Tyr-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=C(O)C=C1 JGKHAFUAPZCCDU-BZSNNMDCSA-N 0.000 description 7
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 7
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 7
- XNCUYZKGQOCOQH-YUMQZZPRSA-N Ser-Leu-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O XNCUYZKGQOCOQH-YUMQZZPRSA-N 0.000 description 7
- JMGJDTNUMAZNLX-RWRJDSDZSA-N Thr-Glu-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JMGJDTNUMAZNLX-RWRJDSDZSA-N 0.000 description 7
- VRUFCJZQDACGLH-UVOCVTCTSA-N Thr-Leu-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VRUFCJZQDACGLH-UVOCVTCTSA-N 0.000 description 7
- BMGOFDMKDVVGJG-NHCYSSNCSA-N Val-Asp-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BMGOFDMKDVVGJG-NHCYSSNCSA-N 0.000 description 7
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 7
- XKUKSGPZAADMRA-UHFFFAOYSA-N glycyl-glycyl-glycine Natural products NCC(=O)NCC(=O)NCC(O)=O XKUKSGPZAADMRA-UHFFFAOYSA-N 0.000 description 7
- 108010026333 seryl-proline Proteins 0.000 description 7
- 108010061238 threonyl-glycine Proteins 0.000 description 7
- 108010051110 tyrosyl-lysine Proteins 0.000 description 7
- NKJBKNVQHBZUIX-ACZMJKKPSA-N Ala-Gln-Asp Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O NKJBKNVQHBZUIX-ACZMJKKPSA-N 0.000 description 6
- MAEQBGQTDWDSJQ-LSJOCFKGSA-N Ala-Met-His Chemical compound C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N MAEQBGQTDWDSJQ-LSJOCFKGSA-N 0.000 description 6
- HJVGMOYJDDXLMI-AVGNSLFASA-N Arg-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CCCNC(N)=N HJVGMOYJDDXLMI-AVGNSLFASA-N 0.000 description 6
- MTYLORHAQXVQOW-AVGNSLFASA-N Arg-Lys-Met Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(O)=O MTYLORHAQXVQOW-AVGNSLFASA-N 0.000 description 6
- URAUIUGLHBRPMF-NAKRPEOUSA-N Arg-Ser-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O URAUIUGLHBRPMF-NAKRPEOUSA-N 0.000 description 6
- OLGCWMNDJTWQAG-GUBZILKMSA-N Asn-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC(N)=O OLGCWMNDJTWQAG-GUBZILKMSA-N 0.000 description 6
- DDPXDCKYWDGZAL-BQBZGAKWSA-N Asn-Gly-Arg Chemical compound NC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N DDPXDCKYWDGZAL-BQBZGAKWSA-N 0.000 description 6
- BZWRLDPIWKOVKB-ZPFDUUQYSA-N Asn-Leu-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BZWRLDPIWKOVKB-ZPFDUUQYSA-N 0.000 description 6
- UGKZHCBLMLSANF-CIUDSAMLSA-N Asp-Asn-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O UGKZHCBLMLSANF-CIUDSAMLSA-N 0.000 description 6
- SHAUZYVSXAMYAZ-JYJNAYRXSA-N Gln-Leu-Phe Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCC(=O)N)N SHAUZYVSXAMYAZ-JYJNAYRXSA-N 0.000 description 6
- IESFZVCAVACGPH-PEFMBERDSA-N Glu-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCC(O)=O IESFZVCAVACGPH-PEFMBERDSA-N 0.000 description 6
- QQLBPVKLJBAXBS-FXQIFTODSA-N Glu-Glu-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O QQLBPVKLJBAXBS-FXQIFTODSA-N 0.000 description 6
- ZYRXTRTUCAVNBQ-GVXVVHGQSA-N Glu-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N ZYRXTRTUCAVNBQ-GVXVVHGQSA-N 0.000 description 6
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 6
- NZGTYCMLUGYMCV-XUXIUFHCSA-N Ile-Lys-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N NZGTYCMLUGYMCV-XUXIUFHCSA-N 0.000 description 6
- PJYSOYLLTJKZHC-GUBZILKMSA-N Leu-Asp-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(N)=O PJYSOYLLTJKZHC-GUBZILKMSA-N 0.000 description 6
- DLCXCECTCPKKCD-GUBZILKMSA-N Leu-Gln-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O DLCXCECTCPKKCD-GUBZILKMSA-N 0.000 description 6
- OMHLATXVNQSALM-FQUUOJAGSA-N Leu-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(C)C)N OMHLATXVNQSALM-FQUUOJAGSA-N 0.000 description 6
- LIINDKYIGYTDLG-PPCPHDFISA-N Leu-Ile-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LIINDKYIGYTDLG-PPCPHDFISA-N 0.000 description 6
- WVJNGSFKBKOKRV-AJNGGQMLSA-N Lys-Leu-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WVJNGSFKBKOKRV-AJNGGQMLSA-N 0.000 description 6
- DRIJZWBRGMJCDD-DCAQKATOSA-N Pro-Gln-Met Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O DRIJZWBRGMJCDD-DCAQKATOSA-N 0.000 description 6
- RSTWKJFWBKFOFC-JYJNAYRXSA-N Pro-Trp-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(N)=O)C(O)=O RSTWKJFWBKFOFC-JYJNAYRXSA-N 0.000 description 6
- 108010079005 RDV peptide Proteins 0.000 description 6
- UOLGINIHBRIECN-FXQIFTODSA-N Ser-Glu-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O UOLGINIHBRIECN-FXQIFTODSA-N 0.000 description 6
- KRDSCBLRHORMRK-JXUBOQSCSA-N Thr-Lys-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O KRDSCBLRHORMRK-JXUBOQSCSA-N 0.000 description 6
- APQIVBCUIUDSMB-OSUNSFLBSA-N Val-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C(C)C)N APQIVBCUIUDSMB-OSUNSFLBSA-N 0.000 description 6
- 210000000349 chromosome Anatomy 0.000 description 6
- 230000029087 digestion Effects 0.000 description 6
- 230000000694 effects Effects 0.000 description 6
- 108010042598 glutamyl-aspartyl-glycine Proteins 0.000 description 6
- 108010067216 glycyl-glycyl-glycine Proteins 0.000 description 6
- 108010085325 histidylproline Proteins 0.000 description 6
- 108010078274 isoleucylvaline Proteins 0.000 description 6
- 108010009298 lysylglutamic acid Proteins 0.000 description 6
- 108010064235 lysylglycine Proteins 0.000 description 6
- 108010051242 phenylalanylserine Proteins 0.000 description 6
- 108010004914 prolylarginine Proteins 0.000 description 6
- 230000008672 reprogramming Effects 0.000 description 6
- MDNAVFBZPROEHO-DCAQKATOSA-N Ala-Lys-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O MDNAVFBZPROEHO-DCAQKATOSA-N 0.000 description 5
- WEZNQZHACPSMEF-QEJZJMRPSA-N Ala-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 WEZNQZHACPSMEF-QEJZJMRPSA-N 0.000 description 5
- FSNVAJOPUDVQAR-AVGNSLFASA-N Arg-Lys-Arg Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FSNVAJOPUDVQAR-AVGNSLFASA-N 0.000 description 5
- OPEPUCYIGFEGSW-WDSKDSINSA-N Asn-Gly-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O OPEPUCYIGFEGSW-WDSKDSINSA-N 0.000 description 5
- KEBACWCLVOXFNC-DCAQKATOSA-N Glu-Arg-Met Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(O)=O KEBACWCLVOXFNC-DCAQKATOSA-N 0.000 description 5
- TWYSSILQABLLME-HJGDQZAQSA-N Glu-Thr-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TWYSSILQABLLME-HJGDQZAQSA-N 0.000 description 5
- PYFIQROSWQERAS-LBPRGKRZSA-N Gly-Trp-Gly Chemical compound C1=CC=C2C(C[C@H](NC(=O)CN)C(=O)NCC(O)=O)=CNC2=C1 PYFIQROSWQERAS-LBPRGKRZSA-N 0.000 description 5
- GVNNAHIRSDRIII-AJNGGQMLSA-N Ile-Lys-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N GVNNAHIRSDRIII-AJNGGQMLSA-N 0.000 description 5
- 108010065920 Insulin Lispro Proteins 0.000 description 5
- 241000880493 Leptailurus serval Species 0.000 description 5
- WGNOPSQMIQERPK-UHFFFAOYSA-N Leu-Asn-Pro Natural products CC(C)CC(N)C(=O)NC(CC(=O)N)C(=O)N1CCCC1C(=O)O WGNOPSQMIQERPK-UHFFFAOYSA-N 0.000 description 5
- OGUUKPXUTHOIAV-SDDRHHMPSA-N Leu-Glu-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N OGUUKPXUTHOIAV-SDDRHHMPSA-N 0.000 description 5
- JLWZLIQRYCTYBD-IHRRRGAJSA-N Leu-Lys-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JLWZLIQRYCTYBD-IHRRRGAJSA-N 0.000 description 5
- YRRCOJOXAJNSAX-IHRRRGAJSA-N Leu-Pro-Lys Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)O)N YRRCOJOXAJNSAX-IHRRRGAJSA-N 0.000 description 5
- YIRIDPUGZKHMHT-ACRUOGEOSA-N Leu-Tyr-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YIRIDPUGZKHMHT-ACRUOGEOSA-N 0.000 description 5
- FBNPMTNBFFAMMH-UHFFFAOYSA-N Leu-Val-Arg Natural products CC(C)CC(N)C(=O)NC(C(C)C)C(=O)NC(C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-UHFFFAOYSA-N 0.000 description 5
- QBEPTBMRQALPEV-MNXVOIDGSA-N Lys-Ile-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN QBEPTBMRQALPEV-MNXVOIDGSA-N 0.000 description 5
- MPFGIYLYWUCSJG-AVGNSLFASA-N Phe-Glu-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 MPFGIYLYWUCSJG-AVGNSLFASA-N 0.000 description 5
- KYYMILWEGJYPQZ-IHRRRGAJSA-N Phe-Glu-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 KYYMILWEGJYPQZ-IHRRRGAJSA-N 0.000 description 5
- KJJROSNFBRWPHS-JYJNAYRXSA-N Phe-Glu-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O KJJROSNFBRWPHS-JYJNAYRXSA-N 0.000 description 5
- WPTYDQPGBMDUBI-QWRGUYRKSA-N Phe-Gly-Asn Chemical compound N[C@@H](Cc1ccccc1)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O WPTYDQPGBMDUBI-QWRGUYRKSA-N 0.000 description 5
- JUJCUYWRJMFJJF-AVGNSLFASA-N Pro-Lys-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H]1CCCN1 JUJCUYWRJMFJJF-AVGNSLFASA-N 0.000 description 5
- 108010003201 RGH 0205 Proteins 0.000 description 5
- DSGYZICNAMEJOC-AVGNSLFASA-N Ser-Glu-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O DSGYZICNAMEJOC-AVGNSLFASA-N 0.000 description 5
- CRJZZXMAADSBBQ-SRVKXCTJSA-N Ser-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO CRJZZXMAADSBBQ-SRVKXCTJSA-N 0.000 description 5
- UDQBCBUXAQIZAK-GLLZPBPUSA-N Thr-Glu-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O UDQBCBUXAQIZAK-GLLZPBPUSA-N 0.000 description 5
- GXUWHVZYDAHFSV-FLBSBUHZSA-N Thr-Ile-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GXUWHVZYDAHFSV-FLBSBUHZSA-N 0.000 description 5
- JLKVWTICWVWGSK-JYJNAYRXSA-N Tyr-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 JLKVWTICWVWGSK-JYJNAYRXSA-N 0.000 description 5
- LMKKMCGTDANZTR-BZSNNMDCSA-N Tyr-Phe-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CC=C(O)C=C1 LMKKMCGTDANZTR-BZSNNMDCSA-N 0.000 description 5
- UUJHRSTVQCFDPA-UFYCRDLUSA-N Tyr-Tyr-Val Chemical compound C([C@@H](C(=O)N[C@@H](C(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 UUJHRSTVQCFDPA-UFYCRDLUSA-N 0.000 description 5
- PAPWZOJOLKZEFR-AVGNSLFASA-N Val-Arg-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N PAPWZOJOLKZEFR-AVGNSLFASA-N 0.000 description 5
- ISERLACIZUGCDX-ZKWXMUAHSA-N Val-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N ISERLACIZUGCDX-ZKWXMUAHSA-N 0.000 description 5
- SSYBNWFXCFNRFN-GUBZILKMSA-N Val-Pro-Ser Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O SSYBNWFXCFNRFN-GUBZILKMSA-N 0.000 description 5
- 108010008685 alanyl-glutamyl-aspartic acid Proteins 0.000 description 5
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 5
- 108010070944 alanylhistidine Proteins 0.000 description 5
- 108010059459 arginyl-threonyl-phenylalanine Proteins 0.000 description 5
- 108010084758 arginyl-tyrosyl-aspartic acid Proteins 0.000 description 5
- 108010077245 asparaginyl-proline Proteins 0.000 description 5
- 238000010276 construction Methods 0.000 description 5
- 108010066198 glycyl-leucyl-phenylalanine Proteins 0.000 description 5
- 108010048994 glycyl-tyrosyl-alanine Proteins 0.000 description 5
- 108010044311 leucyl-glycyl-glycine Proteins 0.000 description 5
- 108010063431 methionyl-aspartyl-glycine Proteins 0.000 description 5
- 108010031719 prolyl-serine Proteins 0.000 description 5
- 108010029020 prolylglycine Proteins 0.000 description 5
- 238000011084 recovery Methods 0.000 description 5
- 238000007480 sanger sequencing Methods 0.000 description 5
- 108010069117 seryl-lysyl-aspartic acid Proteins 0.000 description 5
- 108010012567 tyrosyl-glycyl-glycyl-phenylalanyl Proteins 0.000 description 5
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 5
- GYNQVPIDAQTZOY-ROUUACIJSA-N (2s)-2-[[2-[[2-[[(2s)-2-amino-3-(4-hydroxyphenyl)propanoyl]amino]acetyl]amino]acetyl]amino]-3-phenylpropanoic acid Chemical compound C([C@H](N)C(=O)NCC(=O)NCC(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 GYNQVPIDAQTZOY-ROUUACIJSA-N 0.000 description 4
- SVBXIUDNTRTKHE-CIUDSAMLSA-N Ala-Arg-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O SVBXIUDNTRTKHE-CIUDSAMLSA-N 0.000 description 4
- NXSFUECZFORGOG-CIUDSAMLSA-N Ala-Asn-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NXSFUECZFORGOG-CIUDSAMLSA-N 0.000 description 4
- MVBWLRJESQOQTM-ACZMJKKPSA-N Ala-Gln-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O MVBWLRJESQOQTM-ACZMJKKPSA-N 0.000 description 4
- OYJCVIGKMXUVKB-GARJFASQSA-N Ala-Leu-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N OYJCVIGKMXUVKB-GARJFASQSA-N 0.000 description 4
- HYIDEIQUCBKIPL-CQDKDKBSSA-N Ala-Phe-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N HYIDEIQUCBKIPL-CQDKDKBSSA-N 0.000 description 4
- RTZCUEHYUQZIDE-WHFBIAKZSA-N Ala-Ser-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RTZCUEHYUQZIDE-WHFBIAKZSA-N 0.000 description 4
- LSMDIAAALJJLRO-XQXXSGGOSA-N Ala-Thr-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O LSMDIAAALJJLRO-XQXXSGGOSA-N 0.000 description 4
- VYMJAWXRWHJIMS-LKTVYLICSA-N Ala-Tyr-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N VYMJAWXRWHJIMS-LKTVYLICSA-N 0.000 description 4
- MUXONAMCEUBVGA-DCAQKATOSA-N Arg-Arg-Gln Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(N)=O)C(O)=O MUXONAMCEUBVGA-DCAQKATOSA-N 0.000 description 4
- VXXHDZKEQNGXNU-QXEWZRGKSA-N Arg-Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N VXXHDZKEQNGXNU-QXEWZRGKSA-N 0.000 description 4
- YUGFLWBWAJFGKY-BQBZGAKWSA-N Arg-Cys-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CS)C(=O)NCC(O)=O YUGFLWBWAJFGKY-BQBZGAKWSA-N 0.000 description 4
- CVKOQHYVDVYJSI-QTKMDUPCSA-N Arg-His-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCCN=C(N)N)N)O CVKOQHYVDVYJSI-QTKMDUPCSA-N 0.000 description 4
- LKDHUGLXOHYINY-XUXIUFHCSA-N Arg-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N LKDHUGLXOHYINY-XUXIUFHCSA-N 0.000 description 4
- OTZMRMHZCMZOJZ-SRVKXCTJSA-N Arg-Leu-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O OTZMRMHZCMZOJZ-SRVKXCTJSA-N 0.000 description 4
- PZBSKYJGKNNYNK-ULQDDVLXSA-N Arg-Leu-Tyr Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CCCN=C(N)N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(O)=O PZBSKYJGKNNYNK-ULQDDVLXSA-N 0.000 description 4
- CLICCYPMVFGUOF-IHRRRGAJSA-N Arg-Lys-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O CLICCYPMVFGUOF-IHRRRGAJSA-N 0.000 description 4
- KSUALAGYYLQSHJ-RCWTZXSCSA-N Arg-Met-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KSUALAGYYLQSHJ-RCWTZXSCSA-N 0.000 description 4
- ADPACBMPYWJJCE-FXQIFTODSA-N Arg-Ser-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O ADPACBMPYWJJCE-FXQIFTODSA-N 0.000 description 4
- LFWOQHSQNCKXRU-UFYCRDLUSA-N Arg-Tyr-Phe Chemical compound C([C@H](NC(=O)[C@H](CCCN=C(N)N)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 LFWOQHSQNCKXRU-UFYCRDLUSA-N 0.000 description 4
- PSUXEQYPYZLNER-QXEWZRGKSA-N Arg-Val-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O PSUXEQYPYZLNER-QXEWZRGKSA-N 0.000 description 4
- MEFGKQUUYZOLHM-GMOBBJLQSA-N Asn-Arg-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MEFGKQUUYZOLHM-GMOBBJLQSA-N 0.000 description 4
- JREOBWLIZLXRIS-GUBZILKMSA-N Asn-Glu-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JREOBWLIZLXRIS-GUBZILKMSA-N 0.000 description 4
- HYQYLOSCICEYTR-YUMQZZPRSA-N Asn-Gly-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O HYQYLOSCICEYTR-YUMQZZPRSA-N 0.000 description 4
- IBLAOXSULLECQZ-IUKAMOBKSA-N Asn-Ile-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC(N)=O IBLAOXSULLECQZ-IUKAMOBKSA-N 0.000 description 4
- SPCONPVIDFMDJI-QSFUFRPTSA-N Asn-Ile-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O SPCONPVIDFMDJI-QSFUFRPTSA-N 0.000 description 4
- ALHMNHZJBYBYHS-DCAQKATOSA-N Asn-Lys-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O ALHMNHZJBYBYHS-DCAQKATOSA-N 0.000 description 4
- RVHGJNGNKGDCPX-KKUMJFAQSA-N Asn-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N RVHGJNGNKGDCPX-KKUMJFAQSA-N 0.000 description 4
- NCXTYSVDWLAQGZ-ZKWXMUAHSA-N Asn-Ser-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O NCXTYSVDWLAQGZ-ZKWXMUAHSA-N 0.000 description 4
- CXBOKJPLEYUPGB-FXQIFTODSA-N Asp-Ala-Met Chemical compound C[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC(=O)O)N CXBOKJPLEYUPGB-FXQIFTODSA-N 0.000 description 4
- NECWUSYTYSIFNC-DLOVCJGASA-N Asp-Ala-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 NECWUSYTYSIFNC-DLOVCJGASA-N 0.000 description 4
- MJKBOVWWADWLHV-ZLUOBGJFSA-N Asp-Cys-Asp Chemical compound C([C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)C(=O)O MJKBOVWWADWLHV-ZLUOBGJFSA-N 0.000 description 4
- KHGPWGKPYHPOIK-QWRGUYRKSA-N Asp-Gly-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O KHGPWGKPYHPOIK-QWRGUYRKSA-N 0.000 description 4
- ORRJQLIATJDMQM-HJGDQZAQSA-N Asp-Leu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O ORRJQLIATJDMQM-HJGDQZAQSA-N 0.000 description 4
- CTWCFPWFIGRAEP-CIUDSAMLSA-N Asp-Lys-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O CTWCFPWFIGRAEP-CIUDSAMLSA-N 0.000 description 4
- MYLZFUMPZCPJCJ-NHCYSSNCSA-N Asp-Lys-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O MYLZFUMPZCPJCJ-NHCYSSNCSA-N 0.000 description 4
- WDMNFNXKGSLIOB-GUBZILKMSA-N Asp-Met-Met Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC(=O)O)N WDMNFNXKGSLIOB-GUBZILKMSA-N 0.000 description 4
- KGHLGJAXYSVNJP-WHFBIAKZSA-N Asp-Ser-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O KGHLGJAXYSVNJP-WHFBIAKZSA-N 0.000 description 4
- QSFHZPQUAAQHAQ-CIUDSAMLSA-N Asp-Ser-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O QSFHZPQUAAQHAQ-CIUDSAMLSA-N 0.000 description 4
- HRVQDZOWMLFAOD-BIIVOSGPSA-N Asp-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC(=O)O)N)C(=O)O HRVQDZOWMLFAOD-BIIVOSGPSA-N 0.000 description 4
- UWXFFVQPAMBETM-ZLUOBGJFSA-N Cys-Asp-Asn Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O UWXFFVQPAMBETM-ZLUOBGJFSA-N 0.000 description 4
- LHJDLVVQRJIURS-SRVKXCTJSA-N Cys-Phe-Asp Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CS)N LHJDLVVQRJIURS-SRVKXCTJSA-N 0.000 description 4
- VIOQRFNAZDMVLO-NRPADANISA-N Cys-Val-Glu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O VIOQRFNAZDMVLO-NRPADANISA-N 0.000 description 4
- 108090000790 Enzymes Proteins 0.000 description 4
- 102000004190 Enzymes Human genes 0.000 description 4
- JFOKLAPFYCTNHW-SRVKXCTJSA-N Gln-Arg-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCC(=O)N)N JFOKLAPFYCTNHW-SRVKXCTJSA-N 0.000 description 4
- KVXVVDFOZNYYKZ-DCAQKATOSA-N Gln-Gln-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O KVXVVDFOZNYYKZ-DCAQKATOSA-N 0.000 description 4
- SMLDOQHTOAAFJQ-WDSKDSINSA-N Gln-Gly-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](CO)C(O)=O SMLDOQHTOAAFJQ-WDSKDSINSA-N 0.000 description 4
- DWDBJWAXPXXYLP-SRVKXCTJSA-N Gln-His-Arg Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N DWDBJWAXPXXYLP-SRVKXCTJSA-N 0.000 description 4
- FTIJVMLAGRAYMJ-MNXVOIDGSA-N Gln-Ile-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(N)=O FTIJVMLAGRAYMJ-MNXVOIDGSA-N 0.000 description 4
- ZNTDJIMJKNNSLR-RWRJDSDZSA-N Gln-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N ZNTDJIMJKNNSLR-RWRJDSDZSA-N 0.000 description 4
- SXGMGNZEHFORAV-IUCAKERBSA-N Gln-Lys-Gly Chemical compound C(CCN)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N SXGMGNZEHFORAV-IUCAKERBSA-N 0.000 description 4
- ZVQZXPADLZIQFF-FHWLQOOXSA-N Gln-Phe-Tyr Chemical compound C([C@H](NC(=O)[C@H](CCC(N)=O)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 ZVQZXPADLZIQFF-FHWLQOOXSA-N 0.000 description 4
- OSCLNNWLKKIQJM-WDSKDSINSA-N Gln-Ser-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O OSCLNNWLKKIQJM-WDSKDSINSA-N 0.000 description 4
- XKPACHRGOWQHFH-IRIUXVKKSA-N Gln-Thr-Tyr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XKPACHRGOWQHFH-IRIUXVKKSA-N 0.000 description 4
- QPRZKNOOOBWXSU-CIUDSAMLSA-N Glu-Asp-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N QPRZKNOOOBWXSU-CIUDSAMLSA-N 0.000 description 4
- PAQUJCSYVIBPLC-AVGNSLFASA-N Glu-Asp-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 PAQUJCSYVIBPLC-AVGNSLFASA-N 0.000 description 4
- MTAOBYXRYJZRGQ-WDSKDSINSA-N Glu-Gly-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MTAOBYXRYJZRGQ-WDSKDSINSA-N 0.000 description 4
- OAGVHWYIBZMWLA-YFKPBYRVSA-N Glu-Gly-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)NCC(O)=O OAGVHWYIBZMWLA-YFKPBYRVSA-N 0.000 description 4
- VGOFRWOTSXVPAU-SDDRHHMPSA-N Glu-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CCC(=O)O)N)C(=O)O VGOFRWOTSXVPAU-SDDRHHMPSA-N 0.000 description 4
- VGBSZQSKQRMLHD-MNXVOIDGSA-N Glu-Leu-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VGBSZQSKQRMLHD-MNXVOIDGSA-N 0.000 description 4
- NJCALAAIGREHDR-WDCWCFNPSA-N Glu-Leu-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NJCALAAIGREHDR-WDCWCFNPSA-N 0.000 description 4
- ILWHFUZZCFYSKT-AVGNSLFASA-N Glu-Lys-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O ILWHFUZZCFYSKT-AVGNSLFASA-N 0.000 description 4
- AQNYKMCFCCZEEL-JYJNAYRXSA-N Glu-Lys-Tyr Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 AQNYKMCFCCZEEL-JYJNAYRXSA-N 0.000 description 4
- FGSGPLRPQCZBSQ-AVGNSLFASA-N Glu-Phe-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O FGSGPLRPQCZBSQ-AVGNSLFASA-N 0.000 description 4
- YQPFCZVKMUVZIN-AUTRQRHGSA-N Glu-Val-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O YQPFCZVKMUVZIN-AUTRQRHGSA-N 0.000 description 4
- CLODWIOAKCSBAN-BQBZGAKWSA-N Gly-Arg-Asp Chemical compound NC(N)=NCCC[C@H](NC(=O)CN)C(=O)N[C@@H](CC(O)=O)C(O)=O CLODWIOAKCSBAN-BQBZGAKWSA-N 0.000 description 4
- DTPOVRRYXPJJAZ-FJXKBIBVSA-N Gly-Arg-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N DTPOVRRYXPJJAZ-FJXKBIBVSA-N 0.000 description 4
- BYYNJRSNDARRBX-YFKPBYRVSA-N Gly-Gln-Gly Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O BYYNJRSNDARRBX-YFKPBYRVSA-N 0.000 description 4
- AQLHORCVPGXDJW-IUCAKERBSA-N Gly-Gln-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)CN AQLHORCVPGXDJW-IUCAKERBSA-N 0.000 description 4
- HFXJIZNEXNIZIJ-BQBZGAKWSA-N Gly-Glu-Gln Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O HFXJIZNEXNIZIJ-BQBZGAKWSA-N 0.000 description 4
- STVHDEHTKFXBJQ-LAEOZQHASA-N Gly-Glu-Ile Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O STVHDEHTKFXBJQ-LAEOZQHASA-N 0.000 description 4
- YYPFZVIXAVDHIK-IUCAKERBSA-N Gly-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN YYPFZVIXAVDHIK-IUCAKERBSA-N 0.000 description 4
- MBOAPAXLTUSMQI-JHEQGTHGSA-N Gly-Glu-Thr Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MBOAPAXLTUSMQI-JHEQGTHGSA-N 0.000 description 4
- HMHRTKOWRUPPNU-RCOVLWMOSA-N Gly-Ile-Gly Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O HMHRTKOWRUPPNU-RCOVLWMOSA-N 0.000 description 4
- NNCSJUBVFBDDLC-YUMQZZPRSA-N Gly-Leu-Ser Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O NNCSJUBVFBDDLC-YUMQZZPRSA-N 0.000 description 4
- GAAHQHNCMIAYEX-UWVGGRQHSA-N Gly-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN GAAHQHNCMIAYEX-UWVGGRQHSA-N 0.000 description 4
- GLACUWHUYFBSPJ-FJXKBIBVSA-N Gly-Pro-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN GLACUWHUYFBSPJ-FJXKBIBVSA-N 0.000 description 4
- YABRDIBSPZONIY-BQBZGAKWSA-N Gly-Ser-Met Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O YABRDIBSPZONIY-BQBZGAKWSA-N 0.000 description 4
- NQKRILCJYCASDV-QWRGUYRKSA-N His-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CN=CN1 NQKRILCJYCASDV-QWRGUYRKSA-N 0.000 description 4
- DEOQGJUXUQGUJN-KKUMJFAQSA-N His-Lys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N DEOQGJUXUQGUJN-KKUMJFAQSA-N 0.000 description 4
- PGXZHYYGOPKYKM-IHRRRGAJSA-N His-Pro-Lys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC2=CN=CN2)N)C(=O)N[C@@H](CCCCN)C(=O)O PGXZHYYGOPKYKM-IHRRRGAJSA-N 0.000 description 4
- RWIKBYVJQAJYDP-BJDJZHNGSA-N Ile-Ala-Lys Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN RWIKBYVJQAJYDP-BJDJZHNGSA-N 0.000 description 4
- SACHLUOUHCVIKI-GMOBBJLQSA-N Ile-Arg-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N SACHLUOUHCVIKI-GMOBBJLQSA-N 0.000 description 4
- UAVQIQOOBXFKRC-BYULHYEWSA-N Ile-Asn-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O UAVQIQOOBXFKRC-BYULHYEWSA-N 0.000 description 4
- NKRJALPCDNXULF-BYULHYEWSA-N Ile-Asp-Gly Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O NKRJALPCDNXULF-BYULHYEWSA-N 0.000 description 4
- HGNUKGZQASSBKQ-PCBIJLKTSA-N Ile-Asp-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N HGNUKGZQASSBKQ-PCBIJLKTSA-N 0.000 description 4
- PHIXPNQDGGILMP-YVNDNENWSA-N Ile-Glu-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N PHIXPNQDGGILMP-YVNDNENWSA-N 0.000 description 4
- DFFTXLCCDFYRKD-MBLNEYKQSA-N Ile-Gly-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(=O)O)N DFFTXLCCDFYRKD-MBLNEYKQSA-N 0.000 description 4
- TVYWVSJGSHQWMT-AJNGGQMLSA-N Ile-Leu-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N TVYWVSJGSHQWMT-AJNGGQMLSA-N 0.000 description 4
- GLYJPWIRLBAIJH-UHFFFAOYSA-N Ile-Lys-Pro Natural products CCC(C)C(N)C(=O)NC(CCCCN)C(=O)N1CCCC1C(O)=O GLYJPWIRLBAIJH-UHFFFAOYSA-N 0.000 description 4
- CIDLJWVDMNDKPT-FIRPJDEBSA-N Ile-Phe-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)O)N CIDLJWVDMNDKPT-FIRPJDEBSA-N 0.000 description 4
- HJDZMPFEXINXLO-QPHKQPEJSA-N Ile-Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N HJDZMPFEXINXLO-QPHKQPEJSA-N 0.000 description 4
- UYODHPPSCXBNCS-XUXIUFHCSA-N Ile-Val-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(C)C UYODHPPSCXBNCS-XUXIUFHCSA-N 0.000 description 4
- BQSLGJHIAGOZCD-CIUDSAMLSA-N Leu-Ala-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O BQSLGJHIAGOZCD-CIUDSAMLSA-N 0.000 description 4
- WGNOPSQMIQERPK-GARJFASQSA-N Leu-Asn-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N WGNOPSQMIQERPK-GARJFASQSA-N 0.000 description 4
- DZQMXBALGUHGJT-GUBZILKMSA-N Leu-Glu-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O DZQMXBALGUHGJT-GUBZILKMSA-N 0.000 description 4
- LAGPXKYZCCTSGQ-JYJNAYRXSA-N Leu-Glu-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LAGPXKYZCCTSGQ-JYJNAYRXSA-N 0.000 description 4
- JFSGIJSCJFQGSZ-MXAVVETBSA-N Leu-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC(C)C)N JFSGIJSCJFQGSZ-MXAVVETBSA-N 0.000 description 4
- YOKVEHGYYQEQOP-QWRGUYRKSA-N Leu-Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YOKVEHGYYQEQOP-QWRGUYRKSA-N 0.000 description 4
- UBZGNBKMIJHOHL-BZSNNMDCSA-N Leu-Leu-Phe Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 UBZGNBKMIJHOHL-BZSNNMDCSA-N 0.000 description 4
- ZAVCJRJOQKIOJW-KKUMJFAQSA-N Leu-Phe-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=CC=C1 ZAVCJRJOQKIOJW-KKUMJFAQSA-N 0.000 description 4
- PJWOOBTYQNNRBF-BZSNNMDCSA-N Leu-Phe-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)O)N PJWOOBTYQNNRBF-BZSNNMDCSA-N 0.000 description 4
- LJBVRCDPWOJOEK-PPCPHDFISA-N Leu-Thr-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LJBVRCDPWOJOEK-PPCPHDFISA-N 0.000 description 4
- ODRREERHVHMIPT-OEAJRASXSA-N Leu-Thr-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ODRREERHVHMIPT-OEAJRASXSA-N 0.000 description 4
- WPIKRJDRQVFRHP-TUSQITKMSA-N Leu-Trp-Trp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O WPIKRJDRQVFRHP-TUSQITKMSA-N 0.000 description 4
- FBNPMTNBFFAMMH-AVGNSLFASA-N Leu-Val-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-AVGNSLFASA-N 0.000 description 4
- 102000003960 Ligases Human genes 0.000 description 4
- 108090000364 Ligases Proteins 0.000 description 4
- RVOMPSJXSRPFJT-DCAQKATOSA-N Lys-Ala-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O RVOMPSJXSRPFJT-DCAQKATOSA-N 0.000 description 4
- NFLFJGGKOHYZJF-BJDJZHNGSA-N Lys-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN NFLFJGGKOHYZJF-BJDJZHNGSA-N 0.000 description 4
- VHNOAIFVYUQOOY-XUXIUFHCSA-N Lys-Arg-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VHNOAIFVYUQOOY-XUXIUFHCSA-N 0.000 description 4
- CKSXSQUVEYCDIW-AVGNSLFASA-N Lys-Arg-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCCN)N CKSXSQUVEYCDIW-AVGNSLFASA-N 0.000 description 4
- NTSPQIONFJUMJV-AVGNSLFASA-N Lys-Arg-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O NTSPQIONFJUMJV-AVGNSLFASA-N 0.000 description 4
- RZHLIPMZXOEJTL-AVGNSLFASA-N Lys-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N RZHLIPMZXOEJTL-AVGNSLFASA-N 0.000 description 4
- LLSUNJYOSCOOEB-GUBZILKMSA-N Lys-Glu-Asp Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O LLSUNJYOSCOOEB-GUBZILKMSA-N 0.000 description 4
- FHIAJWBDZVHLAH-YUMQZZPRSA-N Lys-Gly-Ser Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O FHIAJWBDZVHLAH-YUMQZZPRSA-N 0.000 description 4
- PGLGNCVOWIORQE-SRVKXCTJSA-N Lys-His-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O PGLGNCVOWIORQE-SRVKXCTJSA-N 0.000 description 4
- SKRGVGLIRUGANF-AVGNSLFASA-N Lys-Leu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SKRGVGLIRUGANF-AVGNSLFASA-N 0.000 description 4
- YUAXTFMFMOIMAM-QWRGUYRKSA-N Lys-Lys-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O YUAXTFMFMOIMAM-QWRGUYRKSA-N 0.000 description 4
- ATNKHRAIZCMCCN-BZSNNMDCSA-N Lys-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N ATNKHRAIZCMCCN-BZSNNMDCSA-N 0.000 description 4
- GZGWILAQHOVXTD-DCAQKATOSA-N Lys-Met-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(O)=O GZGWILAQHOVXTD-DCAQKATOSA-N 0.000 description 4
- MTBLFIQZECOEBY-IHRRRGAJSA-N Lys-Met-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(O)=O MTBLFIQZECOEBY-IHRRRGAJSA-N 0.000 description 4
- JMNRXRPBHFGXQX-GUBZILKMSA-N Lys-Ser-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O JMNRXRPBHFGXQX-GUBZILKMSA-N 0.000 description 4
- RMKJOQSYLQQRFN-KKUMJFAQSA-N Lys-Tyr-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O RMKJOQSYLQQRFN-KKUMJFAQSA-N 0.000 description 4
- LMMBAXJRYSXCOQ-ACRUOGEOSA-N Lys-Tyr-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](Cc1ccccc1)C(O)=O LMMBAXJRYSXCOQ-ACRUOGEOSA-N 0.000 description 4
- IEIHKHYMBIYQTH-YESZJQIVSA-N Lys-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CCCCN)N)C(=O)O IEIHKHYMBIYQTH-YESZJQIVSA-N 0.000 description 4
- OHXUUQDOBQKSNB-AVGNSLFASA-N Lys-Val-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O OHXUUQDOBQKSNB-AVGNSLFASA-N 0.000 description 4
- DRRXXZBXDMLGFC-IHRRRGAJSA-N Lys-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN DRRXXZBXDMLGFC-IHRRRGAJSA-N 0.000 description 4
- NYTDJEZBAAFLLG-IHRRRGAJSA-N Lys-Val-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(O)=O NYTDJEZBAAFLLG-IHRRRGAJSA-N 0.000 description 4
- RIPJMCFGQHGHNP-RHYQMDGZSA-N Lys-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CCCCN)N)O RIPJMCFGQHGHNP-RHYQMDGZSA-N 0.000 description 4
- HDNOQCZWJGGHSS-VEVYYDQMSA-N Met-Asn-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O HDNOQCZWJGGHSS-VEVYYDQMSA-N 0.000 description 4
- UROWNMBTQGGTHB-DCAQKATOSA-N Met-Leu-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O UROWNMBTQGGTHB-DCAQKATOSA-N 0.000 description 4
- IILAGWCGKJSBGB-IHRRRGAJSA-N Met-Phe-Asp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)O)C(=O)O)N IILAGWCGKJSBGB-IHRRRGAJSA-N 0.000 description 4
- 238000012408 PCR amplification Methods 0.000 description 4
- NEHSHYOUIWBYSA-DCPHZVHLSA-N Phe-Ala-Trp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC3=CC=CC=C3)N NEHSHYOUIWBYSA-DCPHZVHLSA-N 0.000 description 4
- QCHNRQQVLJYDSI-DLOVCJGASA-N Phe-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 QCHNRQQVLJYDSI-DLOVCJGASA-N 0.000 description 4
- WMGVYPPIMZPWPN-SRVKXCTJSA-N Phe-Asp-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N WMGVYPPIMZPWPN-SRVKXCTJSA-N 0.000 description 4
- KRYSMKKRRRWOCZ-QEWYBTABSA-N Phe-Ile-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O KRYSMKKRRRWOCZ-QEWYBTABSA-N 0.000 description 4
- YKUGPVXSDOOANW-KKUMJFAQSA-N Phe-Leu-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O YKUGPVXSDOOANW-KKUMJFAQSA-N 0.000 description 4
- YCCUXNNKXDGMAM-KKUMJFAQSA-N Phe-Leu-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YCCUXNNKXDGMAM-KKUMJFAQSA-N 0.000 description 4
- RYQWALWYQWBUKN-FHWLQOOXSA-N Phe-Phe-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O RYQWALWYQWBUKN-FHWLQOOXSA-N 0.000 description 4
- APMXLWHMIVWLLR-BZSNNMDCSA-N Phe-Tyr-Ser Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CO)C(O)=O)C1=CC=CC=C1 APMXLWHMIVWLLR-BZSNNMDCSA-N 0.000 description 4
- FZHBZMDRDASUHN-NAKRPEOUSA-N Pro-Ala-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1)C(O)=O FZHBZMDRDASUHN-NAKRPEOUSA-N 0.000 description 4
- CYQQWUPHIZVCNY-GUBZILKMSA-N Pro-Arg-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O CYQQWUPHIZVCNY-GUBZILKMSA-N 0.000 description 4
- VOZIBWWZSBIXQN-SRVKXCTJSA-N Pro-Glu-Lys Chemical compound NCCCC[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1)C(O)=O VOZIBWWZSBIXQN-SRVKXCTJSA-N 0.000 description 4
- SSWJYJHXQOYTSP-SRVKXCTJSA-N Pro-His-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(O)=O SSWJYJHXQOYTSP-SRVKXCTJSA-N 0.000 description 4
- APIAILHCTSBGLU-JYJNAYRXSA-N Pro-Met-Phe Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@@H]2CCCN2 APIAILHCTSBGLU-JYJNAYRXSA-N 0.000 description 4
- JFBJPBZSTMXGKL-JYJNAYRXSA-N Pro-Met-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JFBJPBZSTMXGKL-JYJNAYRXSA-N 0.000 description 4
- RCYUBVHMVUHEBM-RCWTZXSCSA-N Pro-Pro-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O RCYUBVHMVUHEBM-RCWTZXSCSA-N 0.000 description 4
- CHYAYDLYYIJCKY-OSUNSFLBSA-N Pro-Thr-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O CHYAYDLYYIJCKY-OSUNSFLBSA-N 0.000 description 4
- LZHHZYDPMZEMRX-STQMWFEESA-N Pro-Tyr-Gly Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O LZHHZYDPMZEMRX-STQMWFEESA-N 0.000 description 4
- YMTLKLXDFCSCNX-BYPYZUCNSA-N Ser-Gly-Gly Chemical compound OC[C@H](N)C(=O)NCC(=O)NCC(O)=O YMTLKLXDFCSCNX-BYPYZUCNSA-N 0.000 description 4
- YIUWWXVTYLANCJ-NAKRPEOUSA-N Ser-Ile-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O YIUWWXVTYLANCJ-NAKRPEOUSA-N 0.000 description 4
- PPNPDKGQRFSCAC-CIUDSAMLSA-N Ser-Lys-Asp Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)CO)C(=O)N[C@@H](CC(O)=O)C(O)=O PPNPDKGQRFSCAC-CIUDSAMLSA-N 0.000 description 4
- QJKPECIAWNNKIT-KKUMJFAQSA-N Ser-Lys-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O QJKPECIAWNNKIT-KKUMJFAQSA-N 0.000 description 4
- UPLYXVPQLJVWMM-KKUMJFAQSA-N Ser-Phe-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O UPLYXVPQLJVWMM-KKUMJFAQSA-N 0.000 description 4
- MQBTXMPQNCGSSZ-OSUNSFLBSA-N Thr-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)O)CCCN=C(N)N MQBTXMPQNCGSSZ-OSUNSFLBSA-N 0.000 description 4
- IRKWVRSEQFTGGV-VEVYYDQMSA-N Thr-Asn-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IRKWVRSEQFTGGV-VEVYYDQMSA-N 0.000 description 4
- LAFLAXHTDVNVEL-WDCWCFNPSA-N Thr-Gln-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O LAFLAXHTDVNVEL-WDCWCFNPSA-N 0.000 description 4
- DKDHTRVDOUZZTP-IFFSRLJSSA-N Thr-Gln-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)[C@@H](C)O)C(O)=O DKDHTRVDOUZZTP-IFFSRLJSSA-N 0.000 description 4
- XPNSAQMEAVSQRD-FBCQKBJTSA-N Thr-Gly-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)NCC(O)=O XPNSAQMEAVSQRD-FBCQKBJTSA-N 0.000 description 4
- BVOVIGCHYNFJBZ-JXUBOQSCSA-N Thr-Leu-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O BVOVIGCHYNFJBZ-JXUBOQSCSA-N 0.000 description 4
- UGFSAPWZBROURT-IXOXFDKPSA-N Thr-Phe-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CS)C(=O)O)N)O UGFSAPWZBROURT-IXOXFDKPSA-N 0.000 description 4
- VBMOVTMNHWPZJR-SUSMZKCASA-N Thr-Thr-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O VBMOVTMNHWPZJR-SUSMZKCASA-N 0.000 description 4
- LXXCHJKHJYRMIY-FQPOAREZSA-N Thr-Tyr-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O LXXCHJKHJYRMIY-FQPOAREZSA-N 0.000 description 4
- BZTSQFWJNJYZSX-JRQIVUDYSA-N Thr-Tyr-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O BZTSQFWJNJYZSX-JRQIVUDYSA-N 0.000 description 4
- QTQNGBOKNQNQLS-PMVMPFDFSA-N Trp-His-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CC3=CNC4=CC=CC=C43)N QTQNGBOKNQNQLS-PMVMPFDFSA-N 0.000 description 4
- DTPWXZXGFAHEKL-NWLDYVSISA-N Trp-Thr-Glu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O DTPWXZXGFAHEKL-NWLDYVSISA-N 0.000 description 4
- KDGFPPHLXCEQRN-STECZYCISA-N Tyr-Arg-Ile Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KDGFPPHLXCEQRN-STECZYCISA-N 0.000 description 4
- FBHBVXUBTYVCRU-BZSNNMDCSA-N Tyr-His-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CN=CN1 FBHBVXUBTYVCRU-BZSNNMDCSA-N 0.000 description 4
- PMHLLBKTDHQMCY-ULQDDVLXSA-N Tyr-Lys-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PMHLLBKTDHQMCY-ULQDDVLXSA-N 0.000 description 4
- VBFVQTPETKJCQW-RPTUDFQQSA-N Tyr-Phe-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VBFVQTPETKJCQW-RPTUDFQQSA-N 0.000 description 4
- RCMWNNJFKNDKQR-UFYCRDLUSA-N Tyr-Pro-Phe Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 RCMWNNJFKNDKQR-UFYCRDLUSA-N 0.000 description 4
- NZBSVMQZQMEUHI-WZLNRYEVSA-N Tyr-Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N NZBSVMQZQMEUHI-WZLNRYEVSA-N 0.000 description 4
- HZDQUVQEVVYDDA-ACRUOGEOSA-N Tyr-Tyr-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 HZDQUVQEVVYDDA-ACRUOGEOSA-N 0.000 description 4
- YODDULVCGFQRFZ-ZKWXMUAHSA-N Val-Asp-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O YODDULVCGFQRFZ-ZKWXMUAHSA-N 0.000 description 4
- VLDMQVZZWDOKQF-AUTRQRHGSA-N Val-Glu-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N VLDMQVZZWDOKQF-AUTRQRHGSA-N 0.000 description 4
- SZTTYWIUCGSURQ-AUTRQRHGSA-N Val-Glu-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SZTTYWIUCGSURQ-AUTRQRHGSA-N 0.000 description 4
- AEMPCGRFEZTWIF-IHRRRGAJSA-N Val-Leu-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O AEMPCGRFEZTWIF-IHRRRGAJSA-N 0.000 description 4
- ZHQWPWQNVRCXAX-XQQFMLRXSA-N Val-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N ZHQWPWQNVRCXAX-XQQFMLRXSA-N 0.000 description 4
- BTWMICVCQLKKNR-DCAQKATOSA-N Val-Leu-Ser Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C([O-])=O BTWMICVCQLKKNR-DCAQKATOSA-N 0.000 description 4
- JVGHIFMSFBZDHH-WPRPVWTQSA-N Val-Met-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)NCC(=O)O)N JVGHIFMSFBZDHH-WPRPVWTQSA-N 0.000 description 4
- VHIZXDZMTDVFGX-DCAQKATOSA-N Val-Ser-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](C(C)C)N VHIZXDZMTDVFGX-DCAQKATOSA-N 0.000 description 4
- UQMPYVLTQCGRSK-IFFSRLJSSA-N Val-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N)O UQMPYVLTQCGRSK-IFFSRLJSSA-N 0.000 description 4
- MIAZWUMFUURQNP-YDHLFZDLSA-N Val-Tyr-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N MIAZWUMFUURQNP-YDHLFZDLSA-N 0.000 description 4
- AOILQMZPNLUXCM-AVGNSLFASA-N Val-Val-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN AOILQMZPNLUXCM-AVGNSLFASA-N 0.000 description 4
- 230000004913 activation Effects 0.000 description 4
- 108010024078 alanyl-glycyl-serine Proteins 0.000 description 4
- 108010087924 alanylproline Proteins 0.000 description 4
- 230000003321 amplification Effects 0.000 description 4
- 108010001271 arginyl-glutamyl-arginine Proteins 0.000 description 4
- 238000006243 chemical reaction Methods 0.000 description 4
- 238000004520 electroporation Methods 0.000 description 4
- 238000000605 extraction Methods 0.000 description 4
- 108010006664 gamma-glutamyl-glycyl-glycine Proteins 0.000 description 4
- 108010079547 glutamylmethionine Proteins 0.000 description 4
- 108010026364 glycyl-glycyl-leucine Proteins 0.000 description 4
- 108010015792 glycyllysine Proteins 0.000 description 4
- 108010081551 glycylphenylalanine Proteins 0.000 description 4
- 108010044374 isoleucyl-tyrosine Proteins 0.000 description 4
- 108010027338 isoleucylcysteine Proteins 0.000 description 4
- 108010045397 lysyl-tyrosyl-lysine Proteins 0.000 description 4
- 238000003199 nucleic acid amplification method Methods 0.000 description 4
- 108010025488 pinealon Proteins 0.000 description 4
- 229920001184 polypeptide Polymers 0.000 description 4
- 108090000765 processed proteins & peptides Proteins 0.000 description 4
- 102000004196 processed proteins & peptides Human genes 0.000 description 4
- VOUUHEHYSHWUHG-UWVGGRQHSA-N (2s)-2-[[2-[[2-[[2-[[(2s)-2-[[2-[[2-[(2-aminoacetyl)amino]acetyl]amino]acetyl]amino]-3-hydroxypropanoyl]amino]acetyl]amino]acetyl]amino]acetyl]amino]-3-hydroxypropanoic acid Chemical compound NCC(=O)NCC(=O)NCC(=O)N[C@@H](CO)C(=O)NCC(=O)NCC(=O)NCC(=O)N[C@@H](CO)C(O)=O VOUUHEHYSHWUHG-UWVGGRQHSA-N 0.000 description 3
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 3
- JBGSZRYCXBPWGX-BQBZGAKWSA-N Ala-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CCCN=C(N)N JBGSZRYCXBPWGX-BQBZGAKWSA-N 0.000 description 3
- CVGNCMIULZNYES-WHFBIAKZSA-N Ala-Asn-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O CVGNCMIULZNYES-WHFBIAKZSA-N 0.000 description 3
- FXKNPWNXPQZLES-ZLUOBGJFSA-N Ala-Asn-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O FXKNPWNXPQZLES-ZLUOBGJFSA-N 0.000 description 3
- QHASENCZLDHBGX-ONGXEEELSA-N Ala-Gly-Phe Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QHASENCZLDHBGX-ONGXEEELSA-N 0.000 description 3
- NBTGEURICRTMGL-WHFBIAKZSA-N Ala-Gly-Ser Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O NBTGEURICRTMGL-WHFBIAKZSA-N 0.000 description 3
- NIZKGBJVCMRDKO-KWQFWETISA-N Ala-Gly-Tyr Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NIZKGBJVCMRDKO-KWQFWETISA-N 0.000 description 3
- FDAZDMAFZYTHGS-XVYDVKMFSA-N Ala-His-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(O)=O FDAZDMAFZYTHGS-XVYDVKMFSA-N 0.000 description 3
- MNZHHDPWDWQJCQ-YUMQZZPRSA-N Ala-Leu-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O MNZHHDPWDWQJCQ-YUMQZZPRSA-N 0.000 description 3
- AJBVYEYZVYPFCF-CIUDSAMLSA-N Ala-Lys-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O AJBVYEYZVYPFCF-CIUDSAMLSA-N 0.000 description 3
- IPZQNYYAYVRKKK-FXQIFTODSA-N Ala-Pro-Ala Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O IPZQNYYAYVRKKK-FXQIFTODSA-N 0.000 description 3
- OMCKWYSDUQBYCN-FXQIFTODSA-N Ala-Ser-Met Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O OMCKWYSDUQBYCN-FXQIFTODSA-N 0.000 description 3
- OEVCHROQUIVQFZ-YTLHQDLWSA-N Ala-Thr-Ala Chemical compound C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](C)C(O)=O OEVCHROQUIVQFZ-YTLHQDLWSA-N 0.000 description 3
- ZJLORAAXDAJLDC-CQDKDKBSSA-N Ala-Tyr-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O ZJLORAAXDAJLDC-CQDKDKBSSA-N 0.000 description 3
- SQKPKIJVWHAWNF-DCAQKATOSA-N Arg-Asp-Lys Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(O)=O SQKPKIJVWHAWNF-DCAQKATOSA-N 0.000 description 3
- OGUPCHKBOKJFMA-SRVKXCTJSA-N Arg-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N OGUPCHKBOKJFMA-SRVKXCTJSA-N 0.000 description 3
- DJAIOAKQIOGULM-DCAQKATOSA-N Arg-Glu-Met Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O DJAIOAKQIOGULM-DCAQKATOSA-N 0.000 description 3
- QKSAZKCRVQYYGS-UWVGGRQHSA-N Arg-Gly-His Chemical compound N[C@@H](CCCN=C(N)N)C(=O)NCC(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O QKSAZKCRVQYYGS-UWVGGRQHSA-N 0.000 description 3
- CRCCTGPNZUCAHE-DCAQKATOSA-N Arg-His-Ser Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CN=CN1 CRCCTGPNZUCAHE-DCAQKATOSA-N 0.000 description 3
- PAPSMOYMQDWIOR-AVGNSLFASA-N Arg-Lys-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PAPSMOYMQDWIOR-AVGNSLFASA-N 0.000 description 3
- ASQKVGRCKOFKIU-KZVJFYERSA-N Arg-Thr-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O ASQKVGRCKOFKIU-KZVJFYERSA-N 0.000 description 3
- SLKLLQWZQHXYSV-CIUDSAMLSA-N Asn-Ala-Lys Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O SLKLLQWZQHXYSV-CIUDSAMLSA-N 0.000 description 3
- QEYJFBMTSMLPKZ-ZKWXMUAHSA-N Asn-Ala-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O QEYJFBMTSMLPKZ-ZKWXMUAHSA-N 0.000 description 3
- XHFXZQHTLJVZBN-FXQIFTODSA-N Asn-Arg-Asn Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N)CN=C(N)N XHFXZQHTLJVZBN-FXQIFTODSA-N 0.000 description 3
- QISZHYWZHJRDAO-CIUDSAMLSA-N Asn-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N QISZHYWZHJRDAO-CIUDSAMLSA-N 0.000 description 3
- ASCGFDYEKSRNPL-CIUDSAMLSA-N Asn-Glu-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O ASCGFDYEKSRNPL-CIUDSAMLSA-N 0.000 description 3
- UDSVWSUXKYXSTR-QWRGUYRKSA-N Asn-Gly-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O UDSVWSUXKYXSTR-QWRGUYRKSA-N 0.000 description 3
- WIDVAWAQBRAKTI-YUMQZZPRSA-N Asn-Leu-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O WIDVAWAQBRAKTI-YUMQZZPRSA-N 0.000 description 3
- JLNFZLNDHONLND-GARJFASQSA-N Asn-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N JLNFZLNDHONLND-GARJFASQSA-N 0.000 description 3
- HZZIFFOVHLWGCS-KKUMJFAQSA-N Asn-Phe-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O HZZIFFOVHLWGCS-KKUMJFAQSA-N 0.000 description 3
- HPNDKUOLNRVRAY-BIIVOSGPSA-N Asn-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC(=O)N)N)C(=O)O HPNDKUOLNRVRAY-BIIVOSGPSA-N 0.000 description 3
- HPASIOLTWSNMFB-OLHMAJIHSA-N Asn-Thr-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O HPASIOLTWSNMFB-OLHMAJIHSA-N 0.000 description 3
- QUMKPKWYDVMGNT-NUMRIWBASA-N Asn-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O QUMKPKWYDVMGNT-NUMRIWBASA-N 0.000 description 3
- GHWWTICYPDKPTE-NGZCFLSTSA-N Asn-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N GHWWTICYPDKPTE-NGZCFLSTSA-N 0.000 description 3
- VPPXTHJNTYDNFJ-CIUDSAMLSA-N Asp-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N VPPXTHJNTYDNFJ-CIUDSAMLSA-N 0.000 description 3
- JDHOJQJMWBKHDB-CIUDSAMLSA-N Asp-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N JDHOJQJMWBKHDB-CIUDSAMLSA-N 0.000 description 3
- SNDBKTFJWVEVPO-WHFBIAKZSA-N Asp-Gly-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CO)C(O)=O SNDBKTFJWVEVPO-WHFBIAKZSA-N 0.000 description 3
- UJGRZQYSNYTCAX-SRVKXCTJSA-N Asp-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UJGRZQYSNYTCAX-SRVKXCTJSA-N 0.000 description 3
- GKWFMNNNYZHJHV-SRVKXCTJSA-N Asp-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC(O)=O GKWFMNNNYZHJHV-SRVKXCTJSA-N 0.000 description 3
- WQSXAPPYLGNMQL-IHRRRGAJSA-N Asp-Met-Tyr Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CC(=O)O)N WQSXAPPYLGNMQL-IHRRRGAJSA-N 0.000 description 3
- LIJXJYGRSRWLCJ-IHRRRGAJSA-N Asp-Phe-Arg Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O LIJXJYGRSRWLCJ-IHRRRGAJSA-N 0.000 description 3
- ZBYLEBZCVKLPCY-FXQIFTODSA-N Asp-Ser-Arg Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O ZBYLEBZCVKLPCY-FXQIFTODSA-N 0.000 description 3
- DRCOAZZDQRCGGP-GHCJXIJMSA-N Asp-Ser-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DRCOAZZDQRCGGP-GHCJXIJMSA-N 0.000 description 3
- DKQCWCQRAMAFLN-UBHSHLNASA-N Asp-Trp-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(O)=O)C(O)=O DKQCWCQRAMAFLN-UBHSHLNASA-N 0.000 description 3
- DTMLKCYOQKZXKZ-HJGDQZAQSA-N Gln-Arg-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DTMLKCYOQKZXKZ-HJGDQZAQSA-N 0.000 description 3
- ITZWDGBYBPUZRG-KBIXCLLPSA-N Gln-Ile-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O ITZWDGBYBPUZRG-KBIXCLLPSA-N 0.000 description 3
- LGIKBBLQVSWUGK-DCAQKATOSA-N Gln-Leu-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O LGIKBBLQVSWUGK-DCAQKATOSA-N 0.000 description 3
- XFAUJGNLHIGXET-AVGNSLFASA-N Gln-Leu-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O XFAUJGNLHIGXET-AVGNSLFASA-N 0.000 description 3
- IHSGESFHTMFHRB-GUBZILKMSA-N Gln-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(N)=O IHSGESFHTMFHRB-GUBZILKMSA-N 0.000 description 3
- KPNWAJMEMRCLAL-GUBZILKMSA-N Gln-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)N)N KPNWAJMEMRCLAL-GUBZILKMSA-N 0.000 description 3
- FITIQFSXXBKFFM-NRPADANISA-N Gln-Val-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O FITIQFSXXBKFFM-NRPADANISA-N 0.000 description 3
- FYBSCGZLICNOBA-XQXXSGGOSA-N Glu-Ala-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FYBSCGZLICNOBA-XQXXSGGOSA-N 0.000 description 3
- NLKVNZUFDPWPNL-YUMQZZPRSA-N Glu-Arg-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O NLKVNZUFDPWPNL-YUMQZZPRSA-N 0.000 description 3
- PBEQPAZRHDVJQI-SRVKXCTJSA-N Glu-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCC(=O)O)N PBEQPAZRHDVJQI-SRVKXCTJSA-N 0.000 description 3
- WOSRKEJQESVHGA-CIUDSAMLSA-N Glu-Arg-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O WOSRKEJQESVHGA-CIUDSAMLSA-N 0.000 description 3
- VAZZOGXDUQSVQF-NUMRIWBASA-N Glu-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N)O VAZZOGXDUQSVQF-NUMRIWBASA-N 0.000 description 3
- NTBDVNJIWCKURJ-ACZMJKKPSA-N Glu-Asp-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O NTBDVNJIWCKURJ-ACZMJKKPSA-N 0.000 description 3
- JVSBYEDSSRZQGV-GUBZILKMSA-N Glu-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCC(O)=O JVSBYEDSSRZQGV-GUBZILKMSA-N 0.000 description 3
- OXEMJGCAJFFREE-FXQIFTODSA-N Glu-Gln-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O OXEMJGCAJFFREE-FXQIFTODSA-N 0.000 description 3
- XHUCVVHRLNPZSZ-CIUDSAMLSA-N Glu-Gln-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XHUCVVHRLNPZSZ-CIUDSAMLSA-N 0.000 description 3
- LVCHEMOPBORRLB-DCAQKATOSA-N Glu-Gln-Lys Chemical compound NCCCC[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCC(O)=O)C(O)=O LVCHEMOPBORRLB-DCAQKATOSA-N 0.000 description 3
- YLJHCWNDBKKOEB-IHRRRGAJSA-N Glu-Glu-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O YLJHCWNDBKKOEB-IHRRRGAJSA-N 0.000 description 3
- CUXJIASLBRJOFV-LAEOZQHASA-N Glu-Gly-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O CUXJIASLBRJOFV-LAEOZQHASA-N 0.000 description 3
- NJPQBTJSYCKCNS-HVTMNAMFSA-N Glu-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCC(=O)O)N NJPQBTJSYCKCNS-HVTMNAMFSA-N 0.000 description 3
- IRXNJYPKBVERCW-DCAQKATOSA-N Glu-Leu-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IRXNJYPKBVERCW-DCAQKATOSA-N 0.000 description 3
- NWOUBJNMZDDGDT-AVGNSLFASA-N Glu-Leu-His Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 NWOUBJNMZDDGDT-AVGNSLFASA-N 0.000 description 3
- HRBYTAIBKPNZKQ-AVGNSLFASA-N Glu-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(O)=O HRBYTAIBKPNZKQ-AVGNSLFASA-N 0.000 description 3
- HZISRJBYZAODRV-XQXXSGGOSA-N Glu-Thr-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O HZISRJBYZAODRV-XQXXSGGOSA-N 0.000 description 3
- JVYNYWXHZWVJEF-NUMRIWBASA-N Glu-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O JVYNYWXHZWVJEF-NUMRIWBASA-N 0.000 description 3
- MFYLRRCYBBJYPI-JYJNAYRXSA-N Glu-Tyr-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O MFYLRRCYBBJYPI-JYJNAYRXSA-N 0.000 description 3
- JXYMPBCYRKWJEE-BQBZGAKWSA-N Gly-Arg-Ala Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O JXYMPBCYRKWJEE-BQBZGAKWSA-N 0.000 description 3
- BGVYNAQWHSTTSP-BYULHYEWSA-N Gly-Asn-Ile Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BGVYNAQWHSTTSP-BYULHYEWSA-N 0.000 description 3
- FMNHBTKMRFVGRO-FOHZUACHSA-N Gly-Asn-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)CN FMNHBTKMRFVGRO-FOHZUACHSA-N 0.000 description 3
- CCQOOWAONKGYKQ-BYPYZUCNSA-N Gly-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)CN CCQOOWAONKGYKQ-BYPYZUCNSA-N 0.000 description 3
- PAWIVEIWWYGBAM-YUMQZZPRSA-N Gly-Leu-Ala Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O PAWIVEIWWYGBAM-YUMQZZPRSA-N 0.000 description 3
- BBTCXWTXOXUNFX-IUCAKERBSA-N Gly-Met-Arg Chemical compound CSCC[C@H](NC(=O)CN)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O BBTCXWTXOXUNFX-IUCAKERBSA-N 0.000 description 3
- HFPVRZWORNJRRC-UWVGGRQHSA-N Gly-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN HFPVRZWORNJRRC-UWVGGRQHSA-N 0.000 description 3
- CSMYMGFCEJWALV-WDSKDSINSA-N Gly-Ser-Gln Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(N)=O CSMYMGFCEJWALV-WDSKDSINSA-N 0.000 description 3
- VNNRLUNBJSWZPF-ZKWXMUAHSA-N Gly-Ser-Ile Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VNNRLUNBJSWZPF-ZKWXMUAHSA-N 0.000 description 3
- FFJQHWKSGAWSTJ-BFHQHQDPSA-N Gly-Thr-Ala Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O FFJQHWKSGAWSTJ-BFHQHQDPSA-N 0.000 description 3
- PNUFMLXHOLFRLD-KBPBESRZSA-N Gly-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 PNUFMLXHOLFRLD-KBPBESRZSA-N 0.000 description 3
- 102100039894 Hemoglobin subunit delta Human genes 0.000 description 3
- 102100030826 Hemoglobin subunit epsilon Human genes 0.000 description 3
- DVHGLDYMGWTYKW-GUBZILKMSA-N His-Gln-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O DVHGLDYMGWTYKW-GUBZILKMSA-N 0.000 description 3
- IDQNVIWPPWAFSY-AVGNSLFASA-N His-His-Gln Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(O)=O IDQNVIWPPWAFSY-AVGNSLFASA-N 0.000 description 3
- TTYKEFZRLKQTHH-MELADBBJSA-N His-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CC2=CN=CN2)N)C(=O)O TTYKEFZRLKQTHH-MELADBBJSA-N 0.000 description 3
- KFQDSSNYWKZFOO-LSJOCFKGSA-N His-Val-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O KFQDSSNYWKZFOO-LSJOCFKGSA-N 0.000 description 3
- 241000282412 Homo Species 0.000 description 3
- YKRYHWJRQUSTKG-KBIXCLLPSA-N Ile-Ala-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N YKRYHWJRQUSTKG-KBIXCLLPSA-N 0.000 description 3
- VAXBXNPRXPHGHG-BJDJZHNGSA-N Ile-Ala-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)O)N VAXBXNPRXPHGHG-BJDJZHNGSA-N 0.000 description 3
- YKRIXHPEIZUDDY-GMOBBJLQSA-N Ile-Asn-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YKRIXHPEIZUDDY-GMOBBJLQSA-N 0.000 description 3
- QYZYJFXHXYUZMZ-UGYAYLCHSA-N Ile-Asn-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N QYZYJFXHXYUZMZ-UGYAYLCHSA-N 0.000 description 3
- DFJJAVZIHDFOGQ-MNXVOIDGSA-N Ile-Glu-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N DFJJAVZIHDFOGQ-MNXVOIDGSA-N 0.000 description 3
- PDTMWFVVNZYWTR-NHCYSSNCSA-N Ile-Gly-Lys Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@@H](CCCCN)C(O)=O PDTMWFVVNZYWTR-NHCYSSNCSA-N 0.000 description 3
- SJLVSMMIFYTSGY-GRLWGSQLSA-N Ile-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N SJLVSMMIFYTSGY-GRLWGSQLSA-N 0.000 description 3
- PHRWFSFCNJPWRO-PPCPHDFISA-N Ile-Leu-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N PHRWFSFCNJPWRO-PPCPHDFISA-N 0.000 description 3
- IDMNOFVUXYYZPF-DKIMLUQUSA-N Ile-Lys-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N IDMNOFVUXYYZPF-DKIMLUQUSA-N 0.000 description 3
- GLYJPWIRLBAIJH-FQUUOJAGSA-N Ile-Lys-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N GLYJPWIRLBAIJH-FQUUOJAGSA-N 0.000 description 3
- RVNOXPZHMUWCLW-GMOBBJLQSA-N Ile-Met-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)N)C(=O)O)N RVNOXPZHMUWCLW-GMOBBJLQSA-N 0.000 description 3
- FGBRXCZYVRFNKQ-MXAVVETBSA-N Ile-Phe-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)O)N FGBRXCZYVRFNKQ-MXAVVETBSA-N 0.000 description 3
- PZWBBXHHUSIGKH-OSUNSFLBSA-N Ile-Thr-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PZWBBXHHUSIGKH-OSUNSFLBSA-N 0.000 description 3
- YCKPUHHMCFSUMD-IUKAMOBKSA-N Ile-Thr-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YCKPUHHMCFSUMD-IUKAMOBKSA-N 0.000 description 3
- YBKKLDBBPFIXBQ-MBLNEYKQSA-N Ile-Thr-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(=O)O)N YBKKLDBBPFIXBQ-MBLNEYKQSA-N 0.000 description 3
- CQQGCWPXDHTTNF-GUBZILKMSA-N Leu-Ala-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O CQQGCWPXDHTTNF-GUBZILKMSA-N 0.000 description 3
- YOZCKMXHBYKOMQ-IHRRRGAJSA-N Leu-Arg-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N YOZCKMXHBYKOMQ-IHRRRGAJSA-N 0.000 description 3
- IGUOAYLTQJLPPD-DCAQKATOSA-N Leu-Asn-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IGUOAYLTQJLPPD-DCAQKATOSA-N 0.000 description 3
- ILJREDZFPHTUIE-GUBZILKMSA-N Leu-Asp-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ILJREDZFPHTUIE-GUBZILKMSA-N 0.000 description 3
- JQSXWJXBASFONF-KKUMJFAQSA-N Leu-Asp-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JQSXWJXBASFONF-KKUMJFAQSA-N 0.000 description 3
- ZTLGVASZOIKNIX-DCAQKATOSA-N Leu-Gln-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZTLGVASZOIKNIX-DCAQKATOSA-N 0.000 description 3
- LOLUPZNNADDTAA-AVGNSLFASA-N Leu-Gln-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LOLUPZNNADDTAA-AVGNSLFASA-N 0.000 description 3
- CIVKXGPFXDIQBV-WDCWCFNPSA-N Leu-Gln-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CIVKXGPFXDIQBV-WDCWCFNPSA-N 0.000 description 3
- YVKSMSDXKMSIRX-GUBZILKMSA-N Leu-Glu-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O YVKSMSDXKMSIRX-GUBZILKMSA-N 0.000 description 3
- WIDZHJTYKYBLSR-DCAQKATOSA-N Leu-Glu-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WIDZHJTYKYBLSR-DCAQKATOSA-N 0.000 description 3
- VWHGTYCRDRBSFI-ZETCQYMHSA-N Leu-Gly-Gly Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)NCC(O)=O VWHGTYCRDRBSFI-ZETCQYMHSA-N 0.000 description 3
- AVEGDIAXTDVBJS-XUXIUFHCSA-N Leu-Ile-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AVEGDIAXTDVBJS-XUXIUFHCSA-N 0.000 description 3
- HNDWYLYAYNBWMP-AJNGGQMLSA-N Leu-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(C)C)N HNDWYLYAYNBWMP-AJNGGQMLSA-N 0.000 description 3
- AIRUUHAOKGVJAD-JYJNAYRXSA-N Leu-Phe-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIRUUHAOKGVJAD-JYJNAYRXSA-N 0.000 description 3
- SYRTUBLKWNDSDK-DKIMLUQUSA-N Leu-Phe-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O SYRTUBLKWNDSDK-DKIMLUQUSA-N 0.000 description 3
- UCBPDSYUVAAHCD-UWVGGRQHSA-N Leu-Pro-Gly Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O UCBPDSYUVAAHCD-UWVGGRQHSA-N 0.000 description 3
- IRMLZWSRWSGTOP-CIUDSAMLSA-N Leu-Ser-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O IRMLZWSRWSGTOP-CIUDSAMLSA-N 0.000 description 3
- AKVBOOKXVAMKSS-GUBZILKMSA-N Leu-Ser-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O AKVBOOKXVAMKSS-GUBZILKMSA-N 0.000 description 3
- SBANPBVRHYIMRR-UHFFFAOYSA-N Leu-Ser-Pro Natural products CC(C)CC(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O SBANPBVRHYIMRR-UHFFFAOYSA-N 0.000 description 3
- KLSUAWUZBMAZCL-RHYQMDGZSA-N Leu-Thr-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(O)=O KLSUAWUZBMAZCL-RHYQMDGZSA-N 0.000 description 3
- VKVDRTGWLVZJOM-DCAQKATOSA-N Leu-Val-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O VKVDRTGWLVZJOM-DCAQKATOSA-N 0.000 description 3
- NCTDKZKNBDZDOL-GARJFASQSA-N Lys-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N)C(=O)O NCTDKZKNBDZDOL-GARJFASQSA-N 0.000 description 3
- KPJJOZUXFOLGMQ-CIUDSAMLSA-N Lys-Asp-Asn Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N KPJJOZUXFOLGMQ-CIUDSAMLSA-N 0.000 description 3
- IWWMPCPLFXFBAF-SRVKXCTJSA-N Lys-Asp-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O IWWMPCPLFXFBAF-SRVKXCTJSA-N 0.000 description 3
- QIJVAFLRMVBHMU-KKUMJFAQSA-N Lys-Asp-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QIJVAFLRMVBHMU-KKUMJFAQSA-N 0.000 description 3
- DCRWPTBMWMGADO-AVGNSLFASA-N Lys-Glu-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O DCRWPTBMWMGADO-AVGNSLFASA-N 0.000 description 3
- PLDJDCJLRCYPJB-VOAKCMCISA-N Lys-Lys-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PLDJDCJLRCYPJB-VOAKCMCISA-N 0.000 description 3
- BOJYMMBYBNOOGG-DCAQKATOSA-N Lys-Pro-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O BOJYMMBYBNOOGG-DCAQKATOSA-N 0.000 description 3
- MGKFCQFVPKOWOL-CIUDSAMLSA-N Lys-Ser-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)O)C(=O)O)N MGKFCQFVPKOWOL-CIUDSAMLSA-N 0.000 description 3
- DLCAXBGXGOVUCD-PPCPHDFISA-N Lys-Thr-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DLCAXBGXGOVUCD-PPCPHDFISA-N 0.000 description 3
- MSSJHBAKDDIRMJ-SRVKXCTJSA-N Met-Lys-Gln Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O MSSJHBAKDDIRMJ-SRVKXCTJSA-N 0.000 description 3
- 241000699670 Mus sp. Species 0.000 description 3
- PESQCPHRXOFIPX-UHFFFAOYSA-N N-L-methionyl-L-tyrosine Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 PESQCPHRXOFIPX-UHFFFAOYSA-N 0.000 description 3
- BKWJQWJPZMUWEG-LFSVMHDDSA-N Phe-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 BKWJQWJPZMUWEG-LFSVMHDDSA-N 0.000 description 3
- LGBVMDMZZFYSFW-HJWJTTGWSA-N Phe-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC1=CC=CC=C1)N LGBVMDMZZFYSFW-HJWJTTGWSA-N 0.000 description 3
- ZENDEDYRYVHBEG-SRVKXCTJSA-N Phe-Asp-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 ZENDEDYRYVHBEG-SRVKXCTJSA-N 0.000 description 3
- WIVCOAKLPICYGY-KKUMJFAQSA-N Phe-Asp-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N WIVCOAKLPICYGY-KKUMJFAQSA-N 0.000 description 3
- SWCOXQLDICUYOL-ULQDDVLXSA-N Phe-His-Arg Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SWCOXQLDICUYOL-ULQDDVLXSA-N 0.000 description 3
- KXUZHWXENMYOHC-QEJZJMRPSA-N Phe-Leu-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O KXUZHWXENMYOHC-QEJZJMRPSA-N 0.000 description 3
- KZRQONDKKJCAOL-DKIMLUQUSA-N Phe-Leu-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KZRQONDKKJCAOL-DKIMLUQUSA-N 0.000 description 3
- LRBSWBVUCLLRLU-BZSNNMDCSA-N Phe-Leu-Lys Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)Cc1ccccc1)C(=O)N[C@@H](CCCCN)C(O)=O LRBSWBVUCLLRLU-BZSNNMDCSA-N 0.000 description 3
- PTLMYJOMJLTMCB-KKUMJFAQSA-N Phe-Met-Gln Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N PTLMYJOMJLTMCB-KKUMJFAQSA-N 0.000 description 3
- RBRNEFJTEHPDSL-ACRUOGEOSA-N Phe-Phe-Lys Chemical compound C([C@@H](C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 RBRNEFJTEHPDSL-ACRUOGEOSA-N 0.000 description 3
- UAYHMOIGIQZLFR-NHCYSSNCSA-N Pro-Gln-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O UAYHMOIGIQZLFR-NHCYSSNCSA-N 0.000 description 3
- SOACYAXADBWDDT-CYDGBPFRSA-N Pro-Ile-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SOACYAXADBWDDT-CYDGBPFRSA-N 0.000 description 3
- BBFRBZYKHIKFBX-GMOBBJLQSA-N Pro-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@@H]1CCCN1 BBFRBZYKHIKFBX-GMOBBJLQSA-N 0.000 description 3
- BCNRNJWSRFDPTQ-HJWJTTGWSA-N Pro-Ile-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O BCNRNJWSRFDPTQ-HJWJTTGWSA-N 0.000 description 3
- BRJGUPWVFXKBQI-XUXIUFHCSA-N Pro-Leu-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BRJGUPWVFXKBQI-XUXIUFHCSA-N 0.000 description 3
- FKYKZHOKDOPHSA-DCAQKATOSA-N Pro-Leu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O FKYKZHOKDOPHSA-DCAQKATOSA-N 0.000 description 3
- DWGFLKQSGRUQTI-IHRRRGAJSA-N Pro-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H]1CCCN1 DWGFLKQSGRUQTI-IHRRRGAJSA-N 0.000 description 3
- 238000011529 RT qPCR Methods 0.000 description 3
- XXXAXOWMBOKTRN-XPUUQOCRSA-N Ser-Gly-Val Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O XXXAXOWMBOKTRN-XPUUQOCRSA-N 0.000 description 3
- WEQAYODCJHZSJZ-KKUMJFAQSA-N Ser-His-Tyr Chemical compound C([C@H](NC(=O)[C@H](CO)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CN=CN1 WEQAYODCJHZSJZ-KKUMJFAQSA-N 0.000 description 3
- RIAKPZVSNBBNRE-BJDJZHNGSA-N Ser-Ile-Leu Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O RIAKPZVSNBBNRE-BJDJZHNGSA-N 0.000 description 3
- VZQRNAYURWAEFE-KKUMJFAQSA-N Ser-Leu-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 VZQRNAYURWAEFE-KKUMJFAQSA-N 0.000 description 3
- GVMUJUPXFQFBBZ-GUBZILKMSA-N Ser-Lys-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O GVMUJUPXFQFBBZ-GUBZILKMSA-N 0.000 description 3
- GDUZTEQRAOXYJS-SRVKXCTJSA-N Ser-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CO)N GDUZTEQRAOXYJS-SRVKXCTJSA-N 0.000 description 3
- BUYHXYIUQUBEQP-AVGNSLFASA-N Ser-Phe-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CO)N BUYHXYIUQUBEQP-AVGNSLFASA-N 0.000 description 3
- FZXOPYUEQGDGMS-ACZMJKKPSA-N Ser-Ser-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O FZXOPYUEQGDGMS-ACZMJKKPSA-N 0.000 description 3
- NAXBBCLCEOTAIG-RHYQMDGZSA-N Thr-Arg-Lys Chemical compound NC(N)=NCCC[C@H](NC(=O)[C@@H](N)[C@H](O)C)C(=O)N[C@@H](CCCCN)C(O)=O NAXBBCLCEOTAIG-RHYQMDGZSA-N 0.000 description 3
- FLPZMPOZGYPBEN-PPCPHDFISA-N Thr-Leu-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FLPZMPOZGYPBEN-PPCPHDFISA-N 0.000 description 3
- BIJDDZBDSJLWJY-PJODQICGSA-N Trp-Ala-Val Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O BIJDDZBDSJLWJY-PJODQICGSA-N 0.000 description 3
- HJTYJQVRIQXMHM-XIRDDKMYSA-N Trp-Asp-Lys Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N HJTYJQVRIQXMHM-XIRDDKMYSA-N 0.000 description 3
- YGKVNUAKYPGORG-AVGNSLFASA-N Tyr-Asp-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O YGKVNUAKYPGORG-AVGNSLFASA-N 0.000 description 3
- UABYBEBXFFNCIR-YDHLFZDLSA-N Tyr-Asp-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O UABYBEBXFFNCIR-YDHLFZDLSA-N 0.000 description 3
- AKLNEFNQWLHIGY-QWRGUYRKSA-N Tyr-Gly-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)O)N)O AKLNEFNQWLHIGY-QWRGUYRKSA-N 0.000 description 3
- FIRUOPRJKCBLST-KKUMJFAQSA-N Tyr-His-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O FIRUOPRJKCBLST-KKUMJFAQSA-N 0.000 description 3
- LFCQXIXJQXWZJI-BZSNNMDCSA-N Tyr-His-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)N)O LFCQXIXJQXWZJI-BZSNNMDCSA-N 0.000 description 3
- GULIUBBXCYPDJU-CQDKDKBSSA-N Tyr-Leu-Ala Chemical compound [O-]C(=O)[C@H](C)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]([NH3+])CC1=CC=C(O)C=C1 GULIUBBXCYPDJU-CQDKDKBSSA-N 0.000 description 3
- CWVHKVVKAQIJKY-ACRUOGEOSA-N Tyr-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC2=CC=C(C=C2)O)N CWVHKVVKAQIJKY-ACRUOGEOSA-N 0.000 description 3
- PSALWJCUIAQKFW-ACRUOGEOSA-N Tyr-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N PSALWJCUIAQKFW-ACRUOGEOSA-N 0.000 description 3
- QPOUERMDWKKZEG-HJPIBITLSA-N Tyr-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 QPOUERMDWKKZEG-HJPIBITLSA-N 0.000 description 3
- AKRHKDCELJLTMD-BVSLBCMMSA-N Tyr-Trp-Arg Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CC3=CC=C(C=C3)O)N AKRHKDCELJLTMD-BVSLBCMMSA-N 0.000 description 3
- ZLFHAAGHGQBQQN-GUBZILKMSA-N Val-Ala-Pro Natural products CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O ZLFHAAGHGQBQQN-GUBZILKMSA-N 0.000 description 3
- VUTHNLMCXKLLFI-LAEOZQHASA-N Val-Asp-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N VUTHNLMCXKLLFI-LAEOZQHASA-N 0.000 description 3
- TZVUSFMQWPWHON-NHCYSSNCSA-N Val-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N TZVUSFMQWPWHON-NHCYSSNCSA-N 0.000 description 3
- VCAWFLIWYNMHQP-UKJIMTQDSA-N Val-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N VCAWFLIWYNMHQP-UKJIMTQDSA-N 0.000 description 3
- VXDSPJJQUQDCKH-UKJIMTQDSA-N Val-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N VXDSPJJQUQDCKH-UKJIMTQDSA-N 0.000 description 3
- UMPVMAYCLYMYGA-ONGXEEELSA-N Val-Leu-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O UMPVMAYCLYMYGA-ONGXEEELSA-N 0.000 description 3
- SYSWVVCYSXBVJG-RHYQMDGZSA-N Val-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)N)O SYSWVVCYSXBVJG-RHYQMDGZSA-N 0.000 description 3
- IJGPOONOTBNTFS-GVXVVHGQSA-N Val-Lys-Glu Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O IJGPOONOTBNTFS-GVXVVHGQSA-N 0.000 description 3
- VPGCVZRRBYOGCD-AVGNSLFASA-N Val-Lys-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O VPGCVZRRBYOGCD-AVGNSLFASA-N 0.000 description 3
- UVHFONIHVHLDDQ-IFFSRLJSSA-N Val-Thr-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O UVHFONIHVHLDDQ-IFFSRLJSSA-N 0.000 description 3
- 238000007792 addition Methods 0.000 description 3
- 108010076324 alanyl-glycyl-glycine Proteins 0.000 description 3
- 108010011559 alanylphenylalanine Proteins 0.000 description 3
- 238000000137 annealing Methods 0.000 description 3
- 108010060035 arginylproline Proteins 0.000 description 3
- 108010068265 aspartyltyrosine Proteins 0.000 description 3
- 230000000295 complement effect Effects 0.000 description 3
- 238000012217 deletion Methods 0.000 description 3
- 230000037430 deletion Effects 0.000 description 3
- 238000010586 diagram Methods 0.000 description 3
- 108010078144 glutaminyl-glycine Proteins 0.000 description 3
- HPAIKDPJURGQLN-UHFFFAOYSA-N glycyl-L-histidyl-L-phenylalanine Natural products C=1C=CC=CC=1CC(C(O)=O)NC(=O)C(NC(=O)CN)CC1=CN=CN1 HPAIKDPJURGQLN-UHFFFAOYSA-N 0.000 description 3
- 108010059898 glycyl-tyrosyl-lysine Proteins 0.000 description 3
- 108010040030 histidinoalanine Proteins 0.000 description 3
- 108010030617 leucyl-phenylalanyl-valine Proteins 0.000 description 3
- 239000000463 material Substances 0.000 description 3
- 108010005942 methionylglycine Proteins 0.000 description 3
- 239000013642 negative control Substances 0.000 description 3
- 108010064486 phenylalanyl-leucyl-valine Proteins 0.000 description 3
- 239000013600 plasmid vector Substances 0.000 description 3
- 108010015840 seryl-prolyl-lysyl-lysine Proteins 0.000 description 3
- 238000004904 shortening Methods 0.000 description 3
- 238000006467 substitution reaction Methods 0.000 description 3
- PQFMROVJTOPVDF-JBDRJPRFSA-N (2s)-2-[[(2s)-2-[[(2s)-2-[[(2s)-2-amino-3-carboxypropanoyl]amino]-3-carboxypropanoyl]amino]-4-carboxybutanoyl]amino]butanedioic acid Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O PQFMROVJTOPVDF-JBDRJPRFSA-N 0.000 description 2
- FJVAQLJNTSUQPY-CIUDSAMLSA-N Ala-Ala-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN FJVAQLJNTSUQPY-CIUDSAMLSA-N 0.000 description 2
- LWUWMHIOBPTZBA-DCAQKATOSA-N Ala-Arg-Lys Chemical compound NC(=N)NCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCCN)C(O)=O LWUWMHIOBPTZBA-DCAQKATOSA-N 0.000 description 2
- VBRDBGCROKWTPV-XHNCKOQMSA-N Ala-Glu-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N VBRDBGCROKWTPV-XHNCKOQMSA-N 0.000 description 2
- SOBIAADAMRHGKH-CIUDSAMLSA-N Ala-Leu-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O SOBIAADAMRHGKH-CIUDSAMLSA-N 0.000 description 2
- XSTZMVAYYCJTNR-DCAQKATOSA-N Ala-Met-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O XSTZMVAYYCJTNR-DCAQKATOSA-N 0.000 description 2
- CYBJZLQSUJEMAS-LFSVMHDDSA-N Ala-Phe-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](C)N)O CYBJZLQSUJEMAS-LFSVMHDDSA-N 0.000 description 2
- OLVCTPPSXNRGKV-GUBZILKMSA-N Ala-Pro-Pro Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 OLVCTPPSXNRGKV-GUBZILKMSA-N 0.000 description 2
- WQKAQKZRDIZYNV-VZFHVOOUSA-N Ala-Ser-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WQKAQKZRDIZYNV-VZFHVOOUSA-N 0.000 description 2
- RCAUJZASOAFTAJ-FXQIFTODSA-N Arg-Asp-Cys Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CS)C(=O)O)N)CN=C(N)N RCAUJZASOAFTAJ-FXQIFTODSA-N 0.000 description 2
- VNFWDYWTSHFRRG-SRVKXCTJSA-N Arg-Gln-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O VNFWDYWTSHFRRG-SRVKXCTJSA-N 0.000 description 2
- RKQRHMKFNBYOTN-IHRRRGAJSA-N Arg-His-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N RKQRHMKFNBYOTN-IHRRRGAJSA-N 0.000 description 2
- OFIYLHVAAJYRBC-HJWJTTGWSA-N Arg-Ile-Phe Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](N)CCCNC(N)=N)C(=O)N[C@@H](Cc1ccccc1)C(O)=O OFIYLHVAAJYRBC-HJWJTTGWSA-N 0.000 description 2
- ZEBDYGZVMMKZNB-SRVKXCTJSA-N Arg-Met-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CCCN=C(N)N)N ZEBDYGZVMMKZNB-SRVKXCTJSA-N 0.000 description 2
- RYQSYXFGFOTJDJ-RHYQMDGZSA-N Arg-Thr-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O RYQSYXFGFOTJDJ-RHYQMDGZSA-N 0.000 description 2
- UVTGNSWSRSCPLP-UHFFFAOYSA-N Arg-Tyr Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccc(O)cc1)C(=O)O UVTGNSWSRSCPLP-UHFFFAOYSA-N 0.000 description 2
- BDMIFVIWCNLDCT-CIUDSAMLSA-N Asn-Arg-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O BDMIFVIWCNLDCT-CIUDSAMLSA-N 0.000 description 2
- KXFCBAHYSLJCCY-ZLUOBGJFSA-N Asn-Asn-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O KXFCBAHYSLJCCY-ZLUOBGJFSA-N 0.000 description 2
- DJIMLSXHXKWADV-CIUDSAMLSA-N Asn-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(N)=O DJIMLSXHXKWADV-CIUDSAMLSA-N 0.000 description 2
- UYCPJVYQYARFGB-YDHLFZDLSA-N Asn-Phe-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O UYCPJVYQYARFGB-YDHLFZDLSA-N 0.000 description 2
- UGXYFDQFLVCDFC-CIUDSAMLSA-N Asn-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O UGXYFDQFLVCDFC-CIUDSAMLSA-N 0.000 description 2
- DATSKXOXPUAOLK-KKUMJFAQSA-N Asn-Tyr-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O DATSKXOXPUAOLK-KKUMJFAQSA-N 0.000 description 2
- SDHFVYLZFBDSQT-DCAQKATOSA-N Asp-Arg-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)O)N SDHFVYLZFBDSQT-DCAQKATOSA-N 0.000 description 2
- ZELQAFZSJOBEQS-ACZMJKKPSA-N Asp-Asn-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZELQAFZSJOBEQS-ACZMJKKPSA-N 0.000 description 2
- RRKCPMGSRIDLNC-AVGNSLFASA-N Asp-Glu-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O RRKCPMGSRIDLNC-AVGNSLFASA-N 0.000 description 2
- HOBNTSHITVVNBN-ZPFDUUQYSA-N Asp-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N HOBNTSHITVVNBN-ZPFDUUQYSA-N 0.000 description 2
- LIVXPXUVXFRWNY-CIUDSAMLSA-N Asp-Lys-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O LIVXPXUVXFRWNY-CIUDSAMLSA-N 0.000 description 2
- LBOVBQONZJRWPV-YUMQZZPRSA-N Asp-Lys-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LBOVBQONZJRWPV-YUMQZZPRSA-N 0.000 description 2
- UAXIKORUDGGIGA-DCAQKATOSA-N Asp-Pro-Lys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC(=O)O)N)C(=O)N[C@@H](CCCCN)C(=O)O UAXIKORUDGGIGA-DCAQKATOSA-N 0.000 description 2
- VXEORMGBKTUUCM-KWBADKCTSA-N Asp-Val-Gly-Pro Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(=O)N1CCC[C@H]1C(O)=O VXEORMGBKTUUCM-KWBADKCTSA-N 0.000 description 2
- CURLTUGMZLYLDI-UHFFFAOYSA-N Carbon dioxide Chemical compound O=C=O CURLTUGMZLYLDI-UHFFFAOYSA-N 0.000 description 2
- FEJCUYOGOBCFOQ-ACZMJKKPSA-N Cys-Asp-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CS)N FEJCUYOGOBCFOQ-ACZMJKKPSA-N 0.000 description 2
- LBOLGUYQEPZSKM-YUMQZZPRSA-N Cys-Gly-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CS)N LBOLGUYQEPZSKM-YUMQZZPRSA-N 0.000 description 2
- KCPOQGRVVXYLAC-KKUMJFAQSA-N Cys-Leu-Phe Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CS)N KCPOQGRVVXYLAC-KKUMJFAQSA-N 0.000 description 2
- 102000016911 Deoxyribonucleases Human genes 0.000 description 2
- 108010053770 Deoxyribonucleases Proteins 0.000 description 2
- SHERTACNJPYHAR-ACZMJKKPSA-N Gln-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O SHERTACNJPYHAR-ACZMJKKPSA-N 0.000 description 2
- DXMPMSWUZVNBSG-QEJZJMRPSA-N Gln-Asn-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)N)N DXMPMSWUZVNBSG-QEJZJMRPSA-N 0.000 description 2
- KDXKFBSNIJYNNR-YVNDNENWSA-N Gln-Glu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KDXKFBSNIJYNNR-YVNDNENWSA-N 0.000 description 2
- ZNZPKVQURDQFFS-FXQIFTODSA-N Gln-Glu-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O ZNZPKVQURDQFFS-FXQIFTODSA-N 0.000 description 2
- IULKWYSYZSURJK-AVGNSLFASA-N Gln-Leu-Lys Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O IULKWYSYZSURJK-AVGNSLFASA-N 0.000 description 2
- ZBKUIQNCRIYVGH-SDDRHHMPSA-N Gln-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N ZBKUIQNCRIYVGH-SDDRHHMPSA-N 0.000 description 2
- VOUSELYGTNGEPB-NUMRIWBASA-N Gln-Thr-Asp Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O VOUSELYGTNGEPB-NUMRIWBASA-N 0.000 description 2
- KHHDJQRWIFHXHS-NRPADANISA-N Gln-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)N)N KHHDJQRWIFHXHS-NRPADANISA-N 0.000 description 2
- XXCDTYBVGMPIOA-FXQIFTODSA-N Glu-Asp-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XXCDTYBVGMPIOA-FXQIFTODSA-N 0.000 description 2
- DSPQRJXOIXHOHK-WDSKDSINSA-N Glu-Asp-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O DSPQRJXOIXHOHK-WDSKDSINSA-N 0.000 description 2
- WTMZXOPHTIVFCP-QEWYBTABSA-N Glu-Ile-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 WTMZXOPHTIVFCP-QEWYBTABSA-N 0.000 description 2
- VMKCPNBBPGGQBJ-GUBZILKMSA-N Glu-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N VMKCPNBBPGGQBJ-GUBZILKMSA-N 0.000 description 2
- MWMJCGBSIORNCD-AVGNSLFASA-N Glu-Leu-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O MWMJCGBSIORNCD-AVGNSLFASA-N 0.000 description 2
- DWBBKNPKDHXIAC-SRVKXCTJSA-N Glu-Leu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CCC(O)=O DWBBKNPKDHXIAC-SRVKXCTJSA-N 0.000 description 2
- CHDWDBPJOZVZSE-KKUMJFAQSA-N Glu-Phe-Met Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(O)=O CHDWDBPJOZVZSE-KKUMJFAQSA-N 0.000 description 2
- CQAHWYDHKUWYIX-YUMQZZPRSA-N Glu-Pro-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O CQAHWYDHKUWYIX-YUMQZZPRSA-N 0.000 description 2
- SWDNPSMMEWRNOH-HJGDQZAQSA-N Glu-Pro-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O SWDNPSMMEWRNOH-HJGDQZAQSA-N 0.000 description 2
- BPLNJYHNAJVLRT-ACZMJKKPSA-N Glu-Ser-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O BPLNJYHNAJVLRT-ACZMJKKPSA-N 0.000 description 2
- SOYWRINXUSUWEQ-DLOVCJGASA-N Glu-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O SOYWRINXUSUWEQ-DLOVCJGASA-N 0.000 description 2
- BEQGFMIBZFNROK-JGVFFNPUSA-N Gly-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)CN)C(=O)O BEQGFMIBZFNROK-JGVFFNPUSA-N 0.000 description 2
- GDOZQTNZPCUARW-YFKPBYRVSA-N Gly-Gly-Glu Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O GDOZQTNZPCUARW-YFKPBYRVSA-N 0.000 description 2
- WDEHMRNSGHVNOH-VHSXEESVSA-N Gly-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)CN)C(=O)O WDEHMRNSGHVNOH-VHSXEESVSA-N 0.000 description 2
- NTBOEZICHOSJEE-YUMQZZPRSA-N Gly-Lys-Ser Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O NTBOEZICHOSJEE-YUMQZZPRSA-N 0.000 description 2
- YPLYIXGKCRQZGW-SRVKXCTJSA-N His-Arg-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O YPLYIXGKCRQZGW-SRVKXCTJSA-N 0.000 description 2
- HYWZHNUGAYVEEW-KKUMJFAQSA-N His-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N HYWZHNUGAYVEEW-KKUMJFAQSA-N 0.000 description 2
- UPJODPVSKKWGDQ-KLHWPWHYSA-N His-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N)O UPJODPVSKKWGDQ-KLHWPWHYSA-N 0.000 description 2
- 206010020751 Hypersensitivity Diseases 0.000 description 2
- ASCFJMSGKUIRDU-ZPFDUUQYSA-N Ile-Arg-Gln Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O ASCFJMSGKUIRDU-ZPFDUUQYSA-N 0.000 description 2
- NHJKZMDIMMTVCK-QXEWZRGKSA-N Ile-Gly-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N NHJKZMDIMMTVCK-QXEWZRGKSA-N 0.000 description 2
- OUUCIIJSBIBCHB-ZPFDUUQYSA-N Ile-Leu-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O OUUCIIJSBIBCHB-ZPFDUUQYSA-N 0.000 description 2
- CKRFDMPBSWYOBT-PPCPHDFISA-N Ile-Lys-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N CKRFDMPBSWYOBT-PPCPHDFISA-N 0.000 description 2
- SVZFKLBRCYCIIY-CYDGBPFRSA-N Ile-Pro-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SVZFKLBRCYCIIY-CYDGBPFRSA-N 0.000 description 2
- BCISUQVFDGYZBO-QSFUFRPTSA-N Ile-Val-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(O)=O BCISUQVFDGYZBO-QSFUFRPTSA-N 0.000 description 2
- HGCNKOLVKRAVHD-UHFFFAOYSA-N L-Met-L-Phe Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 HGCNKOLVKRAVHD-UHFFFAOYSA-N 0.000 description 2
- MJOZZTKJZQFKDK-GUBZILKMSA-N Leu-Ala-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(N)=O MJOZZTKJZQFKDK-GUBZILKMSA-N 0.000 description 2
- YKNBJXOJTURHCU-DCAQKATOSA-N Leu-Asp-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YKNBJXOJTURHCU-DCAQKATOSA-N 0.000 description 2
- WCTCIIAGNMFYAO-DCAQKATOSA-N Leu-Cys-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(O)=O WCTCIIAGNMFYAO-DCAQKATOSA-N 0.000 description 2
- DBSLVQBXKVKDKJ-BJDJZHNGSA-N Leu-Ile-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O DBSLVQBXKVKDKJ-BJDJZHNGSA-N 0.000 description 2
- REPBGZHJKYWFMJ-KKUMJFAQSA-N Leu-Lys-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N REPBGZHJKYWFMJ-KKUMJFAQSA-N 0.000 description 2
- OVZLLFONXILPDZ-VOAKCMCISA-N Leu-Lys-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OVZLLFONXILPDZ-VOAKCMCISA-N 0.000 description 2
- KZZCOWMDDXDKSS-CIUDSAMLSA-N Leu-Ser-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O KZZCOWMDDXDKSS-CIUDSAMLSA-N 0.000 description 2
- SBANPBVRHYIMRR-GARJFASQSA-N Leu-Ser-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N SBANPBVRHYIMRR-GARJFASQSA-N 0.000 description 2
- YNNPKXBBRZVIRX-IHRRRGAJSA-N Lys-Arg-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O YNNPKXBBRZVIRX-IHRRRGAJSA-N 0.000 description 2
- HQVDJTYKCMIWJP-YUMQZZPRSA-N Lys-Asn-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O HQVDJTYKCMIWJP-YUMQZZPRSA-N 0.000 description 2
- FACUGMGEFUEBTI-SRVKXCTJSA-N Lys-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCCCN FACUGMGEFUEBTI-SRVKXCTJSA-N 0.000 description 2
- PHHYNOUOUWYQRO-XIRDDKMYSA-N Lys-Asp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCCN)N PHHYNOUOUWYQRO-XIRDDKMYSA-N 0.000 description 2
- WBSCNDJQPKSPII-KKUMJFAQSA-N Lys-Lys-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O WBSCNDJQPKSPII-KKUMJFAQSA-N 0.000 description 2
- YDDDRTIPNTWGIG-SRVKXCTJSA-N Lys-Lys-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O YDDDRTIPNTWGIG-SRVKXCTJSA-N 0.000 description 2
- OEYKVQKYCHATHO-SZMVWBNQSA-N Lys-Trp-Gln Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N OEYKVQKYCHATHO-SZMVWBNQSA-N 0.000 description 2
- SQRLLZAQNOQCEG-KKUMJFAQSA-N Lys-Tyr-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=C(O)C=C1 SQRLLZAQNOQCEG-KKUMJFAQSA-N 0.000 description 2
- VWJFOUBDZIUXGA-AVGNSLFASA-N Lys-Val-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CCCCN)N VWJFOUBDZIUXGA-AVGNSLFASA-N 0.000 description 2
- OXIWIYOJVNOKOV-SRVKXCTJSA-N Met-Met-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@H](C(O)=O)CCCNC(N)=N OXIWIYOJVNOKOV-SRVKXCTJSA-N 0.000 description 2
- LQTGGXSOMDSWTQ-UNQGMJICSA-N Met-Phe-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CCSC)N)O LQTGGXSOMDSWTQ-UNQGMJICSA-N 0.000 description 2
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 2
- BQVUABVGYYSDCJ-UHFFFAOYSA-N Nalpha-L-Leucyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)CC(C)C)C(O)=O)=CNC2=C1 BQVUABVGYYSDCJ-UHFFFAOYSA-N 0.000 description 2
- BBDSZDHUCPSYAC-QEJZJMRPSA-N Phe-Ala-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O BBDSZDHUCPSYAC-QEJZJMRPSA-N 0.000 description 2
- LDSOBEJVGGVWGD-DLOVCJGASA-N Phe-Asp-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 LDSOBEJVGGVWGD-DLOVCJGASA-N 0.000 description 2
- LXUJDHOKVUYHRC-KKUMJFAQSA-N Phe-Cys-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC1=CC=CC=C1)N LXUJDHOKVUYHRC-KKUMJFAQSA-N 0.000 description 2
- JEBWZLWTRPZQRX-QWRGUYRKSA-N Phe-Gly-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O JEBWZLWTRPZQRX-QWRGUYRKSA-N 0.000 description 2
- MYQCCQSMKNCNKY-KKUMJFAQSA-N Phe-His-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CO)C(=O)O)N MYQCCQSMKNCNKY-KKUMJFAQSA-N 0.000 description 2
- WEMYTDDMDBLPMI-DKIMLUQUSA-N Phe-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N WEMYTDDMDBLPMI-DKIMLUQUSA-N 0.000 description 2
- XQPHBAKJJJZOBX-SRVKXCTJSA-N Pro-Lys-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O XQPHBAKJJJZOBX-SRVKXCTJSA-N 0.000 description 2
- KLOQCCRTPHPIFN-DCAQKATOSA-N Pro-Met-Gln Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@@H]1CCCN1 KLOQCCRTPHPIFN-DCAQKATOSA-N 0.000 description 2
- SXJOPONICMGFCR-DCAQKATOSA-N Pro-Ser-Lys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O SXJOPONICMGFCR-DCAQKATOSA-N 0.000 description 2
- MKGIILKDUGDRRO-FXQIFTODSA-N Pro-Ser-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H]1CCCN1 MKGIILKDUGDRRO-FXQIFTODSA-N 0.000 description 2
- RMJZWERKFFNNNS-XGEHTFHBSA-N Pro-Thr-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O RMJZWERKFFNNNS-XGEHTFHBSA-N 0.000 description 2
- VQBLHWSPVYYZTB-DCAQKATOSA-N Ser-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CO)N VQBLHWSPVYYZTB-DCAQKATOSA-N 0.000 description 2
- MIJWOJAXARLEHA-WDSKDSINSA-N Ser-Gly-Glu Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O MIJWOJAXARLEHA-WDSKDSINSA-N 0.000 description 2
- LWMQRHDTXHQQOV-MXAVVETBSA-N Ser-Ile-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LWMQRHDTXHQQOV-MXAVVETBSA-N 0.000 description 2
- NLOAIFSWUUFQFR-CIUDSAMLSA-N Ser-Leu-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O NLOAIFSWUUFQFR-CIUDSAMLSA-N 0.000 description 2
- NNFMANHDYSVNIO-DCAQKATOSA-N Ser-Lys-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NNFMANHDYSVNIO-DCAQKATOSA-N 0.000 description 2
- PTWIYDNFWPXQSD-GARJFASQSA-N Ser-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CO)N)C(=O)O PTWIYDNFWPXQSD-GARJFASQSA-N 0.000 description 2
- NADLKBTYNKUJEP-KATARQTJSA-N Ser-Thr-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O NADLKBTYNKUJEP-KATARQTJSA-N 0.000 description 2
- UYLKOSODXYSWMQ-XGEHTFHBSA-N Ser-Thr-Met Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CO)N)O UYLKOSODXYSWMQ-XGEHTFHBSA-N 0.000 description 2
- 241000255588 Tephritidae Species 0.000 description 2
- LHUBVKCLOVALIA-HJGDQZAQSA-N Thr-Arg-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O LHUBVKCLOVALIA-HJGDQZAQSA-N 0.000 description 2
- GUZGCDIZVGODML-NKIYYHGXSA-N Thr-Gln-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N)O GUZGCDIZVGODML-NKIYYHGXSA-N 0.000 description 2
- LHEZGZQRLDBSRR-WDCWCFNPSA-N Thr-Glu-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LHEZGZQRLDBSRR-WDCWCFNPSA-N 0.000 description 2
- OQCXTUQTKQFDCX-HTUGSXCWSA-N Thr-Glu-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N)O OQCXTUQTKQFDCX-HTUGSXCWSA-N 0.000 description 2
- URPSJRMWHQTARR-MBLNEYKQSA-N Thr-Ile-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O URPSJRMWHQTARR-MBLNEYKQSA-N 0.000 description 2
- FIFDDJFLNVAVMS-RHYQMDGZSA-N Thr-Leu-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(O)=O FIFDDJFLNVAVMS-RHYQMDGZSA-N 0.000 description 2
- XKWABWFMQXMUMT-HJGDQZAQSA-N Thr-Pro-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O XKWABWFMQXMUMT-HJGDQZAQSA-N 0.000 description 2
- BDENGIGFTNYZSJ-RCWTZXSCSA-N Thr-Pro-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(O)=O BDENGIGFTNYZSJ-RCWTZXSCSA-N 0.000 description 2
- NHQVWACSJZJCGJ-FLBSBUHZSA-N Thr-Thr-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NHQVWACSJZJCGJ-FLBSBUHZSA-N 0.000 description 2
- JAWUQFCGNVEDRN-MEYUZBJRSA-N Thr-Tyr-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N)O JAWUQFCGNVEDRN-MEYUZBJRSA-N 0.000 description 2
- CYCGARJWIQWPQM-YJRXYDGGSA-N Thr-Tyr-Ser Chemical compound C[C@@H](O)[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CO)C([O-])=O)CC1=CC=C(O)C=C1 CYCGARJWIQWPQM-YJRXYDGGSA-N 0.000 description 2
- PWONLXBUSVIZPH-RHYQMDGZSA-N Thr-Val-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N)O PWONLXBUSVIZPH-RHYQMDGZSA-N 0.000 description 2
- COLXBVRHSKPKIE-NYVOZVTQSA-N Trp-Trp-Asp Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(O)=O)C(O)=O COLXBVRHSKPKIE-NYVOZVTQSA-N 0.000 description 2
- CDHQEOXPWBDFPL-QWRGUYRKSA-N Tyr-Gly-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=C(O)C=C1 CDHQEOXPWBDFPL-QWRGUYRKSA-N 0.000 description 2
- IGXLNVIYDYONFB-UFYCRDLUSA-N Tyr-Phe-Arg Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)C1=CC=C(O)C=C1 IGXLNVIYDYONFB-UFYCRDLUSA-N 0.000 description 2
- VXFXIBCCVLJCJT-JYJNAYRXSA-N Tyr-Pro-Pro Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)N1CCC[C@H]1C(O)=O VXFXIBCCVLJCJT-JYJNAYRXSA-N 0.000 description 2
- MQGGXGKQSVEQHR-KKUMJFAQSA-N Tyr-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 MQGGXGKQSVEQHR-KKUMJFAQSA-N 0.000 description 2
- VVZDBPBZHLQPPB-XVKPBYJWSA-N Val-Glu-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O VVZDBPBZHLQPPB-XVKPBYJWSA-N 0.000 description 2
- YMTOEGGOCHVGEH-IHRRRGAJSA-N Val-Lys-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O YMTOEGGOCHVGEH-IHRRRGAJSA-N 0.000 description 2
- MJFSRZZJQWZHFQ-SRVKXCTJSA-N Val-Met-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(=O)O)N MJFSRZZJQWZHFQ-SRVKXCTJSA-N 0.000 description 2
- ZXYPHBKIZLAQTL-QXEWZRGKSA-N Val-Pro-Asp Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)O)C(=O)O)N ZXYPHBKIZLAQTL-QXEWZRGKSA-N 0.000 description 2
- 101150063416 add gene Proteins 0.000 description 2
- 108010047495 alanylglycine Proteins 0.000 description 2
- 108010021908 aspartyl-aspartyl-glutamyl-aspartic acid Proteins 0.000 description 2
- 108010047857 aspartylglycine Proteins 0.000 description 2
- 238000001514 detection method Methods 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 230000018109 developmental process Effects 0.000 description 2
- 108060003196 globin Proteins 0.000 description 2
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 2
- 108010000434 glycyl-alanyl-leucine Proteins 0.000 description 2
- 239000001963 growth medium Substances 0.000 description 2
- 108010036413 histidylglycine Proteins 0.000 description 2
- 108010028295 histidylhistidine Proteins 0.000 description 2
- 108010025306 histidylleucine Proteins 0.000 description 2
- 108010018006 histidylserine Proteins 0.000 description 2
- 108010044348 lysyl-glutamyl-aspartic acid Proteins 0.000 description 2
- 230000001404 mediated effect Effects 0.000 description 2
- 108010068488 methionylphenylalanine Proteins 0.000 description 2
- 239000000203 mixture Substances 0.000 description 2
- 108010079317 prolyl-tyrosine Proteins 0.000 description 2
- RXWNCPJZOCPEPQ-NVWDDTSBSA-N puromycin Chemical compound C1=CC(OC)=CC=C1C[C@H](N)C(=O)N[C@H]1[C@@H](O)[C@H](N2C3=NC=NC(=C3N=C2)N(C)C)O[C@@H]1CO RXWNCPJZOCPEPQ-NVWDDTSBSA-N 0.000 description 2
- 108010048397 seryl-lysyl-leucine Proteins 0.000 description 2
- 108010033670 threonyl-aspartyl-tyrosine Proteins 0.000 description 2
- 108010031491 threonyl-lysyl-glutamic acid Proteins 0.000 description 2
- 238000001890 transfection Methods 0.000 description 2
- PAHHYDSPOXDASW-VGWMRTNUSA-N (2s)-6-amino-2-[[(2s)-6-amino-2-[[(2s)-1-[(2s)-2-amino-3-hydroxypropanoyl]pyrrolidine-2-carbonyl]amino]hexanoyl]amino]hexanoic acid Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO PAHHYDSPOXDASW-VGWMRTNUSA-N 0.000 description 1
- XEXJJJRVTFGWIC-FXQIFTODSA-N Ala-Asn-Arg Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N XEXJJJRVTFGWIC-FXQIFTODSA-N 0.000 description 1
- NHCPCLJZRSIDHS-ZLUOBGJFSA-N Ala-Asp-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O NHCPCLJZRSIDHS-ZLUOBGJFSA-N 0.000 description 1
- ZIWWTZWAKYBUOB-CIUDSAMLSA-N Ala-Asp-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O ZIWWTZWAKYBUOB-CIUDSAMLSA-N 0.000 description 1
- FVSOUJZKYWEFOB-KBIXCLLPSA-N Ala-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](C)N FVSOUJZKYWEFOB-KBIXCLLPSA-N 0.000 description 1
- BLGHHPHXVJWCNK-GUBZILKMSA-N Ala-Gln-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O BLGHHPHXVJWCNK-GUBZILKMSA-N 0.000 description 1
- YIGLXQRFQVWFEY-NRPADANISA-N Ala-Gln-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O YIGLXQRFQVWFEY-NRPADANISA-N 0.000 description 1
- FUSPCLTUKXQREV-ACZMJKKPSA-N Ala-Glu-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O FUSPCLTUKXQREV-ACZMJKKPSA-N 0.000 description 1
- NWVVKQZOVSTDBQ-CIUDSAMLSA-N Ala-Glu-Arg Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NWVVKQZOVSTDBQ-CIUDSAMLSA-N 0.000 description 1
- NJPMYXWVWQWCSR-ACZMJKKPSA-N Ala-Glu-Asn Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O NJPMYXWVWQWCSR-ACZMJKKPSA-N 0.000 description 1
- KXEVYGKATAMXJJ-ACZMJKKPSA-N Ala-Glu-Asp Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O KXEVYGKATAMXJJ-ACZMJKKPSA-N 0.000 description 1
- BEMGNWZECGIJOI-WDSKDSINSA-N Ala-Gly-Glu Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O BEMGNWZECGIJOI-WDSKDSINSA-N 0.000 description 1
- KMGOBAQSCKTBGD-DLOVCJGASA-N Ala-His-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CN=CN1 KMGOBAQSCKTBGD-DLOVCJGASA-N 0.000 description 1
- DVJSJDDYCYSMFR-ZKWXMUAHSA-N Ala-Ile-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O DVJSJDDYCYSMFR-ZKWXMUAHSA-N 0.000 description 1
- LNNSWWRRYJLGNI-NAKRPEOUSA-N Ala-Ile-Val Chemical compound C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O LNNSWWRRYJLGNI-NAKRPEOUSA-N 0.000 description 1
- QUIGLPSHIFPEOV-CIUDSAMLSA-N Ala-Lys-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O QUIGLPSHIFPEOV-CIUDSAMLSA-N 0.000 description 1
- SUHLZMHFRALVSY-YUMQZZPRSA-N Ala-Lys-Gly Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)NCC(O)=O SUHLZMHFRALVSY-YUMQZZPRSA-N 0.000 description 1
- PMQXMXAASGFUDX-SRVKXCTJSA-N Ala-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CCCCN PMQXMXAASGFUDX-SRVKXCTJSA-N 0.000 description 1
- KQESEZXHYOUIIM-CQDKDKBSSA-N Ala-Lys-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KQESEZXHYOUIIM-CQDKDKBSSA-N 0.000 description 1
- ZBLQIYPCUWZSRZ-QEJZJMRPSA-N Ala-Phe-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CC=CC=C1 ZBLQIYPCUWZSRZ-QEJZJMRPSA-N 0.000 description 1
- ADSGHMXEAZJJNF-DCAQKATOSA-N Ala-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](C)N ADSGHMXEAZJJNF-DCAQKATOSA-N 0.000 description 1
- DCVYRWFAMZFSDA-ZLUOBGJFSA-N Ala-Ser-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O DCVYRWFAMZFSDA-ZLUOBGJFSA-N 0.000 description 1
- YYAVDNKUWLAFCV-ACZMJKKPSA-N Ala-Ser-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O YYAVDNKUWLAFCV-ACZMJKKPSA-N 0.000 description 1
- NHWYNIZWLJYZAG-XVYDVKMFSA-N Ala-Ser-His Chemical compound C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N NHWYNIZWLJYZAG-XVYDVKMFSA-N 0.000 description 1
- HOVPGJUNRLMIOZ-CIUDSAMLSA-N Ala-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](C)N HOVPGJUNRLMIOZ-CIUDSAMLSA-N 0.000 description 1
- VRTOMXFZHGWHIJ-KZVJFYERSA-N Ala-Thr-Arg Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VRTOMXFZHGWHIJ-KZVJFYERSA-N 0.000 description 1
- IOFVWPYSRSCWHI-JXUBOQSCSA-N Ala-Thr-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](C)N IOFVWPYSRSCWHI-JXUBOQSCSA-N 0.000 description 1
- QOIGKCBMXUCDQU-KDXUFGMBSA-N Ala-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C)N)O QOIGKCBMXUCDQU-KDXUFGMBSA-N 0.000 description 1
- QRIYOHQJRDHFKF-UWJYBYFXSA-N Ala-Tyr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=C(O)C=C1 QRIYOHQJRDHFKF-UWJYBYFXSA-N 0.000 description 1
- KWTVWJPNHAOREN-IHRRRGAJSA-N Arg-Asn-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O KWTVWJPNHAOREN-IHRRRGAJSA-N 0.000 description 1
- IIABBYGHLYWVOS-FXQIFTODSA-N Arg-Asn-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O IIABBYGHLYWVOS-FXQIFTODSA-N 0.000 description 1
- RRGPUNYIPJXJBU-GUBZILKMSA-N Arg-Asp-Met Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O RRGPUNYIPJXJBU-GUBZILKMSA-N 0.000 description 1
- TTXYKSADPSNOIF-IHRRRGAJSA-N Arg-Asp-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O TTXYKSADPSNOIF-IHRRRGAJSA-N 0.000 description 1
- KBBKCNHWCDJPGN-GUBZILKMSA-N Arg-Gln-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O KBBKCNHWCDJPGN-GUBZILKMSA-N 0.000 description 1
- JCAISGGAOQXEHJ-ZPFDUUQYSA-N Arg-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N JCAISGGAOQXEHJ-ZPFDUUQYSA-N 0.000 description 1
- QAODJPUKWNNNRP-DCAQKATOSA-N Arg-Glu-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O QAODJPUKWNNNRP-DCAQKATOSA-N 0.000 description 1
- RKRSYHCNPFGMTA-CIUDSAMLSA-N Arg-Glu-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O RKRSYHCNPFGMTA-CIUDSAMLSA-N 0.000 description 1
- XLWSGICNBZGYTA-CIUDSAMLSA-N Arg-Glu-Asp Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O XLWSGICNBZGYTA-CIUDSAMLSA-N 0.000 description 1
- MZRBYBIQTIKERR-GUBZILKMSA-N Arg-Glu-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MZRBYBIQTIKERR-GUBZILKMSA-N 0.000 description 1
- OHYQKYUTLIPFOX-ZPFDUUQYSA-N Arg-Glu-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O OHYQKYUTLIPFOX-ZPFDUUQYSA-N 0.000 description 1
- GOWZVQXTHUCNSQ-NHCYSSNCSA-N Arg-Glu-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O GOWZVQXTHUCNSQ-NHCYSSNCSA-N 0.000 description 1
- PHHRSPBBQUFULD-UWVGGRQHSA-N Arg-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCCN=C(N)N)N PHHRSPBBQUFULD-UWVGGRQHSA-N 0.000 description 1
- UPKMBGAAEZGHOC-RWMBFGLXSA-N Arg-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O UPKMBGAAEZGHOC-RWMBFGLXSA-N 0.000 description 1
- HCIUUZGFTDTEGM-NAKRPEOUSA-N Arg-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N HCIUUZGFTDTEGM-NAKRPEOUSA-N 0.000 description 1
- SSZGOKWBHLOCHK-DCAQKATOSA-N Arg-Lys-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N SSZGOKWBHLOCHK-DCAQKATOSA-N 0.000 description 1
- MJINRRBEMOLJAK-DCAQKATOSA-N Arg-Lys-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N MJINRRBEMOLJAK-DCAQKATOSA-N 0.000 description 1
- RIIVUOJDDQXHRV-SRVKXCTJSA-N Arg-Lys-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O RIIVUOJDDQXHRV-SRVKXCTJSA-N 0.000 description 1
- GRRXPUAICOGISM-RWMBFGLXSA-N Arg-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O GRRXPUAICOGISM-RWMBFGLXSA-N 0.000 description 1
- NPAVRDPEFVKELR-DCAQKATOSA-N Arg-Lys-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O NPAVRDPEFVKELR-DCAQKATOSA-N 0.000 description 1
- PYZPXCZNQSEHDT-GUBZILKMSA-N Arg-Met-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N PYZPXCZNQSEHDT-GUBZILKMSA-N 0.000 description 1
- CZUHPNLXLWMYMG-UBHSHLNASA-N Arg-Phe-Ala Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=CC=C1 CZUHPNLXLWMYMG-UBHSHLNASA-N 0.000 description 1
- YTMKMRSYXHBGER-IHRRRGAJSA-N Arg-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N YTMKMRSYXHBGER-IHRRRGAJSA-N 0.000 description 1
- NGYHSXDNNOFHNE-AVGNSLFASA-N Arg-Pro-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O NGYHSXDNNOFHNE-AVGNSLFASA-N 0.000 description 1
- FRBAHXABMQXSJQ-FXQIFTODSA-N Arg-Ser-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O FRBAHXABMQXSJQ-FXQIFTODSA-N 0.000 description 1
- MOGMYRUNTKYZFB-UNQGMJICSA-N Arg-Thr-Phe Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MOGMYRUNTKYZFB-UNQGMJICSA-N 0.000 description 1
- AOJYORNRFWWEIV-IHRRRGAJSA-N Arg-Tyr-Asp Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=C(O)C=C1 AOJYORNRFWWEIV-IHRRRGAJSA-N 0.000 description 1
- XWGJDUSDTRPQRK-ZLUOBGJFSA-N Asn-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC(N)=O XWGJDUSDTRPQRK-ZLUOBGJFSA-N 0.000 description 1
- BVLIJXXSXBUGEC-SRVKXCTJSA-N Asn-Asn-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O BVLIJXXSXBUGEC-SRVKXCTJSA-N 0.000 description 1
- WPOLSNAQGVHROR-GUBZILKMSA-N Asn-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)N)N WPOLSNAQGVHROR-GUBZILKMSA-N 0.000 description 1
- HCAUEJAQCXVQQM-ACZMJKKPSA-N Asn-Glu-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HCAUEJAQCXVQQM-ACZMJKKPSA-N 0.000 description 1
- QYXNFROWLZPWPC-FXQIFTODSA-N Asn-Glu-Gln Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O QYXNFROWLZPWPC-FXQIFTODSA-N 0.000 description 1
- BZMWJLLUAKSIMH-FXQIFTODSA-N Asn-Glu-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O BZMWJLLUAKSIMH-FXQIFTODSA-N 0.000 description 1
- PBSQFBAJKPLRJY-BYULHYEWSA-N Asn-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N PBSQFBAJKPLRJY-BYULHYEWSA-N 0.000 description 1
- FTCGGKNCJZOPNB-WHFBIAKZSA-N Asn-Gly-Ser Chemical compound NC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O FTCGGKNCJZOPNB-WHFBIAKZSA-N 0.000 description 1
- LVHMEJJWEXBMKK-GMOBBJLQSA-N Asn-Ile-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC(=O)N)N LVHMEJJWEXBMKK-GMOBBJLQSA-N 0.000 description 1
- PNHQRQTVBRDIEF-CIUDSAMLSA-N Asn-Leu-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(=O)N)N PNHQRQTVBRDIEF-CIUDSAMLSA-N 0.000 description 1
- BXUHCIXDSWRSBS-CIUDSAMLSA-N Asn-Leu-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O BXUHCIXDSWRSBS-CIUDSAMLSA-N 0.000 description 1
- FHETWELNCBMRMG-HJGDQZAQSA-N Asn-Leu-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FHETWELNCBMRMG-HJGDQZAQSA-N 0.000 description 1
- NLDNNZKUSLAYFW-NHCYSSNCSA-N Asn-Lys-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O NLDNNZKUSLAYFW-NHCYSSNCSA-N 0.000 description 1
- LSJQOMAZIKQMTJ-SRVKXCTJSA-N Asn-Phe-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O LSJQOMAZIKQMTJ-SRVKXCTJSA-N 0.000 description 1
- RAUPFUCUDBQYHE-AVGNSLFASA-N Asn-Phe-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O RAUPFUCUDBQYHE-AVGNSLFASA-N 0.000 description 1
- YUUIAUXBNOHFRJ-IHRRRGAJSA-N Asn-Phe-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(O)=O YUUIAUXBNOHFRJ-IHRRRGAJSA-N 0.000 description 1
- FTNRWCPWDWRPAV-BZSNNMDCSA-N Asn-Phe-Phe Chemical compound C([C@H](NC(=O)[C@H](CC(N)=O)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 FTNRWCPWDWRPAV-BZSNNMDCSA-N 0.000 description 1
- NJSNXIOKBHPFMB-GMOBBJLQSA-N Asn-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC(=O)N)N NJSNXIOKBHPFMB-GMOBBJLQSA-N 0.000 description 1
- REQUGIWGOGSOEZ-ZLUOBGJFSA-N Asn-Ser-Asn Chemical compound C([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)C(=O)N REQUGIWGOGSOEZ-ZLUOBGJFSA-N 0.000 description 1
- QYRMBFWDSFGSFC-OLHMAJIHSA-N Asn-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O QYRMBFWDSFGSFC-OLHMAJIHSA-N 0.000 description 1
- KTDWFWNZLLFEFU-KKUMJFAQSA-N Asn-Tyr-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O KTDWFWNZLLFEFU-KKUMJFAQSA-N 0.000 description 1
- CGYKCTPUGXFPMG-IHPCNDPISA-N Asn-Tyr-Trp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O CGYKCTPUGXFPMG-IHPCNDPISA-N 0.000 description 1
- KRXIWXCXOARFNT-ZLUOBGJFSA-N Asp-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC(O)=O KRXIWXCXOARFNT-ZLUOBGJFSA-N 0.000 description 1
- XBQSLMACWDXWLJ-GHCJXIJMSA-N Asp-Ala-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XBQSLMACWDXWLJ-GHCJXIJMSA-N 0.000 description 1
- KVMPVNGOKHTUHZ-GCJQMDKQSA-N Asp-Ala-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KVMPVNGOKHTUHZ-GCJQMDKQSA-N 0.000 description 1
- WSOKZUVWBXVJHX-CIUDSAMLSA-N Asp-Arg-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O WSOKZUVWBXVJHX-CIUDSAMLSA-N 0.000 description 1
- MFMJRYHVLLEMQM-DCAQKATOSA-N Asp-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)O)N MFMJRYHVLLEMQM-DCAQKATOSA-N 0.000 description 1
- CASGONAXMZPHCK-FXQIFTODSA-N Asp-Asn-Arg Chemical compound C(C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N)CN=C(N)N CASGONAXMZPHCK-FXQIFTODSA-N 0.000 description 1
- GWTLRDMPMJCNMH-WHFBIAKZSA-N Asp-Asn-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O GWTLRDMPMJCNMH-WHFBIAKZSA-N 0.000 description 1
- KNMRXHIAVXHCLW-ZLUOBGJFSA-N Asp-Asn-Ser Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N)C(=O)O KNMRXHIAVXHCLW-ZLUOBGJFSA-N 0.000 description 1
- RDRMWJBLOSRRAW-BYULHYEWSA-N Asp-Asn-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O RDRMWJBLOSRRAW-BYULHYEWSA-N 0.000 description 1
- VPSHHQXIWLGVDD-ZLUOBGJFSA-N Asp-Asp-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O VPSHHQXIWLGVDD-ZLUOBGJFSA-N 0.000 description 1
- SBHUBSDEZQFJHJ-CIUDSAMLSA-N Asp-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O SBHUBSDEZQFJHJ-CIUDSAMLSA-N 0.000 description 1
- CELPEWWLSXMVPH-CIUDSAMLSA-N Asp-Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O CELPEWWLSXMVPH-CIUDSAMLSA-N 0.000 description 1
- FTNVLGCFIJEMQT-CIUDSAMLSA-N Asp-Cys-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC(=O)O)N FTNVLGCFIJEMQT-CIUDSAMLSA-N 0.000 description 1
- DXQOQMCLWWADMU-ACZMJKKPSA-N Asp-Gln-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O DXQOQMCLWWADMU-ACZMJKKPSA-N 0.000 description 1
- IJHUZMGJRGNXIW-CIUDSAMLSA-N Asp-Glu-Arg Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IJHUZMGJRGNXIW-CIUDSAMLSA-N 0.000 description 1
- XJQRWGXKUSDEFI-ACZMJKKPSA-N Asp-Glu-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O XJQRWGXKUSDEFI-ACZMJKKPSA-N 0.000 description 1
- XAJRHVUUVUPFQL-ACZMJKKPSA-N Asp-Glu-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O XAJRHVUUVUPFQL-ACZMJKKPSA-N 0.000 description 1
- QCLHLXDWRKOHRR-GUBZILKMSA-N Asp-Glu-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)O)N QCLHLXDWRKOHRR-GUBZILKMSA-N 0.000 description 1
- VILLWIDTHYPSLC-PEFMBERDSA-N Asp-Glu-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VILLWIDTHYPSLC-PEFMBERDSA-N 0.000 description 1
- SVABRQFIHCSNCI-FOHZUACHSA-N Asp-Gly-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O SVABRQFIHCSNCI-FOHZUACHSA-N 0.000 description 1
- CYCKJEFVFNRWEZ-UGYAYLCHSA-N Asp-Ile-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O CYCKJEFVFNRWEZ-UGYAYLCHSA-N 0.000 description 1
- QNFRBNZGVVKBNJ-PEFMBERDSA-N Asp-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N QNFRBNZGVVKBNJ-PEFMBERDSA-N 0.000 description 1
- KYQNAIMCTRZLNP-QSFUFRPTSA-N Asp-Ile-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O KYQNAIMCTRZLNP-QSFUFRPTSA-N 0.000 description 1
- JNNVNVRBYUJYGS-CIUDSAMLSA-N Asp-Leu-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O JNNVNVRBYUJYGS-CIUDSAMLSA-N 0.000 description 1
- CLUMZOKVGUWUFD-CIUDSAMLSA-N Asp-Leu-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O CLUMZOKVGUWUFD-CIUDSAMLSA-N 0.000 description 1
- HKEZZWQWXWGASX-KKUMJFAQSA-N Asp-Leu-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 HKEZZWQWXWGASX-KKUMJFAQSA-N 0.000 description 1
- UMHUHHJMEXNSIV-CIUDSAMLSA-N Asp-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UMHUHHJMEXNSIV-CIUDSAMLSA-N 0.000 description 1
- XWSIYTYNLKCLJB-CIUDSAMLSA-N Asp-Lys-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O XWSIYTYNLKCLJB-CIUDSAMLSA-N 0.000 description 1
- QNIACYURSSCLRP-GUBZILKMSA-N Asp-Lys-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O QNIACYURSSCLRP-GUBZILKMSA-N 0.000 description 1
- GYWQGGUCMDCUJE-DLOVCJGASA-N Asp-Phe-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(O)=O GYWQGGUCMDCUJE-DLOVCJGASA-N 0.000 description 1
- LTCKTLYKRMCFOC-KKUMJFAQSA-N Asp-Phe-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O LTCKTLYKRMCFOC-KKUMJFAQSA-N 0.000 description 1
- JSHWXQIZOCVWIA-ZKWXMUAHSA-N Asp-Ser-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O JSHWXQIZOCVWIA-ZKWXMUAHSA-N 0.000 description 1
- JDDYEZGPYBBPBN-JRQIVUDYSA-N Asp-Thr-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JDDYEZGPYBBPBN-JRQIVUDYSA-N 0.000 description 1
- USENATHVGFXRNO-SRVKXCTJSA-N Asp-Tyr-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=C(O)C=C1 USENATHVGFXRNO-SRVKXCTJSA-N 0.000 description 1
- BJDHEININLSZOT-KKUMJFAQSA-N Asp-Tyr-Lys Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(O)=O BJDHEININLSZOT-KKUMJFAQSA-N 0.000 description 1
- SQIARYGNVQWOSB-BZSNNMDCSA-N Asp-Tyr-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SQIARYGNVQWOSB-BZSNNMDCSA-N 0.000 description 1
- WAEDSQFVZJUHLI-BYULHYEWSA-N Asp-Val-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O WAEDSQFVZJUHLI-BYULHYEWSA-N 0.000 description 1
- 108091003079 Bovine Serum Albumin Proteins 0.000 description 1
- 108091062157 Cis-regulatory element Proteins 0.000 description 1
- UGPCUUWZXRMCIJ-KKUMJFAQSA-N Cys-Tyr-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CS)N UGPCUUWZXRMCIJ-KKUMJFAQSA-N 0.000 description 1
- NGOIQDYZMIKCOK-NAKRPEOUSA-N Cys-Val-Ile Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NGOIQDYZMIKCOK-NAKRPEOUSA-N 0.000 description 1
- 102000053602 DNA Human genes 0.000 description 1
- 108700028146 Genetic Enhancer Elements Proteins 0.000 description 1
- PHZYLYASFWHLHJ-FXQIFTODSA-N Gln-Asn-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O PHZYLYASFWHLHJ-FXQIFTODSA-N 0.000 description 1
- ZPDVKYLJTOFQJV-WDSKDSINSA-N Gln-Asn-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O ZPDVKYLJTOFQJV-WDSKDSINSA-N 0.000 description 1
- XEYMBRRKIFYQMF-GUBZILKMSA-N Gln-Asp-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O XEYMBRRKIFYQMF-GUBZILKMSA-N 0.000 description 1
- CGVWDTRDPLOMHZ-FXQIFTODSA-N Gln-Glu-Asp Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O CGVWDTRDPLOMHZ-FXQIFTODSA-N 0.000 description 1
- SNLOOPZHAQDMJG-CIUDSAMLSA-N Gln-Glu-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SNLOOPZHAQDMJG-CIUDSAMLSA-N 0.000 description 1
- PNENQZWRFMUZOM-DCAQKATOSA-N Gln-Glu-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O PNENQZWRFMUZOM-DCAQKATOSA-N 0.000 description 1
- NNXIQPMZGZUFJJ-AVGNSLFASA-N Gln-His-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N NNXIQPMZGZUFJJ-AVGNSLFASA-N 0.000 description 1
- GIVHPCWYVWUUSG-HVTMNAMFSA-N Gln-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCC(=O)N)N GIVHPCWYVWUUSG-HVTMNAMFSA-N 0.000 description 1
- CAXXTYYGFYTBPV-IUCAKERBSA-N Gln-Leu-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O CAXXTYYGFYTBPV-IUCAKERBSA-N 0.000 description 1
- PSERKXGRRADTKA-MNXVOIDGSA-N Gln-Leu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PSERKXGRRADTKA-MNXVOIDGSA-N 0.000 description 1
- JNENSVNAUWONEZ-GUBZILKMSA-N Gln-Lys-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O JNENSVNAUWONEZ-GUBZILKMSA-N 0.000 description 1
- UWKPRVKWEKEMSY-DCAQKATOSA-N Gln-Lys-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O UWKPRVKWEKEMSY-DCAQKATOSA-N 0.000 description 1
- JRHPEMVLTRADLJ-AVGNSLFASA-N Gln-Lys-Lys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N JRHPEMVLTRADLJ-AVGNSLFASA-N 0.000 description 1
- SXFPZRRVWSUYII-KBIXCLLPSA-N Gln-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)N)N SXFPZRRVWSUYII-KBIXCLLPSA-N 0.000 description 1
- ININBLZFFVOQIO-JHEQGTHGSA-N Gln-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N)O ININBLZFFVOQIO-JHEQGTHGSA-N 0.000 description 1
- STHSGOZLFLFGSS-SUSMZKCASA-N Gln-Thr-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O STHSGOZLFLFGSS-SUSMZKCASA-N 0.000 description 1
- HLRLXVPRJJITSK-IFFSRLJSSA-N Gln-Thr-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O HLRLXVPRJJITSK-IFFSRLJSSA-N 0.000 description 1
- WTJIWXMJESRHMM-XDTLVQLUSA-N Gln-Tyr-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O WTJIWXMJESRHMM-XDTLVQLUSA-N 0.000 description 1
- ICRKQMRFXYDYMK-LAEOZQHASA-N Gln-Val-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O ICRKQMRFXYDYMK-LAEOZQHASA-N 0.000 description 1
- KKCUFHUTMKQQCF-SRVKXCTJSA-N Glu-Arg-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O KKCUFHUTMKQQCF-SRVKXCTJSA-N 0.000 description 1
- YKLNMGJYMNPBCP-ACZMJKKPSA-N Glu-Asn-Asp Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YKLNMGJYMNPBCP-ACZMJKKPSA-N 0.000 description 1
- YYOBUPFZLKQUAX-FXQIFTODSA-N Glu-Asn-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O YYOBUPFZLKQUAX-FXQIFTODSA-N 0.000 description 1
- CKRUHITYRFNUKW-WDSKDSINSA-N Glu-Asn-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O CKRUHITYRFNUKW-WDSKDSINSA-N 0.000 description 1
- LXAUHIRMWXQRKI-XHNCKOQMSA-N Glu-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N)C(=O)O LXAUHIRMWXQRKI-XHNCKOQMSA-N 0.000 description 1
- RDPOETHPAQEGDP-ACZMJKKPSA-N Glu-Asp-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O RDPOETHPAQEGDP-ACZMJKKPSA-N 0.000 description 1
- HJIFPJUEOGZWRI-GUBZILKMSA-N Glu-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)O)N HJIFPJUEOGZWRI-GUBZILKMSA-N 0.000 description 1
- CYHBMLHCQXXCCT-AVGNSLFASA-N Glu-Asp-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CYHBMLHCQXXCCT-AVGNSLFASA-N 0.000 description 1
- ZXQPJYWZSFGWJB-AVGNSLFASA-N Glu-Cys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCC(=O)O)N ZXQPJYWZSFGWJB-AVGNSLFASA-N 0.000 description 1
- UMIRPYLZFKOEOH-YVNDNENWSA-N Glu-Gln-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UMIRPYLZFKOEOH-YVNDNENWSA-N 0.000 description 1
- SJPMNHCEWPTRBR-BQBZGAKWSA-N Glu-Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SJPMNHCEWPTRBR-BQBZGAKWSA-N 0.000 description 1
- MUSGDMDGNGXULI-DCAQKATOSA-N Glu-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O MUSGDMDGNGXULI-DCAQKATOSA-N 0.000 description 1
- IQACOVZVOMVILH-FXQIFTODSA-N Glu-Glu-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O IQACOVZVOMVILH-FXQIFTODSA-N 0.000 description 1
- PHONAZGUEGIOEM-GLLZPBPUSA-N Glu-Glu-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PHONAZGUEGIOEM-GLLZPBPUSA-N 0.000 description 1
- VOORMNJKNBGYGK-YUMQZZPRSA-N Glu-Gly-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)O)N VOORMNJKNBGYGK-YUMQZZPRSA-N 0.000 description 1
- XOFYVODYSNKPDK-AVGNSLFASA-N Glu-His-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CCC(=O)O)N XOFYVODYSNKPDK-AVGNSLFASA-N 0.000 description 1
- QIQABBIDHGQXGA-ZPFDUUQYSA-N Glu-Ile-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O QIQABBIDHGQXGA-ZPFDUUQYSA-N 0.000 description 1
- CXRWMMRLEMVSEH-PEFMBERDSA-N Glu-Ile-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O CXRWMMRLEMVSEH-PEFMBERDSA-N 0.000 description 1
- ITBHUUMCJJQUSC-LAEOZQHASA-N Glu-Ile-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O ITBHUUMCJJQUSC-LAEOZQHASA-N 0.000 description 1
- ZCOJVESMNGBGLF-GRLWGSQLSA-N Glu-Ile-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ZCOJVESMNGBGLF-GRLWGSQLSA-N 0.000 description 1
- ZHNHJYYFCGUZNQ-KBIXCLLPSA-N Glu-Ile-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(O)=O ZHNHJYYFCGUZNQ-KBIXCLLPSA-N 0.000 description 1
- ZSWGJYOZWBHROQ-RWRJDSDZSA-N Glu-Ile-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZSWGJYOZWBHROQ-RWRJDSDZSA-N 0.000 description 1
- DNPCBMNFQVTHMA-DCAQKATOSA-N Glu-Leu-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O DNPCBMNFQVTHMA-DCAQKATOSA-N 0.000 description 1
- ATVYZJGOZLVXDK-IUCAKERBSA-N Glu-Leu-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O ATVYZJGOZLVXDK-IUCAKERBSA-N 0.000 description 1
- GJBUAAAIZSRCDC-GVXVVHGQSA-N Glu-Leu-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O GJBUAAAIZSRCDC-GVXVVHGQSA-N 0.000 description 1
- SWRVAQHFBRZVNX-GUBZILKMSA-N Glu-Lys-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O SWRVAQHFBRZVNX-GUBZILKMSA-N 0.000 description 1
- OCJRHJZKGGSPRW-IUCAKERBSA-N Glu-Lys-Gly Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O OCJRHJZKGGSPRW-IUCAKERBSA-N 0.000 description 1
- YKBUCXNNBYZYAY-MNXVOIDGSA-N Glu-Lys-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YKBUCXNNBYZYAY-MNXVOIDGSA-N 0.000 description 1
- ZGEJRLJEAMPEDV-SRVKXCTJSA-N Glu-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)O)N ZGEJRLJEAMPEDV-SRVKXCTJSA-N 0.000 description 1
- ZQYZDDXTNQXUJH-CIUDSAMLSA-N Glu-Met-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CCC(=O)O)N ZQYZDDXTNQXUJH-CIUDSAMLSA-N 0.000 description 1
- LKOAAMXDJGEYMS-ZPFDUUQYSA-N Glu-Met-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LKOAAMXDJGEYMS-ZPFDUUQYSA-N 0.000 description 1
- MIIGESVJEBDJMP-FHWLQOOXSA-N Glu-Phe-Tyr Chemical compound C([C@H](NC(=O)[C@H](CCC(O)=O)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 MIIGESVJEBDJMP-FHWLQOOXSA-N 0.000 description 1
- MRWYPDWDZSLWJM-ACZMJKKPSA-N Glu-Ser-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O MRWYPDWDZSLWJM-ACZMJKKPSA-N 0.000 description 1
- GMVCSRBOSIUTFC-FXQIFTODSA-N Glu-Ser-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O GMVCSRBOSIUTFC-FXQIFTODSA-N 0.000 description 1
- SYAYROHMAIHWFB-KBIXCLLPSA-N Glu-Ser-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O SYAYROHMAIHWFB-KBIXCLLPSA-N 0.000 description 1
- QOXDAWODGSIDDI-GUBZILKMSA-N Glu-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)O)N QOXDAWODGSIDDI-GUBZILKMSA-N 0.000 description 1
- BXSZPACYCMNKLS-AVGNSLFASA-N Glu-Ser-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O BXSZPACYCMNKLS-AVGNSLFASA-N 0.000 description 1
- RGJKYNUINKGPJN-RWRJDSDZSA-N Glu-Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](CCC(=O)O)N RGJKYNUINKGPJN-RWRJDSDZSA-N 0.000 description 1
- UMZHHILWZBFPGL-LOKLDPHHSA-N Glu-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O UMZHHILWZBFPGL-LOKLDPHHSA-N 0.000 description 1
- KIEICAOUSNYOLM-NRPADANISA-N Glu-Val-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O KIEICAOUSNYOLM-NRPADANISA-N 0.000 description 1
- VIPDPMHGICREIS-GVXVVHGQSA-N Glu-Val-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O VIPDPMHGICREIS-GVXVVHGQSA-N 0.000 description 1
- VSVZIEVNUYDAFR-YUMQZZPRSA-N Gly-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN VSVZIEVNUYDAFR-YUMQZZPRSA-N 0.000 description 1
- QXPRJQPCFXMCIY-NKWVEPMBSA-N Gly-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN QXPRJQPCFXMCIY-NKWVEPMBSA-N 0.000 description 1
- JPXNYFOHTHSREU-UWVGGRQHSA-N Gly-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)CN JPXNYFOHTHSREU-UWVGGRQHSA-N 0.000 description 1
- NZAFOTBEULLEQB-WDSKDSINSA-N Gly-Asn-Glu Chemical compound C(CC(=O)O)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)CN NZAFOTBEULLEQB-WDSKDSINSA-N 0.000 description 1
- XCLCVBYNGXEVDU-WHFBIAKZSA-N Gly-Asn-Ser Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O XCLCVBYNGXEVDU-WHFBIAKZSA-N 0.000 description 1
- XEJTYSCIXKYSHR-WDSKDSINSA-N Gly-Asp-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN XEJTYSCIXKYSHR-WDSKDSINSA-N 0.000 description 1
- QSTLUOIOYLYLLF-WDSKDSINSA-N Gly-Asp-Glu Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O QSTLUOIOYLYLLF-WDSKDSINSA-N 0.000 description 1
- LCNXZQROPKFGQK-WHFBIAKZSA-N Gly-Asp-Ser Chemical compound NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O LCNXZQROPKFGQK-WHFBIAKZSA-N 0.000 description 1
- QGZSAHIZRQHCEQ-QWRGUYRKSA-N Gly-Asp-Tyr Chemical compound NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 QGZSAHIZRQHCEQ-QWRGUYRKSA-N 0.000 description 1
- ZQIMMEYPEXIYBB-IUCAKERBSA-N Gly-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN ZQIMMEYPEXIYBB-IUCAKERBSA-N 0.000 description 1
- LHRXAHLCRMQBGJ-RYUDHWBXSA-N Gly-Glu-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)CN LHRXAHLCRMQBGJ-RYUDHWBXSA-N 0.000 description 1
- UFPXDFOYHVEIPI-BYPYZUCNSA-N Gly-Gly-Asp Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O UFPXDFOYHVEIPI-BYPYZUCNSA-N 0.000 description 1
- XPJBQTCXPJNIFE-ZETCQYMHSA-N Gly-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)CN XPJBQTCXPJNIFE-ZETCQYMHSA-N 0.000 description 1
- YWAQATDNEKZFFK-BYPYZUCNSA-N Gly-Gly-Ser Chemical compound NCC(=O)NCC(=O)N[C@@H](CO)C(O)=O YWAQATDNEKZFFK-BYPYZUCNSA-N 0.000 description 1
- SXJHOPPTOJACOA-QXEWZRGKSA-N Gly-Ile-Arg Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CCCN=C(N)N SXJHOPPTOJACOA-QXEWZRGKSA-N 0.000 description 1
- UESJMAMHDLEHGM-NHCYSSNCSA-N Gly-Ile-Leu Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O UESJMAMHDLEHGM-NHCYSSNCSA-N 0.000 description 1
- ITZOBNKQDZEOCE-NHCYSSNCSA-N Gly-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)CN ITZOBNKQDZEOCE-NHCYSSNCSA-N 0.000 description 1
- SCWYHUQOOFRVHP-MBLNEYKQSA-N Gly-Ile-Thr Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SCWYHUQOOFRVHP-MBLNEYKQSA-N 0.000 description 1
- LHYJCVCQPWRMKZ-WEDXCCLWSA-N Gly-Leu-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LHYJCVCQPWRMKZ-WEDXCCLWSA-N 0.000 description 1
- AFWYPMDMDYCKMD-KBPBESRZSA-N Gly-Leu-Tyr Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 AFWYPMDMDYCKMD-KBPBESRZSA-N 0.000 description 1
- VBOBNHSVQKKTOT-YUMQZZPRSA-N Gly-Lys-Ala Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O VBOBNHSVQKKTOT-YUMQZZPRSA-N 0.000 description 1
- FXGRXIATVXUAHO-WEDXCCLWSA-N Gly-Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCCN FXGRXIATVXUAHO-WEDXCCLWSA-N 0.000 description 1
- GAFKBWKVXNERFA-QWRGUYRKSA-N Gly-Phe-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 GAFKBWKVXNERFA-QWRGUYRKSA-N 0.000 description 1
- IBYOLNARKHMLBG-WHOFXGATSA-N Gly-Phe-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 IBYOLNARKHMLBG-WHOFXGATSA-N 0.000 description 1
- GGAPHLIUUTVYMX-QWRGUYRKSA-N Gly-Phe-Ser Chemical compound OC[C@@H](C([O-])=O)NC(=O)[C@@H](NC(=O)C[NH3+])CC1=CC=CC=C1 GGAPHLIUUTVYMX-QWRGUYRKSA-N 0.000 description 1
- FGPLUIQCSKGLTI-WDSKDSINSA-N Gly-Ser-Glu Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O FGPLUIQCSKGLTI-WDSKDSINSA-N 0.000 description 1
- DBUNZBWUWCIELX-JHEQGTHGSA-N Gly-Thr-Glu Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O DBUNZBWUWCIELX-JHEQGTHGSA-N 0.000 description 1
- FFALDIDGPLUDKV-ZDLURKLDSA-N Gly-Thr-Ser Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O FFALDIDGPLUDKV-ZDLURKLDSA-N 0.000 description 1
- FOKISINOENBSDM-WLTAIBSBSA-N Gly-Thr-Tyr Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O FOKISINOENBSDM-WLTAIBSBSA-N 0.000 description 1
- RCHFYMASWAZQQZ-ZANVPECISA-N Gly-Trp-Ala Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](C)C(O)=O)NC(=O)CN)=CNC2=C1 RCHFYMASWAZQQZ-ZANVPECISA-N 0.000 description 1
- UIQGJYUEQDOODF-KWQFWETISA-N Gly-Tyr-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 UIQGJYUEQDOODF-KWQFWETISA-N 0.000 description 1
- DNVDEMWIYLVIQU-RCOVLWMOSA-N Gly-Val-Asp Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O DNVDEMWIYLVIQU-RCOVLWMOSA-N 0.000 description 1
- SYOJVRNQCXYEOV-XVKPBYJWSA-N Gly-Val-Glu Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SYOJVRNQCXYEOV-XVKPBYJWSA-N 0.000 description 1
- 101150019065 HBD gene Proteins 0.000 description 1
- 108091005903 Hemoglobin subunit delta Proteins 0.000 description 1
- 108091005879 Hemoglobin subunit epsilon Proteins 0.000 description 1
- 108091005886 Hemoglobin subunit gamma Proteins 0.000 description 1
- 102100038617 Hemoglobin subunit gamma-2 Human genes 0.000 description 1
- IPIVXQQRZXEUGW-UWJYBYFXSA-N His-Ala-His Chemical compound C([C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1NC=NC=1)C(O)=O)C1=CN=CN1 IPIVXQQRZXEUGW-UWJYBYFXSA-N 0.000 description 1
- XINDHUAGVGCNSF-QSFUFRPTSA-N His-Ala-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XINDHUAGVGCNSF-QSFUFRPTSA-N 0.000 description 1
- SVHKVHBPTOMLTO-DCAQKATOSA-N His-Arg-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O SVHKVHBPTOMLTO-DCAQKATOSA-N 0.000 description 1
- SYMSVYVUSPSAAO-IHRRRGAJSA-N His-Arg-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O SYMSVYVUSPSAAO-IHRRRGAJSA-N 0.000 description 1
- MVADCDSCFTXCBT-CIUDSAMLSA-N His-Asp-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MVADCDSCFTXCBT-CIUDSAMLSA-N 0.000 description 1
- UOAVQQRILDGZEN-SRVKXCTJSA-N His-Asp-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O UOAVQQRILDGZEN-SRVKXCTJSA-N 0.000 description 1
- IMCHNUANCIGUKS-SRVKXCTJSA-N His-Glu-Arg Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IMCHNUANCIGUKS-SRVKXCTJSA-N 0.000 description 1
- XMENRVZYPBKBIL-AVGNSLFASA-N His-Glu-His Chemical compound N[C@@H](Cc1cnc[nH]1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O XMENRVZYPBKBIL-AVGNSLFASA-N 0.000 description 1
- KWBISLAEQZUYIC-UWJYBYFXSA-N His-His-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CN=CN2)N KWBISLAEQZUYIC-UWJYBYFXSA-N 0.000 description 1
- FSOXZQBMPBQKGJ-QSFUFRPTSA-N His-Ile-Ala Chemical compound [O-]C(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H]([NH3+])CC1=CN=CN1 FSOXZQBMPBQKGJ-QSFUFRPTSA-N 0.000 description 1
- ZSKJIISDJXJQPV-BZSNNMDCSA-N His-Leu-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CN=CN1 ZSKJIISDJXJQPV-BZSNNMDCSA-N 0.000 description 1
- WCHONUZTYDQMBY-PYJNHQTQSA-N His-Pro-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WCHONUZTYDQMBY-PYJNHQTQSA-N 0.000 description 1
- DAKSMIWQZPHRIB-BZSNNMDCSA-N His-Tyr-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O DAKSMIWQZPHRIB-BZSNNMDCSA-N 0.000 description 1
- 101001121408 Homo sapiens L-amino-acid oxidase Proteins 0.000 description 1
- 101000827703 Homo sapiens Polyphosphoinositide phosphatase Proteins 0.000 description 1
- LQSBBHNVAVNZSX-GHCJXIJMSA-N Ile-Ala-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(=O)N)C(=O)O)N LQSBBHNVAVNZSX-GHCJXIJMSA-N 0.000 description 1
- FVEWRQXNISSYFO-ZPFDUUQYSA-N Ile-Arg-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N FVEWRQXNISSYFO-ZPFDUUQYSA-N 0.000 description 1
- YOTNPRLPIPHQSB-XUXIUFHCSA-N Ile-Arg-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N YOTNPRLPIPHQSB-XUXIUFHCSA-N 0.000 description 1
- NBJAAWYRLGCJOF-UGYAYLCHSA-N Ile-Asp-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N NBJAAWYRLGCJOF-UGYAYLCHSA-N 0.000 description 1
- RGSOCXHDOPQREB-ZPFDUUQYSA-N Ile-Asp-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N RGSOCXHDOPQREB-ZPFDUUQYSA-N 0.000 description 1
- KUHFPGIVBOCRMV-MNXVOIDGSA-N Ile-Gln-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(C)C)C(=O)O)N KUHFPGIVBOCRMV-MNXVOIDGSA-N 0.000 description 1
- LKACSKJPTFSBHR-MNXVOIDGSA-N Ile-Gln-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N LKACSKJPTFSBHR-MNXVOIDGSA-N 0.000 description 1
- LGMUPVWZEYYUMU-YVNDNENWSA-N Ile-Glu-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N LGMUPVWZEYYUMU-YVNDNENWSA-N 0.000 description 1
- MTFVYKQRLXYAQN-LAEOZQHASA-N Ile-Glu-Gly Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O MTFVYKQRLXYAQN-LAEOZQHASA-N 0.000 description 1
- PNDMHTTXXPUQJH-RWRJDSDZSA-N Ile-Glu-Thr Chemical compound N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H]([C@H](O)C)C(=O)O PNDMHTTXXPUQJH-RWRJDSDZSA-N 0.000 description 1
- NZOCIWKZUVUNDW-ZKWXMUAHSA-N Ile-Gly-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O NZOCIWKZUVUNDW-ZKWXMUAHSA-N 0.000 description 1
- KFVUBLZRFSVDGO-BYULHYEWSA-N Ile-Gly-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O KFVUBLZRFSVDGO-BYULHYEWSA-N 0.000 description 1
- NYEYYMLUABXDMC-NHCYSSNCSA-N Ile-Gly-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CC(C)C)C(=O)O)N NYEYYMLUABXDMC-NHCYSSNCSA-N 0.000 description 1
- HYLIOBDWPQNLKI-HVTMNAMFSA-N Ile-His-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N HYLIOBDWPQNLKI-HVTMNAMFSA-N 0.000 description 1
- HUWYGQOISIJNMK-SIGLWIIPSA-N Ile-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N HUWYGQOISIJNMK-SIGLWIIPSA-N 0.000 description 1
- PKGGWLOLRLOPGK-XUXIUFHCSA-N Ile-Leu-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PKGGWLOLRLOPGK-XUXIUFHCSA-N 0.000 description 1
- YGDWPQCLFJNMOL-MNXVOIDGSA-N Ile-Leu-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N YGDWPQCLFJNMOL-MNXVOIDGSA-N 0.000 description 1
- PMMMQRVUMVURGJ-XUXIUFHCSA-N Ile-Leu-Pro Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(O)=O PMMMQRVUMVURGJ-XUXIUFHCSA-N 0.000 description 1
- GVKKVHNRTUFCCE-BJDJZHNGSA-N Ile-Leu-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)O)N GVKKVHNRTUFCCE-BJDJZHNGSA-N 0.000 description 1
- ADDYYRVQQZFIMW-MNXVOIDGSA-N Ile-Lys-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ADDYYRVQQZFIMW-MNXVOIDGSA-N 0.000 description 1
- RCMNUBZKIIJCOI-ZPFDUUQYSA-N Ile-Met-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N RCMNUBZKIIJCOI-ZPFDUUQYSA-N 0.000 description 1
- UAELWXJFLZBKQS-WHOFXGATSA-N Ile-Phe-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](Cc1ccccc1)C(=O)NCC(O)=O UAELWXJFLZBKQS-WHOFXGATSA-N 0.000 description 1
- BJECXJHLUJXPJQ-PYJNHQTQSA-N Ile-Pro-His Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N BJECXJHLUJXPJQ-PYJNHQTQSA-N 0.000 description 1
- ZLFNNVATRMCAKN-ZKWXMUAHSA-N Ile-Ser-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)NCC(=O)O)N ZLFNNVATRMCAKN-ZKWXMUAHSA-N 0.000 description 1
- KBDIBHQICWDGDL-PPCPHDFISA-N Ile-Thr-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N KBDIBHQICWDGDL-PPCPHDFISA-N 0.000 description 1
- DLEBSGAVWRPTIX-PEDHHIEDSA-N Ile-Val-Ile Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)[C@@H](C)CC DLEBSGAVWRPTIX-PEDHHIEDSA-N 0.000 description 1
- NJGXXYLPDMMFJB-XUXIUFHCSA-N Ile-Val-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N NJGXXYLPDMMFJB-XUXIUFHCSA-N 0.000 description 1
- SWNRZNLXMXRCJC-VKOGCVSHSA-N Ile-Val-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)[C@@H](C)CC)C(O)=O)=CNC2=C1 SWNRZNLXMXRCJC-VKOGCVSHSA-N 0.000 description 1
- 102100026388 L-amino-acid oxidase Human genes 0.000 description 1
- RCFDOSNHHZGBOY-UHFFFAOYSA-N L-isoleucyl-L-alanine Natural products CCC(C)C(N)C(=O)NC(C)C(O)=O RCFDOSNHHZGBOY-UHFFFAOYSA-N 0.000 description 1
- SENJXOPIZNYLHU-UHFFFAOYSA-N L-leucyl-L-arginine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CCCN=C(N)N SENJXOPIZNYLHU-UHFFFAOYSA-N 0.000 description 1
- KFKWRHQBZQICHA-STQMWFEESA-N L-leucyl-L-phenylalanine Natural products CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 1
- 241000713666 Lentivirus Species 0.000 description 1
- ZRLUISBDKUWAIZ-CIUDSAMLSA-N Leu-Ala-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O ZRLUISBDKUWAIZ-CIUDSAMLSA-N 0.000 description 1
- WNGVUZWBXZKQES-YUMQZZPRSA-N Leu-Ala-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O WNGVUZWBXZKQES-YUMQZZPRSA-N 0.000 description 1
- WUFYAPWIHCUMLL-CIUDSAMLSA-N Leu-Asn-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O WUFYAPWIHCUMLL-CIUDSAMLSA-N 0.000 description 1
- ZURHXHNAEJJRNU-CIUDSAMLSA-N Leu-Asp-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZURHXHNAEJJRNU-CIUDSAMLSA-N 0.000 description 1
- PVMPDMIKUVNOBD-CIUDSAMLSA-N Leu-Asp-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O PVMPDMIKUVNOBD-CIUDSAMLSA-N 0.000 description 1
- FQZPTCNSNPWHLJ-AVGNSLFASA-N Leu-Gln-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(O)=O FQZPTCNSNPWHLJ-AVGNSLFASA-N 0.000 description 1
- OXRLYTYUXAQTHP-YUMQZZPRSA-N Leu-Gly-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](C)C(O)=O OXRLYTYUXAQTHP-YUMQZZPRSA-N 0.000 description 1
- FMEICTQWUKNAGC-YUMQZZPRSA-N Leu-Gly-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O FMEICTQWUKNAGC-YUMQZZPRSA-N 0.000 description 1
- KGCLIYGPQXUNLO-IUCAKERBSA-N Leu-Gly-Glu Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O KGCLIYGPQXUNLO-IUCAKERBSA-N 0.000 description 1
- HYIFFZAQXPUEAU-QWRGUYRKSA-N Leu-Gly-Leu Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(C)C HYIFFZAQXPUEAU-QWRGUYRKSA-N 0.000 description 1
- VGPCJSXPPOQPBK-YUMQZZPRSA-N Leu-Gly-Ser Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O VGPCJSXPPOQPBK-YUMQZZPRSA-N 0.000 description 1
- PBGDOSARRIJMEV-DLOVCJGASA-N Leu-His-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(O)=O PBGDOSARRIJMEV-DLOVCJGASA-N 0.000 description 1
- KXODZBLFVFSLAI-AVGNSLFASA-N Leu-His-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CN=CN1 KXODZBLFVFSLAI-AVGNSLFASA-N 0.000 description 1
- USLNHQZCDQJBOV-ZPFDUUQYSA-N Leu-Ile-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O USLNHQZCDQJBOV-ZPFDUUQYSA-N 0.000 description 1
- HGFGEMSVBMCFKK-MNXVOIDGSA-N Leu-Ile-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O HGFGEMSVBMCFKK-MNXVOIDGSA-N 0.000 description 1
- SEMUSFOBZGKBGW-YTFOTSKYSA-N Leu-Ile-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O SEMUSFOBZGKBGW-YTFOTSKYSA-N 0.000 description 1
- DSFYPIUSAMSERP-IHRRRGAJSA-N Leu-Leu-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DSFYPIUSAMSERP-IHRRRGAJSA-N 0.000 description 1
- IAJFFZORSWOZPQ-SRVKXCTJSA-N Leu-Leu-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IAJFFZORSWOZPQ-SRVKXCTJSA-N 0.000 description 1
- RXGLHDWAZQECBI-SRVKXCTJSA-N Leu-Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O RXGLHDWAZQECBI-SRVKXCTJSA-N 0.000 description 1
- RZXLZBIUTDQHJQ-SRVKXCTJSA-N Leu-Lys-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O RZXLZBIUTDQHJQ-SRVKXCTJSA-N 0.000 description 1
- HVHRPWQEQHIQJF-AVGNSLFASA-N Leu-Lys-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HVHRPWQEQHIQJF-AVGNSLFASA-N 0.000 description 1
- LVTJJOJKDCVZGP-QWRGUYRKSA-N Leu-Lys-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LVTJJOJKDCVZGP-QWRGUYRKSA-N 0.000 description 1
- FKQPWMZLIIATBA-AJNGGQMLSA-N Leu-Lys-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FKQPWMZLIIATBA-AJNGGQMLSA-N 0.000 description 1
- FLNPJLDPGMLWAU-UWVGGRQHSA-N Leu-Met-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CC(C)C FLNPJLDPGMLWAU-UWVGGRQHSA-N 0.000 description 1
- HDHQQEDVWQGBEE-DCAQKATOSA-N Leu-Met-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(O)=O HDHQQEDVWQGBEE-DCAQKATOSA-N 0.000 description 1
- KQFZKDITNUEVFJ-JYJNAYRXSA-N Leu-Phe-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CC=CC=C1 KQFZKDITNUEVFJ-JYJNAYRXSA-N 0.000 description 1
- INCJJHQRZGQLFC-KBPBESRZSA-N Leu-Phe-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(O)=O INCJJHQRZGQLFC-KBPBESRZSA-N 0.000 description 1
- DRWMRVFCKKXHCH-BZSNNMDCSA-N Leu-Phe-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=CC=C1 DRWMRVFCKKXHCH-BZSNNMDCSA-N 0.000 description 1
- YWKNKRAKOCLOLH-OEAJRASXSA-N Leu-Phe-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CC1=CC=CC=C1 YWKNKRAKOCLOLH-OEAJRASXSA-N 0.000 description 1
- RRVCZCNFXIFGRA-DCAQKATOSA-N Leu-Pro-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O RRVCZCNFXIFGRA-DCAQKATOSA-N 0.000 description 1
- XOWMDXHFSBCAKQ-SRVKXCTJSA-N Leu-Ser-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(C)C XOWMDXHFSBCAKQ-SRVKXCTJSA-N 0.000 description 1
- BRTVHXHCUSXYRI-CIUDSAMLSA-N Leu-Ser-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O BRTVHXHCUSXYRI-CIUDSAMLSA-N 0.000 description 1
- AEDWWMMHUGYIFD-HJGDQZAQSA-N Leu-Thr-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O AEDWWMMHUGYIFD-HJGDQZAQSA-N 0.000 description 1
- QWWPYKKLXWOITQ-VOAKCMCISA-N Leu-Thr-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(C)C QWWPYKKLXWOITQ-VOAKCMCISA-N 0.000 description 1
- DAYQSYGBCUKVKT-VOAKCMCISA-N Leu-Thr-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O DAYQSYGBCUKVKT-VOAKCMCISA-N 0.000 description 1
- AIMGJYMCTAABEN-GVXVVHGQSA-N Leu-Val-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIMGJYMCTAABEN-GVXVVHGQSA-N 0.000 description 1
- JCFYLFOCALSNLQ-GUBZILKMSA-N Lys-Ala-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O JCFYLFOCALSNLQ-GUBZILKMSA-N 0.000 description 1
- PNPYKQFJGRFYJE-GUBZILKMSA-N Lys-Ala-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O PNPYKQFJGRFYJE-GUBZILKMSA-N 0.000 description 1
- XFIHDSBIPWEYJJ-YUMQZZPRSA-N Lys-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN XFIHDSBIPWEYJJ-YUMQZZPRSA-N 0.000 description 1
- KCXUCYYZNZFGLL-SRVKXCTJSA-N Lys-Ala-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O KCXUCYYZNZFGLL-SRVKXCTJSA-N 0.000 description 1
- IXHKPDJKKCUKHS-GARJFASQSA-N Lys-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N IXHKPDJKKCUKHS-GARJFASQSA-N 0.000 description 1
- KNKHAVVBVXKOGX-JXUBOQSCSA-N Lys-Ala-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KNKHAVVBVXKOGX-JXUBOQSCSA-N 0.000 description 1
- NTEVEUCLFMWSND-SRVKXCTJSA-N Lys-Arg-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O NTEVEUCLFMWSND-SRVKXCTJSA-N 0.000 description 1
- SJNZALDHDUYDBU-IHRRRGAJSA-N Lys-Arg-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(O)=O SJNZALDHDUYDBU-IHRRRGAJSA-N 0.000 description 1
- FUKDBQGFSJUXGX-RWMBFGLXSA-N Lys-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCCN)N)C(=O)O FUKDBQGFSJUXGX-RWMBFGLXSA-N 0.000 description 1
- SWWCDAGDQHTKIE-RHYQMDGZSA-N Lys-Arg-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SWWCDAGDQHTKIE-RHYQMDGZSA-N 0.000 description 1
- GGAPIOORBXHMNY-ULQDDVLXSA-N Lys-Arg-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCCN)N)O GGAPIOORBXHMNY-ULQDDVLXSA-N 0.000 description 1
- PXHCFKXNSBJSTQ-KKUMJFAQSA-N Lys-Asn-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N)O PXHCFKXNSBJSTQ-KKUMJFAQSA-N 0.000 description 1
- QUYCUALODHJQLK-CIUDSAMLSA-N Lys-Asp-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O QUYCUALODHJQLK-CIUDSAMLSA-N 0.000 description 1
- LMVOVCYVZBBWQB-SRVKXCTJSA-N Lys-Asp-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN LMVOVCYVZBBWQB-SRVKXCTJSA-N 0.000 description 1
- WTZUSCUIVPVCRH-SRVKXCTJSA-N Lys-Gln-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N WTZUSCUIVPVCRH-SRVKXCTJSA-N 0.000 description 1
- NNCDAORZCMPZPX-GUBZILKMSA-N Lys-Gln-Ser Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N NNCDAORZCMPZPX-GUBZILKMSA-N 0.000 description 1
- KZOHPCYVORJBLG-AVGNSLFASA-N Lys-Glu-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCCN)N KZOHPCYVORJBLG-AVGNSLFASA-N 0.000 description 1
- VEGLGAOVLFODGC-GUBZILKMSA-N Lys-Glu-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O VEGLGAOVLFODGC-GUBZILKMSA-N 0.000 description 1
- ULUQBUKAPDUKOC-GVXVVHGQSA-N Lys-Glu-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O ULUQBUKAPDUKOC-GVXVVHGQSA-N 0.000 description 1
- GQZMPWBZQALKJO-UWVGGRQHSA-N Lys-Gly-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O GQZMPWBZQALKJO-UWVGGRQHSA-N 0.000 description 1
- QZONCCHVHCOBSK-YUMQZZPRSA-N Lys-Gly-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O QZONCCHVHCOBSK-YUMQZZPRSA-N 0.000 description 1
- NNKLKUUGESXCBS-KBPBESRZSA-N Lys-Gly-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O NNKLKUUGESXCBS-KBPBESRZSA-N 0.000 description 1
- KKFVKBWCXXLKIK-AVGNSLFASA-N Lys-His-Glu Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCCCN)N KKFVKBWCXXLKIK-AVGNSLFASA-N 0.000 description 1
- SPCHLZUWJTYZFC-IHRRRGAJSA-N Lys-His-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(O)=O SPCHLZUWJTYZFC-IHRRRGAJSA-N 0.000 description 1
- JYXBNQOKPRQNQS-YTFOTSKYSA-N Lys-Ile-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JYXBNQOKPRQNQS-YTFOTSKYSA-N 0.000 description 1
- ONPDTSFZAIWMDI-AVGNSLFASA-N Lys-Leu-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O ONPDTSFZAIWMDI-AVGNSLFASA-N 0.000 description 1
- RBEATVHTWHTHTJ-KKUMJFAQSA-N Lys-Leu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O RBEATVHTWHTHTJ-KKUMJFAQSA-N 0.000 description 1
- XIZQPFCRXLUNMK-BZSNNMDCSA-N Lys-Leu-Phe Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCCCN)N XIZQPFCRXLUNMK-BZSNNMDCSA-N 0.000 description 1
- YPLVCBKEPJPBDQ-MELADBBJSA-N Lys-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N YPLVCBKEPJPBDQ-MELADBBJSA-N 0.000 description 1
- VUTWYNQUSJWBHO-BZSNNMDCSA-N Lys-Leu-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VUTWYNQUSJWBHO-BZSNNMDCSA-N 0.000 description 1
- LJADEBULDNKJNK-IHRRRGAJSA-N Lys-Leu-Val Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O LJADEBULDNKJNK-IHRRRGAJSA-N 0.000 description 1
- XOQMURBBIXRRCR-SRVKXCTJSA-N Lys-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN XOQMURBBIXRRCR-SRVKXCTJSA-N 0.000 description 1
- ALGGDNMLQNFVIZ-SRVKXCTJSA-N Lys-Lys-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N ALGGDNMLQNFVIZ-SRVKXCTJSA-N 0.000 description 1
- GAHJXEMYXKLZRQ-AJNGGQMLSA-N Lys-Lys-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O GAHJXEMYXKLZRQ-AJNGGQMLSA-N 0.000 description 1
- HVAUKHLDSDDROB-KKUMJFAQSA-N Lys-Lys-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O HVAUKHLDSDDROB-KKUMJFAQSA-N 0.000 description 1
- KJIXWRWPOCKYLD-IHRRRGAJSA-N Lys-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N KJIXWRWPOCKYLD-IHRRRGAJSA-N 0.000 description 1
- BXPHMHQHYHILBB-BZSNNMDCSA-N Lys-Lys-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O BXPHMHQHYHILBB-BZSNNMDCSA-N 0.000 description 1
- JYVCOTWSRGFABJ-DCAQKATOSA-N Lys-Met-Ser Chemical compound CSCC[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCCCN)N JYVCOTWSRGFABJ-DCAQKATOSA-N 0.000 description 1
- TWPCWKVOZDUYAA-KKUMJFAQSA-N Lys-Phe-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O TWPCWKVOZDUYAA-KKUMJFAQSA-N 0.000 description 1
- LMGNWHDWJDIOPK-DKIMLUQUSA-N Lys-Phe-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LMGNWHDWJDIOPK-DKIMLUQUSA-N 0.000 description 1
- CNGOEHJCLVCJHN-SRVKXCTJSA-N Lys-Pro-Glu Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O CNGOEHJCLVCJHN-SRVKXCTJSA-N 0.000 description 1
- QBHGXFQJFPWJIH-XUXIUFHCSA-N Lys-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCCN QBHGXFQJFPWJIH-XUXIUFHCSA-N 0.000 description 1
- UQJOKDAYFULYIX-AVGNSLFASA-N Lys-Pro-Pro Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 UQJOKDAYFULYIX-AVGNSLFASA-N 0.000 description 1
- WQDKIVRHTQYJSN-DCAQKATOSA-N Lys-Ser-Arg Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N WQDKIVRHTQYJSN-DCAQKATOSA-N 0.000 description 1
- GHKXHCMRAUYLBS-CIUDSAMLSA-N Lys-Ser-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O GHKXHCMRAUYLBS-CIUDSAMLSA-N 0.000 description 1
- ZUGVARDEGWMMLK-SRVKXCTJSA-N Lys-Ser-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN ZUGVARDEGWMMLK-SRVKXCTJSA-N 0.000 description 1
- WZVSHTFTCYOFPL-GARJFASQSA-N Lys-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CCCCN)N)C(=O)O WZVSHTFTCYOFPL-GARJFASQSA-N 0.000 description 1
- MEQLGHAMAUPOSJ-DCAQKATOSA-N Lys-Ser-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O MEQLGHAMAUPOSJ-DCAQKATOSA-N 0.000 description 1
- CUHGAUZONORRIC-HJGDQZAQSA-N Lys-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N)O CUHGAUZONORRIC-HJGDQZAQSA-N 0.000 description 1
- TVOOGUNBIWAURO-KATARQTJSA-N Lys-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCCN)N)O TVOOGUNBIWAURO-KATARQTJSA-N 0.000 description 1
- QVTDVTONTRSQMF-WDCWCFNPSA-N Lys-Thr-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CCCCN QVTDVTONTRSQMF-WDCWCFNPSA-N 0.000 description 1
- YFQSSOAGMZGXFT-MEYUZBJRSA-N Lys-Thr-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YFQSSOAGMZGXFT-MEYUZBJRSA-N 0.000 description 1
- RQILLQOQXLZTCK-KBPBESRZSA-N Lys-Tyr-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O RQILLQOQXLZTCK-KBPBESRZSA-N 0.000 description 1
- WINFHLHJTRGLCV-BZSNNMDCSA-N Lys-Tyr-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CC=C(O)C=C1 WINFHLHJTRGLCV-BZSNNMDCSA-N 0.000 description 1
- VVURYEVJJTXWNE-ULQDDVLXSA-N Lys-Tyr-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O VVURYEVJJTXWNE-ULQDDVLXSA-N 0.000 description 1
- RPWQJSBMXJSCPD-XUXIUFHCSA-N Lys-Val-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCCCN)C(C)C)C(O)=O RPWQJSBMXJSCPD-XUXIUFHCSA-N 0.000 description 1
- OZVXDDFYCQOPFD-XQQFMLRXSA-N Lys-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N OZVXDDFYCQOPFD-XQQFMLRXSA-N 0.000 description 1
- HMZPYMSEAALNAE-ULQDDVLXSA-N Lys-Val-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O HMZPYMSEAALNAE-ULQDDVLXSA-N 0.000 description 1
- IKXQOBUBZSOWDY-AVGNSLFASA-N Lys-Val-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CCCCN)N IKXQOBUBZSOWDY-AVGNSLFASA-N 0.000 description 1
- QAHFGYLFLVGBNW-DCAQKATOSA-N Met-Ala-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN QAHFGYLFLVGBNW-DCAQKATOSA-N 0.000 description 1
- OLWAOWXIADGIJG-AVGNSLFASA-N Met-Arg-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(O)=O OLWAOWXIADGIJG-AVGNSLFASA-N 0.000 description 1
- UAPZLLPGGOOCRO-IHRRRGAJSA-N Met-Asn-Phe Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N UAPZLLPGGOOCRO-IHRRRGAJSA-N 0.000 description 1
- FWTBMGAKKPSTBT-GUBZILKMSA-N Met-Gln-Glu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O FWTBMGAKKPSTBT-GUBZILKMSA-N 0.000 description 1
- VOOINLQYUZOREH-SRVKXCTJSA-N Met-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCSC)N VOOINLQYUZOREH-SRVKXCTJSA-N 0.000 description 1
- KQBJYJXPZBNEIK-DCAQKATOSA-N Met-Glu-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCNC(N)=N KQBJYJXPZBNEIK-DCAQKATOSA-N 0.000 description 1
- FZUNSVYYPYJYAP-NAKRPEOUSA-N Met-Ile-Ala Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O FZUNSVYYPYJYAP-NAKRPEOUSA-N 0.000 description 1
- QGRJTULYDZUBAY-ZPFDUUQYSA-N Met-Ile-Glu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O QGRJTULYDZUBAY-ZPFDUUQYSA-N 0.000 description 1
- RRIHXWPHQSXHAQ-XUXIUFHCSA-N Met-Ile-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCCN)C(O)=O RRIHXWPHQSXHAQ-XUXIUFHCSA-N 0.000 description 1
- AFFKUNVPPLQUGA-DCAQKATOSA-N Met-Leu-Ala Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O AFFKUNVPPLQUGA-DCAQKATOSA-N 0.000 description 1
- BEZJTLKUMFMITF-AVGNSLFASA-N Met-Lys-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CCCNC(N)=N BEZJTLKUMFMITF-AVGNSLFASA-N 0.000 description 1
- QEDGNYFHLXXIDC-DCAQKATOSA-N Met-Pro-Gln Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O QEDGNYFHLXXIDC-DCAQKATOSA-N 0.000 description 1
- KSIPKXNIQOWMIC-RCWTZXSCSA-N Met-Thr-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCCNC(N)=N KSIPKXNIQOWMIC-RCWTZXSCSA-N 0.000 description 1
- MUDYEFAKNSTFAI-JYJNAYRXSA-N Met-Tyr-Val Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O MUDYEFAKNSTFAI-JYJNAYRXSA-N 0.000 description 1
- YGNUDKAPJARTEM-GUBZILKMSA-N Met-Val-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O YGNUDKAPJARTEM-GUBZILKMSA-N 0.000 description 1
- IQJMEDDVOGMTKT-SRVKXCTJSA-N Met-Val-Val Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(O)=O IQJMEDDVOGMTKT-SRVKXCTJSA-N 0.000 description 1
- 101100476480 Mus musculus S100a8 gene Proteins 0.000 description 1
- WUGMRIBZSVSJNP-UHFFFAOYSA-N N-L-alanyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C)C(O)=O)=CNC2=C1 WUGMRIBZSVSJNP-UHFFFAOYSA-N 0.000 description 1
- 108010079364 N-glycylalanine Proteins 0.000 description 1
- WSXKXSBOJXEZDV-DLOVCJGASA-N Phe-Ala-Asn Chemical compound NC(=O)C[C@@H](C([O-])=O)NC(=O)[C@H](C)NC(=O)[C@@H]([NH3+])CC1=CC=CC=C1 WSXKXSBOJXEZDV-DLOVCJGASA-N 0.000 description 1
- YYRCPTVAPLQRNC-ULQDDVLXSA-N Phe-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CC1=CC=CC=C1 YYRCPTVAPLQRNC-ULQDDVLXSA-N 0.000 description 1
- UEEVBGHEGJMDDV-AVGNSLFASA-N Phe-Asp-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 UEEVBGHEGJMDDV-AVGNSLFASA-N 0.000 description 1
- RIYZXJVARWJLKS-KKUMJFAQSA-N Phe-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 RIYZXJVARWJLKS-KKUMJFAQSA-N 0.000 description 1
- GDBOREPXIRKSEQ-FHWLQOOXSA-N Phe-Gln-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GDBOREPXIRKSEQ-FHWLQOOXSA-N 0.000 description 1
- PSKRILMFHNIUAO-JYJNAYRXSA-N Phe-Glu-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N PSKRILMFHNIUAO-JYJNAYRXSA-N 0.000 description 1
- KDYPMIZMXDECSU-JYJNAYRXSA-N Phe-Leu-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 KDYPMIZMXDECSU-JYJNAYRXSA-N 0.000 description 1
- WLYPRKLMRIYGPP-JYJNAYRXSA-N Phe-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 WLYPRKLMRIYGPP-JYJNAYRXSA-N 0.000 description 1
- XZQYIJALMGEUJD-OEAJRASXSA-N Phe-Lys-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XZQYIJALMGEUJD-OEAJRASXSA-N 0.000 description 1
- GPSMLZQVIIYLDK-ULQDDVLXSA-N Phe-Lys-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O GPSMLZQVIIYLDK-ULQDDVLXSA-N 0.000 description 1
- TXJJXEXCZBHDNA-ACRUOGEOSA-N Phe-Phe-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)N TXJJXEXCZBHDNA-ACRUOGEOSA-N 0.000 description 1
- DSXPMZMSJHOKKK-HJOGWXRNSA-N Phe-Phe-Tyr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O DSXPMZMSJHOKKK-HJOGWXRNSA-N 0.000 description 1
- YMIZSYUAZJSOFL-SRVKXCTJSA-N Phe-Ser-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O YMIZSYUAZJSOFL-SRVKXCTJSA-N 0.000 description 1
- IPFXYNKCXYGSSV-KKUMJFAQSA-N Phe-Ser-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N IPFXYNKCXYGSSV-KKUMJFAQSA-N 0.000 description 1
- KLYYKKGCPOGDPE-OEAJRASXSA-N Phe-Thr-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O KLYYKKGCPOGDPE-OEAJRASXSA-N 0.000 description 1
- MSSXKZBDKZAHCX-UNQGMJICSA-N Phe-Thr-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O MSSXKZBDKZAHCX-UNQGMJICSA-N 0.000 description 1
- DBNGDEAQXGFGRA-ACRUOGEOSA-N Phe-Tyr-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CCCCN)C(=O)O)N DBNGDEAQXGFGRA-ACRUOGEOSA-N 0.000 description 1
- SJRQWEDYTKYHHL-SLFFLAALSA-N Phe-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CC3=CC=CC=C3)N)C(=O)O SJRQWEDYTKYHHL-SLFFLAALSA-N 0.000 description 1
- YUPRIZTWANWWHK-DZKIICNBSA-N Phe-Val-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N YUPRIZTWANWWHK-DZKIICNBSA-N 0.000 description 1
- GAMLAXHLYGLQBJ-UFYCRDLUSA-N Phe-Val-Tyr Chemical compound N[C@H](C(=O)N[C@H](C(=O)N[C@H](C(=O)O)CC1=CC=C(C=C1)O)C(C)C)CC1=CC=CC=C1 GAMLAXHLYGLQBJ-UFYCRDLUSA-N 0.000 description 1
- 102100023591 Polyphosphoinositide phosphatase Human genes 0.000 description 1
- OBVCYFIHIIYIQF-CIUDSAMLSA-N Pro-Asn-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O OBVCYFIHIIYIQF-CIUDSAMLSA-N 0.000 description 1
- VOHFZDSRPZLXLH-IHRRRGAJSA-N Pro-Asn-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O VOHFZDSRPZLXLH-IHRRRGAJSA-N 0.000 description 1
- CJZTUKSFZUSNCC-FXQIFTODSA-N Pro-Asp-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H]1CCCN1 CJZTUKSFZUSNCC-FXQIFTODSA-N 0.000 description 1
- XUSDDSLCRPUKLP-QXEWZRGKSA-N Pro-Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H]1CCCN1 XUSDDSLCRPUKLP-QXEWZRGKSA-N 0.000 description 1
- DIFXZGPHVCIVSQ-CIUDSAMLSA-N Pro-Gln-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O DIFXZGPHVCIVSQ-CIUDSAMLSA-N 0.000 description 1
- KIPIKSXPPLABPN-CIUDSAMLSA-N Pro-Glu-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1 KIPIKSXPPLABPN-CIUDSAMLSA-N 0.000 description 1
- MGDFPGCFVJFITQ-CIUDSAMLSA-N Pro-Glu-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MGDFPGCFVJFITQ-CIUDSAMLSA-N 0.000 description 1
- VYWNORHENYEQDW-YUMQZZPRSA-N Pro-Gly-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 VYWNORHENYEQDW-YUMQZZPRSA-N 0.000 description 1
- AFXCXDQNRXTSBD-FJXKBIBVSA-N Pro-Gly-Thr Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O AFXCXDQNRXTSBD-FJXKBIBVSA-N 0.000 description 1
- AQGUSRZKDZYGGV-GMOBBJLQSA-N Pro-Ile-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O AQGUSRZKDZYGGV-GMOBBJLQSA-N 0.000 description 1
- VZKBJNBZMZHKRC-XUXIUFHCSA-N Pro-Ile-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O VZKBJNBZMZHKRC-XUXIUFHCSA-N 0.000 description 1
- CDGABSWLRMECHC-IHRRRGAJSA-N Pro-Lys-His Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O CDGABSWLRMECHC-IHRRRGAJSA-N 0.000 description 1
- RMODQFBNDDENCP-IHRRRGAJSA-N Pro-Lys-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O RMODQFBNDDENCP-IHRRRGAJSA-N 0.000 description 1
- KDBHVPXBQADZKY-GUBZILKMSA-N Pro-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 KDBHVPXBQADZKY-GUBZILKMSA-N 0.000 description 1
- FNGOXVQBBCMFKV-CIUDSAMLSA-N Pro-Ser-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O FNGOXVQBBCMFKV-CIUDSAMLSA-N 0.000 description 1
- IMNVAOPEMFDAQD-NHCYSSNCSA-N Pro-Val-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IMNVAOPEMFDAQD-NHCYSSNCSA-N 0.000 description 1
- 238000002123 RNA extraction Methods 0.000 description 1
- 238000010802 RNA extraction kit Methods 0.000 description 1
- 239000012980 RPMI-1640 medium Substances 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 101100233916 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) KAR5 gene Proteins 0.000 description 1
- BTKUIVBNGBFTTP-WHFBIAKZSA-N Ser-Ala-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)NCC(O)=O BTKUIVBNGBFTTP-WHFBIAKZSA-N 0.000 description 1
- YQHZVYJAGWMHES-ZLUOBGJFSA-N Ser-Ala-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O YQHZVYJAGWMHES-ZLUOBGJFSA-N 0.000 description 1
- NLQUOHDCLSFABG-GUBZILKMSA-N Ser-Arg-Arg Chemical compound NC(N)=NCCC[C@H](NC(=O)[C@H](CO)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O NLQUOHDCLSFABG-GUBZILKMSA-N 0.000 description 1
- ZXLUWXWISXIFIX-ACZMJKKPSA-N Ser-Asn-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZXLUWXWISXIFIX-ACZMJKKPSA-N 0.000 description 1
- YMEXHZTVKDAKIY-GHCJXIJMSA-N Ser-Asn-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CO)C(O)=O YMEXHZTVKDAKIY-GHCJXIJMSA-N 0.000 description 1
- RDFQNDHEHVSONI-ZLUOBGJFSA-N Ser-Asn-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O RDFQNDHEHVSONI-ZLUOBGJFSA-N 0.000 description 1
- OHKLFYXEOGGGCK-ZLUOBGJFSA-N Ser-Asp-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O OHKLFYXEOGGGCK-ZLUOBGJFSA-N 0.000 description 1
- BNFVPSRLHHPQKS-WHFBIAKZSA-N Ser-Asp-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O BNFVPSRLHHPQKS-WHFBIAKZSA-N 0.000 description 1
- BYIROAKULFFTEK-CIUDSAMLSA-N Ser-Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CO BYIROAKULFFTEK-CIUDSAMLSA-N 0.000 description 1
- OLIJLNWFEQEFDM-SRVKXCTJSA-N Ser-Asp-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OLIJLNWFEQEFDM-SRVKXCTJSA-N 0.000 description 1
- SWSRFJZZMNLMLY-ZKWXMUAHSA-N Ser-Asp-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O SWSRFJZZMNLMLY-ZKWXMUAHSA-N 0.000 description 1
- ZOHGLPQGEHSLPD-FXQIFTODSA-N Ser-Gln-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZOHGLPQGEHSLPD-FXQIFTODSA-N 0.000 description 1
- OJPHFSOMBZKQKQ-GUBZILKMSA-N Ser-Gln-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CO OJPHFSOMBZKQKQ-GUBZILKMSA-N 0.000 description 1
- HVKMTOIAYDOJPL-NRPADANISA-N Ser-Gln-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O HVKMTOIAYDOJPL-NRPADANISA-N 0.000 description 1
- SQBLRDDJTUJDMV-ACZMJKKPSA-N Ser-Glu-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O SQBLRDDJTUJDMV-ACZMJKKPSA-N 0.000 description 1
- YQQKYAZABFEYAF-FXQIFTODSA-N Ser-Glu-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O YQQKYAZABFEYAF-FXQIFTODSA-N 0.000 description 1
- LALNXSXEYFUUDD-GUBZILKMSA-N Ser-Glu-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LALNXSXEYFUUDD-GUBZILKMSA-N 0.000 description 1
- VQBCMLMPEWPUTB-ACZMJKKPSA-N Ser-Glu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O VQBCMLMPEWPUTB-ACZMJKKPSA-N 0.000 description 1
- GZBKRJVCRMZAST-XKBZYTNZSA-N Ser-Glu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GZBKRJVCRMZAST-XKBZYTNZSA-N 0.000 description 1
- SNVIOQXAHVORQM-WDSKDSINSA-N Ser-Gly-Gln Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O SNVIOQXAHVORQM-WDSKDSINSA-N 0.000 description 1
- WSTIOCFMWXNOCX-YUMQZZPRSA-N Ser-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CO)N WSTIOCFMWXNOCX-YUMQZZPRSA-N 0.000 description 1
- UIGMAMGZOJVTDN-WHFBIAKZSA-N Ser-Gly-Ser Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O UIGMAMGZOJVTDN-WHFBIAKZSA-N 0.000 description 1
- BKZYBLLIBOBOOW-GHCJXIJMSA-N Ser-Ile-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O BKZYBLLIBOBOOW-GHCJXIJMSA-N 0.000 description 1
- IFPBAGJBHSNYPR-ZKWXMUAHSA-N Ser-Ile-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O IFPBAGJBHSNYPR-ZKWXMUAHSA-N 0.000 description 1
- JIPVNVNKXJLFJF-BJDJZHNGSA-N Ser-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N JIPVNVNKXJLFJF-BJDJZHNGSA-N 0.000 description 1
- ZOPISOXXPQNOCO-SVSWQMSJSA-N Ser-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CO)N ZOPISOXXPQNOCO-SVSWQMSJSA-N 0.000 description 1
- VMLONWHIORGALA-SRVKXCTJSA-N Ser-Leu-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]([NH3+])CO VMLONWHIORGALA-SRVKXCTJSA-N 0.000 description 1
- MUJQWSAWLLRJCE-KATARQTJSA-N Ser-Leu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MUJQWSAWLLRJCE-KATARQTJSA-N 0.000 description 1
- HDBOEVPDIDDEPC-CIUDSAMLSA-N Ser-Lys-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O HDBOEVPDIDDEPC-CIUDSAMLSA-N 0.000 description 1
- XUDRHBPSPAPDJP-SRVKXCTJSA-N Ser-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO XUDRHBPSPAPDJP-SRVKXCTJSA-N 0.000 description 1
- JUTGONBTALQWMK-NAKRPEOUSA-N Ser-Met-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CO)N JUTGONBTALQWMK-NAKRPEOUSA-N 0.000 description 1
- ZSLFCBHEINFXRS-LPEHRKFASA-N Ser-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N ZSLFCBHEINFXRS-LPEHRKFASA-N 0.000 description 1
- KZPRPBLHYMZIMH-MXAVVETBSA-N Ser-Phe-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KZPRPBLHYMZIMH-MXAVVETBSA-N 0.000 description 1
- RRVFEDGUXSYWOW-BZSNNMDCSA-N Ser-Phe-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O RRVFEDGUXSYWOW-BZSNNMDCSA-N 0.000 description 1
- ADJDNJCSPNFFPI-FXQIFTODSA-N Ser-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO ADJDNJCSPNFFPI-FXQIFTODSA-N 0.000 description 1
- NUEHQDHDLDXCRU-GUBZILKMSA-N Ser-Pro-Arg Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O NUEHQDHDLDXCRU-GUBZILKMSA-N 0.000 description 1
- BSXKBOUZDAZXHE-CIUDSAMLSA-N Ser-Pro-Glu Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O BSXKBOUZDAZXHE-CIUDSAMLSA-N 0.000 description 1
- FLONGDPORFIVQW-XGEHTFHBSA-N Ser-Pro-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO FLONGDPORFIVQW-XGEHTFHBSA-N 0.000 description 1
- ILZAUMFXKSIUEF-SRVKXCTJSA-N Ser-Ser-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ILZAUMFXKSIUEF-SRVKXCTJSA-N 0.000 description 1
- PCJLFYBAQZQOFE-KATARQTJSA-N Ser-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N)O PCJLFYBAQZQOFE-KATARQTJSA-N 0.000 description 1
- DYEGLQRVMBWQLD-IXOXFDKPSA-N Ser-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CO)N)O DYEGLQRVMBWQLD-IXOXFDKPSA-N 0.000 description 1
- BEBVVQPDSHHWQL-NRPADANISA-N Ser-Val-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O BEBVVQPDSHHWQL-NRPADANISA-N 0.000 description 1
- LGIMRDKGABDMBN-DCAQKATOSA-N Ser-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N LGIMRDKGABDMBN-DCAQKATOSA-N 0.000 description 1
- MQCPGOZXFSYJPS-KZVJFYERSA-N Thr-Ala-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O MQCPGOZXFSYJPS-KZVJFYERSA-N 0.000 description 1
- FQPQPTHMHZKGFM-XQXXSGGOSA-N Thr-Ala-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O FQPQPTHMHZKGFM-XQXXSGGOSA-N 0.000 description 1
- VFEHSAJCWWHDBH-RHYQMDGZSA-N Thr-Arg-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O VFEHSAJCWWHDBH-RHYQMDGZSA-N 0.000 description 1
- CEXFELBFVHLYDZ-XGEHTFHBSA-N Thr-Arg-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O CEXFELBFVHLYDZ-XGEHTFHBSA-N 0.000 description 1
- SKHPKKYKDYULDH-HJGDQZAQSA-N Thr-Asn-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O SKHPKKYKDYULDH-HJGDQZAQSA-N 0.000 description 1
- JBHMLZSKIXMVFS-XVSYOHENSA-N Thr-Asn-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JBHMLZSKIXMVFS-XVSYOHENSA-N 0.000 description 1
- OJRNZRROAIAHDL-LKXGYXEUSA-N Thr-Asn-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O OJRNZRROAIAHDL-LKXGYXEUSA-N 0.000 description 1
- VXMHQKHDKCATDV-VEVYYDQMSA-N Thr-Asp-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VXMHQKHDKCATDV-VEVYYDQMSA-N 0.000 description 1
- GNHRVXYZKWSJTF-HJGDQZAQSA-N Thr-Asp-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N)O GNHRVXYZKWSJTF-HJGDQZAQSA-N 0.000 description 1
- QILPDQCTQZDHFM-HJGDQZAQSA-N Thr-Gln-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QILPDQCTQZDHFM-HJGDQZAQSA-N 0.000 description 1
- RKDFEMGVMMYYNG-WDCWCFNPSA-N Thr-Gln-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O RKDFEMGVMMYYNG-WDCWCFNPSA-N 0.000 description 1
- XXNLGZRRSKPSGF-HTUGSXCWSA-N Thr-Gln-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N)O XXNLGZRRSKPSGF-HTUGSXCWSA-N 0.000 description 1
- SHOMROOOQBDGRL-JHEQGTHGSA-N Thr-Glu-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SHOMROOOQBDGRL-JHEQGTHGSA-N 0.000 description 1
- BNGDYRRHRGOPHX-IFFSRLJSSA-N Thr-Glu-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)[C@@H](C)O)C(O)=O BNGDYRRHRGOPHX-IFFSRLJSSA-N 0.000 description 1
- AQAMPXBRJJWPNI-JHEQGTHGSA-N Thr-Gly-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AQAMPXBRJJWPNI-JHEQGTHGSA-N 0.000 description 1
- ADPHPKGWVDHWML-PPCPHDFISA-N Thr-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N ADPHPKGWVDHWML-PPCPHDFISA-N 0.000 description 1
- IHAPJUHCZXBPHR-WZLNRYEVSA-N Thr-Ile-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N IHAPJUHCZXBPHR-WZLNRYEVSA-N 0.000 description 1
- IMDMLDSVUSMAEJ-HJGDQZAQSA-N Thr-Leu-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IMDMLDSVUSMAEJ-HJGDQZAQSA-N 0.000 description 1
- MEJHFIOYJHTWMK-VOAKCMCISA-N Thr-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)[C@@H](C)O MEJHFIOYJHTWMK-VOAKCMCISA-N 0.000 description 1
- MECLEFZMPPOEAC-VOAKCMCISA-N Thr-Leu-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MECLEFZMPPOEAC-VOAKCMCISA-N 0.000 description 1
- PRNGXSILMXSWQQ-OEAJRASXSA-N Thr-Leu-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PRNGXSILMXSWQQ-OEAJRASXSA-N 0.000 description 1
- SCSVNSNWUTYSFO-WDCWCFNPSA-N Thr-Lys-Glu Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O SCSVNSNWUTYSFO-WDCWCFNPSA-N 0.000 description 1
- ZXIHABSKUITPTN-IXOXFDKPSA-N Thr-Lys-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N)O ZXIHABSKUITPTN-IXOXFDKPSA-N 0.000 description 1
- KKPOGALELPLJTL-MEYUZBJRSA-N Thr-Lys-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 KKPOGALELPLJTL-MEYUZBJRSA-N 0.000 description 1
- DXPURPNJDFCKKO-RHYQMDGZSA-N Thr-Lys-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)[C@@H](C)O)C(O)=O DXPURPNJDFCKKO-RHYQMDGZSA-N 0.000 description 1
- WRUWXBBEFUTJOU-XGEHTFHBSA-N Thr-Met-Ser Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)O)N)O WRUWXBBEFUTJOU-XGEHTFHBSA-N 0.000 description 1
- WYLAVUAWOUVUCA-XVSYOHENSA-N Thr-Phe-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O WYLAVUAWOUVUCA-XVSYOHENSA-N 0.000 description 1
- MXDOAJQRJBMGMO-FJXKBIBVSA-N Thr-Pro-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O MXDOAJQRJBMGMO-FJXKBIBVSA-N 0.000 description 1
- IWAVRIPRTCJAQO-HSHDSVGOSA-N Thr-Pro-Trp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O IWAVRIPRTCJAQO-HSHDSVGOSA-N 0.000 description 1
- DOBIBIXIHJKVJF-XKBZYTNZSA-N Thr-Ser-Gln Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(N)=O DOBIBIXIHJKVJF-XKBZYTNZSA-N 0.000 description 1
- ABCLYRRGTZNIFU-BWAGICSOSA-N Thr-Tyr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O ABCLYRRGTZNIFU-BWAGICSOSA-N 0.000 description 1
- OGOYMQWIWHGTGH-KZVJFYERSA-N Thr-Val-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O OGOYMQWIWHGTGH-KZVJFYERSA-N 0.000 description 1
- XGFYGMKZKFRGAI-RCWTZXSCSA-N Thr-Val-Arg Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N XGFYGMKZKFRGAI-RCWTZXSCSA-N 0.000 description 1
- 108091023040 Transcription factor Proteins 0.000 description 1
- 102000040945 Transcription factor Human genes 0.000 description 1
- RSUXQZNWAOTBQF-XIRDDKMYSA-N Trp-Arg-Gln Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N RSUXQZNWAOTBQF-XIRDDKMYSA-N 0.000 description 1
- UKINEYBQXPMOJO-UBHSHLNASA-N Trp-Asn-Ser Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N UKINEYBQXPMOJO-UBHSHLNASA-N 0.000 description 1
- XZLHHHYSWIYXHD-XIRDDKMYSA-N Trp-Gln-Arg Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O XZLHHHYSWIYXHD-XIRDDKMYSA-N 0.000 description 1
- MICFJCRQBFSKPA-UMPQAUOISA-N Trp-Met-Thr Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O)=CNC2=C1 MICFJCRQBFSKPA-UMPQAUOISA-N 0.000 description 1
- BURPTJBFWIOHEY-UWJYBYFXSA-N Tyr-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 BURPTJBFWIOHEY-UWJYBYFXSA-N 0.000 description 1
- DLZKEQQWXODGGZ-KWQFWETISA-N Tyr-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 DLZKEQQWXODGGZ-KWQFWETISA-N 0.000 description 1
- MBFJIHUHHCJBSN-AVGNSLFASA-N Tyr-Asn-Gln Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MBFJIHUHHCJBSN-AVGNSLFASA-N 0.000 description 1
- AYPAIRCDLARHLM-KKUMJFAQSA-N Tyr-Asn-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O AYPAIRCDLARHLM-KKUMJFAQSA-N 0.000 description 1
- SLCSPPCQWUHPPO-JYJNAYRXSA-N Tyr-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 SLCSPPCQWUHPPO-JYJNAYRXSA-N 0.000 description 1
- NJLQMKZSXYQRTO-FHWLQOOXSA-N Tyr-Glu-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=C(O)C=C1 NJLQMKZSXYQRTO-FHWLQOOXSA-N 0.000 description 1
- GFJXBLSZOFWHAW-JYJNAYRXSA-N Tyr-His-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(O)=O GFJXBLSZOFWHAW-JYJNAYRXSA-N 0.000 description 1
- KIJLSRYAUGGZIN-CFMVVWHZSA-N Tyr-Ile-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O KIJLSRYAUGGZIN-CFMVVWHZSA-N 0.000 description 1
- QSFJHIRIHOJRKS-ULQDDVLXSA-N Tyr-Leu-Arg Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QSFJHIRIHOJRKS-ULQDDVLXSA-N 0.000 description 1
- BSCBBPKDVOZICB-KKUMJFAQSA-N Tyr-Leu-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O BSCBBPKDVOZICB-KKUMJFAQSA-N 0.000 description 1
- NKUGCYDFQKFVOJ-JYJNAYRXSA-N Tyr-Leu-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NKUGCYDFQKFVOJ-JYJNAYRXSA-N 0.000 description 1
- TYFLVOUZHQUBGM-IHRRRGAJSA-N Tyr-Ser-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 TYFLVOUZHQUBGM-IHRRRGAJSA-N 0.000 description 1
- SQUMHUZLJDUROQ-YDHLFZDLSA-N Tyr-Val-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O SQUMHUZLJDUROQ-YDHLFZDLSA-N 0.000 description 1
- ABSXSJZNRAQDDI-KJEVXHAQSA-N Tyr-Val-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ABSXSJZNRAQDDI-KJEVXHAQSA-N 0.000 description 1
- IZFVRRYRMQFVGX-NRPADANISA-N Val-Ala-Gln Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N IZFVRRYRMQFVGX-NRPADANISA-N 0.000 description 1
- RUCNAYOMFXRIKJ-DCAQKATOSA-N Val-Ala-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN RUCNAYOMFXRIKJ-DCAQKATOSA-N 0.000 description 1
- JIODCDXKCJRMEH-NHCYSSNCSA-N Val-Arg-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N JIODCDXKCJRMEH-NHCYSSNCSA-N 0.000 description 1
- LIQJSDDOULTANC-QSFUFRPTSA-N Val-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](C(C)C)N LIQJSDDOULTANC-QSFUFRPTSA-N 0.000 description 1
- QGFPYRPIUXBYGR-YDHLFZDLSA-N Val-Asn-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N QGFPYRPIUXBYGR-YDHLFZDLSA-N 0.000 description 1
- PVPAOIGJYHVWBT-KKHAAJSZSA-N Val-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](C(C)C)N)O PVPAOIGJYHVWBT-KKHAAJSZSA-N 0.000 description 1
- HZYOWMGWKKRMBZ-BYULHYEWSA-N Val-Asp-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N HZYOWMGWKKRMBZ-BYULHYEWSA-N 0.000 description 1
- NYTKXWLZSNRILS-IFFSRLJSSA-N Val-Gln-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](C(C)C)N)O NYTKXWLZSNRILS-IFFSRLJSSA-N 0.000 description 1
- CVIXTAITYJQMPE-LAEOZQHASA-N Val-Glu-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CVIXTAITYJQMPE-LAEOZQHASA-N 0.000 description 1
- GBESYURLQOYWLU-LAEOZQHASA-N Val-Glu-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N GBESYURLQOYWLU-LAEOZQHASA-N 0.000 description 1
- ZXAGTABZUOMUDO-GVXVVHGQSA-N Val-Glu-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZXAGTABZUOMUDO-GVXVVHGQSA-N 0.000 description 1
- XWYUBUYQMOUFRQ-IFFSRLJSSA-N Val-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N)O XWYUBUYQMOUFRQ-IFFSRLJSSA-N 0.000 description 1
- JVYIGCARISMLMV-HOCLYGCPSA-N Val-Gly-Trp Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N JVYIGCARISMLMV-HOCLYGCPSA-N 0.000 description 1
- FTKXYXACXYOHND-XUXIUFHCSA-N Val-Ile-Leu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O FTKXYXACXYOHND-XUXIUFHCSA-N 0.000 description 1
- AGXGCFSECFQMKB-NHCYSSNCSA-N Val-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N AGXGCFSECFQMKB-NHCYSSNCSA-N 0.000 description 1
- WBAJDGWKRIHOAC-GVXVVHGQSA-N Val-Lys-Gln Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O WBAJDGWKRIHOAC-GVXVVHGQSA-N 0.000 description 1
- ZRSZTKTVPNSUNA-IHRRRGAJSA-N Val-Lys-Leu Chemical compound CC(C)C[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)C(C)C)C(O)=O ZRSZTKTVPNSUNA-IHRRRGAJSA-N 0.000 description 1
- UEPLNXPLHJUYPT-AVGNSLFASA-N Val-Met-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(O)=O UEPLNXPLHJUYPT-AVGNSLFASA-N 0.000 description 1
- GQMNEJMFMCJJTD-NHCYSSNCSA-N Val-Pro-Gln Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O GQMNEJMFMCJJTD-NHCYSSNCSA-N 0.000 description 1
- LTTQCQRTSHJPPL-ZKWXMUAHSA-N Val-Ser-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)O)C(=O)O)N LTTQCQRTSHJPPL-ZKWXMUAHSA-N 0.000 description 1
- SUGRIIAOLCDLBD-ZOBUZTSGSA-N Val-Trp-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC(=O)O)C(=O)O)N SUGRIIAOLCDLBD-ZOBUZTSGSA-N 0.000 description 1
- VTIAEOKFUJJBTC-YDHLFZDLSA-N Val-Tyr-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N VTIAEOKFUJJBTC-YDHLFZDLSA-N 0.000 description 1
- GUIYPEKUEMQBIK-JSGCOSHPSA-N Val-Tyr-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)NCC(O)=O GUIYPEKUEMQBIK-JSGCOSHPSA-N 0.000 description 1
- 230000009471 action Effects 0.000 description 1
- 230000003213 activating effect Effects 0.000 description 1
- 108010041407 alanylaspartic acid Proteins 0.000 description 1
- 230000004075 alteration Effects 0.000 description 1
- 238000004458 analytical method Methods 0.000 description 1
- 108010068380 arginylarginine Proteins 0.000 description 1
- 108010036533 arginylvaline Proteins 0.000 description 1
- 229930189065 blasticidin Natural products 0.000 description 1
- 229910002092 carbon dioxide Inorganic materials 0.000 description 1
- 239000001569 carbon dioxide Substances 0.000 description 1
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 1
- 238000004113 cell culture Methods 0.000 description 1
- 210000003855 cell nucleus Anatomy 0.000 description 1
- 238000005119 centrifugation Methods 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 230000013020 embryo development Effects 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 108010048367 enhanced green fluorescent protein Proteins 0.000 description 1
- 238000001976 enzyme digestion Methods 0.000 description 1
- 238000002474 experimental method Methods 0.000 description 1
- 239000012091 fetal bovine serum Substances 0.000 description 1
- 108010049041 glutamylalanine Proteins 0.000 description 1
- 108010050475 glycyl-leucyl-tyrosine Proteins 0.000 description 1
- 108010089804 glycyl-threonine Proteins 0.000 description 1
- 108010087823 glycyltyrosine Proteins 0.000 description 1
- 108010037850 glycylvaline Proteins 0.000 description 1
- 102000056478 human LDB1 Human genes 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 108010044056 leucyl-phenylalanine Proteins 0.000 description 1
- 108010000761 leucylarginine Proteins 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 238000010369 molecular cloning Methods 0.000 description 1
- 239000002773 nucleotide Substances 0.000 description 1
- 125000003729 nucleotide group Chemical group 0.000 description 1
- 230000000737 periodic effect Effects 0.000 description 1
- 108010084572 phenylalanyl-valine Proteins 0.000 description 1
- 108010024607 phenylalanylalanine Proteins 0.000 description 1
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 1
- 230000008569 process Effects 0.000 description 1
- 108010070643 prolylglutamic acid Proteins 0.000 description 1
- 229950010131 puromycin Drugs 0.000 description 1
- 230000006798 recombination Effects 0.000 description 1
- 238000005215 recombination Methods 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 238000010839 reverse transcription Methods 0.000 description 1
- 239000008223 sterile water Substances 0.000 description 1
- 108700029760 synthetic LTSP Proteins 0.000 description 1
- 230000001052 transient effect Effects 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases [RNase]; Deoxyribonucleases [DNase]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/113—Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/20—Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPR]
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Genetics & Genomics (AREA)
- Engineering & Computer Science (AREA)
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biomedical Technology (AREA)
- Wood Science & Technology (AREA)
- Zoology (AREA)
- Biotechnology (AREA)
- General Engineering & Computer Science (AREA)
- Molecular Biology (AREA)
- Microbiology (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- Biophysics (AREA)
- Plant Pathology (AREA)
- Medicinal Chemistry (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Preparation Of Compounds By Using Micro-Organisms (AREA)
Abstract
Description
Technical Field
The present invention relates to the field of biotechnology, and in particular to a fusion protein and uses thereof.
Background Art
The folding of chromatin in the cell nucleus and the interactions between different chromatin regions give rise to the three-dimensional structure of chromatin, which in turn plays a key role in gene expression. Promoters and enhancers are cis-acting elements that regulate gene expression. A promoter is a DNA region immediately upstream of a gene that recruits transcription factors to initiate gene expression. An enhancer is a sequence located some distance upstream or downstream of its target gene; it activates or increases expression of the target gene by interacting with the target gene's promoter through DNA looping. Studies of different stages of embryonic development have described in greater detail how dynamic changes in enhancers and enhancer looping regulate the spatiotemporal expression of genes during development. The most thoroughly studied example is the human β-globin gene cluster, whose long-range enhancer is known as the locus control region (LCR). During development, the LCR loops to the promoters of different genes in turn, sequentially driving expression of embryonic ε-globin (HBE), fetal γ-globin (HBG), and adult δ-globin (HBD) and β-globin (HBB). Artificial looping is therefore a feasible strategy for studying how enhancers regulate endogenous gene expression, and it may even have potential in disease treatment.
Summary of the Invention
In view of the above shortcomings of the prior art, the object of the present invention is to provide a fusion protein and uses thereof that solve the problems of the prior art.
To achieve the above and other related objects, one aspect of the present invention provides a fusion protein comprising a homodimer fragment and a dCas9 protein fragment, wherein the homodimer fragment comprises an LDB1 protein fragment or a dimer domain fragment of the LDB1 protein.
In some embodiments of the present invention, the amino acid sequence of the LDB1 protein fragment comprises:
a) the amino acid sequence shown in SEQ ID NO. 1; or
b) an amino acid sequence that has at least 80% sequence similarity to SEQ ID NO. 1 and has the function of the amino acid sequence defined in a), preferably the ability to form a homodimer.
In some embodiments of the present invention, the amino acid sequence of the dimer domain fragment of the LDB1 protein comprises:
c) the amino acid sequence shown in SEQ ID NO. 2; or
d) an amino acid sequence that has at least 80% sequence similarity to SEQ ID NO. 2 and has the function of the amino acid sequence defined in c), preferably the ability to form a homodimer.
In some embodiments of the present invention, the amino acid sequence of the dCas9 protein fragment comprises:
e) the amino acid sequence shown in SEQ ID NO. 3; or
f) an amino acid sequence that has at least 80% sequence similarity to SEQ ID NO. 3 and has the function of the amino acid sequence defined in e), preferably the ability to cooperate with an sgRNA that specifically targets a site.
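The 80% threshold used in items b), d) and f) can be illustrated with a minimal Python sketch. It assumes that the two sequences have already been aligned (gaps written as `-`) and it computes simple percent identity, which is a stricter stand-in for the "sequence similarity" named in the text; the function names and the toy sequences are hypothetical and are not taken from the sequence listing. The functional requirement (for example, homodimer formation) cannot be checked this way and would still need an experiment.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Skip columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only if both residues are identical and not gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the aligned pair reaches the threshold (80% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy 10-residue peptides (placeholders) that differ at two positions.
print(percent_identity("MKTAYIAKQR", "MKTAYLAKQK"))
print(meets_threshold("MKTAYIAKQR", "MKTAYLAKQK"))
```

In practice the alignment itself would come from a dedicated alignment tool, and a similarity score based on a substitution matrix may give a different value from plain identity.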
In some embodiments of the present invention, the fusion protein comprises, in order from the N-terminus to the C-terminus (corresponding to the 5'-to-3' direction of its coding sequence), the homodimer fragment and the dCas9 protein fragment.
In some embodiments of the present invention, the fusion protein comprises, in order from the N-terminus to the C-terminus (corresponding to the 5'-to-3' direction of its coding sequence), the dCas9 protein fragment and the homodimer fragment.
In some embodiments of the present invention, the fusion protein further comprises a flexible linker peptide located between the homodimer fragment and the dCas9 protein fragment. Preferably, the amino acid sequence of the flexible linker peptide is shown in SEQ ID NO. 41 or 42.
In some embodiments of the present invention, the amino acid sequence of the fusion protein is shown in one of SEQ ID NOs. 37 to 40.
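The two fragment orders and the optional linker described above can be sketched as a small Python helper. This is only an illustration of the arrangement: `build_fusion` is a hypothetical name, and the short strings are placeholders, not the sequences of SEQ ID NOs. 1 to 3 or of the linkers in SEQ ID NOs. 41 and 42.

```python
from typing import Optional


def build_fusion(dimer_fragment: str,
                 dcas9_fragment: str,
                 linker: Optional[str] = None,
                 dimer_first: bool = True) -> str:
    """Concatenate the two fragments, written from N-terminus to C-terminus.

    dimer_first=True gives the homodimer fragment followed by dCas9;
    dimer_first=False gives dCas9 followed by the homodimer fragment.
    The linker, if any, is placed between the two fragments.
    """
    if dimer_first:
        parts = [dimer_fragment, dcas9_fragment]
    else:
        parts = [dcas9_fragment, dimer_fragment]
    return (linker or "").join(parts)


# Placeholder fragments and linker (assumptions for illustration only).
DIMER = "AAA"
DCAS9 = "CCC"
LINKER = "GGGGS"

print(build_fusion(DIMER, DCAS9, LINKER))                     # homodimer fragment first
print(build_fusion(DIMER, DCAS9, LINKER, dimer_first=False))  # dCas9 first
```

Swapping the full-length LDB1 fragment for its dimer domain in `dimer_fragment` yields the other two arrangements compared in FIG. 4 (DD-dCas9 and dCas9-DD).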
Another aspect of the present invention provides an isolated polynucleotide encoding the fusion protein.
Another aspect of the present invention provides a DNA looping system comprising the fusion protein, an sgRNA targeting a promoter, and an sgRNA targeting an enhancer.
In some embodiments of the present invention, the promoter-targeting sgRNA targets the region 100 to 200 bp upstream of the transcription start site (TSS) of the gene (positions -100 to -200).
In some embodiments of the present invention, the promoter-targeting sgRNA has the sequence feature gnnnnnnnnnnnnnnnnnnnNGG (SEQ ID NO. 43).
In some embodiments of the present invention, the GC content of the promoter-targeting sgRNA is between 40% and 60%.
In some embodiments of the present invention, the promoter-targeting sgRNA targets the promoter region of the HBB gene. Preferably, the sequence of the promoter-targeting sgRNA is shown in SEQ ID NOs. 4 to 6.
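The three selection criteria for the promoter-targeting sgRNA (position relative to the TSS, the g-N19-NGG pattern, and 40-60% GC content) can be combined in a short Python sketch. Several points are assumptions rather than statements from the text: the pattern is read as a 5' G, 19 arbitrary bases, and an NGG PAM; the whole 23-nt site is required to lie inside the -200 to -100 window; GC content is computed on the 20-nt protospacer; and only the forward strand is scanned. The function names and the test sequence are hypothetical.

```python
import re

# Assumed reading of SEQ ID NO. 43: a 5' G, 19 arbitrary bases, then an NGG PAM.
# The lookahead makes the match zero-width, so overlapping sites are all reported.
SITE = re.compile(r"(?=(G[ACGT]{19})([ACGT]GG))")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a non-empty sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_promoter_sites(seq, tss_index, near=100, far=200,
                        gc_min=0.40, gc_max=0.60):
    """Return (offset, protospacer, pam) tuples for forward-strand candidates.

    seq       -- forward-strand DNA that contains the promoter (hypothetical input)
    tss_index -- 0-based index of the TSS within seq
    offset    -- first base of the site relative to the TSS (negative = upstream)
    """
    seq = seq.upper()
    hits = []
    for match in SITE.finditer(seq):
        protospacer, pam = match.group(1), match.group(2)
        first = match.start() - tss_index                 # first base of the site
        last = first + len(protospacer) + len(pam) - 1    # last base of the PAM
        in_window = -far <= first and last <= -near
        if in_window and gc_min <= gc_fraction(protospacer) <= gc_max:
            hits.append((first, protospacer, pam))
    return hits


# Synthetic example: one candidate site placed 150 bp upstream of the TSS.
site = "G" + "ACGT" * 4 + "ACG" + "TGG"
sequence = "A" * 50 + site + "A" * 127
print(find_promoter_sites(sequence, tss_index=200))
```

To cover both strands, the same scan could be run on the reverse complement with the offsets mapped back to forward-strand coordinates. Candidates found this way would still need off-target checks and experimental validation.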
In some embodiments of the present invention, the enhancer-targeting sgRNA targets a DNase I hypersensitive site (DHS) region of the enhancer.
In some embodiments of the present invention, the enhancer-targeting sgRNA targets the vicinity of DHS2 in the LCR of the β-globin locus.
In some embodiments of the present invention, the sequence of the enhancer-targeting sgRNA is shown in SEQ ID NOs. 7 to 9.
Another aspect of the present invention provides an expression system comprising a host cell capable of expressing the fusion protein, the promoter-targeting sgRNA, and the enhancer-targeting sgRNA.
In some embodiments of the present invention, the expression system comprises a host cell containing an expression vector that carries a polynucleotide encoding the fusion protein, or a host cell in which a polynucleotide encoding the fusion protein is integrated into the chromosome.
In some embodiments of the present invention, the expression system comprises a host cell containing an expression vector that carries a polynucleotide encoding the promoter-targeting sgRNA, or a host cell in which a polynucleotide encoding the promoter-targeting sgRNA is integrated into the chromosome.
In some embodiments of the present invention, the expression system comprises a host cell containing an expression vector that carries a polynucleotide encoding the enhancer-targeting sgRNA, or a host cell in which a polynucleotide encoding the enhancer-targeting sgRNA is integrated into the chromosome.
In some embodiments of the present invention, the expression system further comprises a host cell capable of expressing the target gene.
In some embodiments of the present invention, the host cell is a eukaryotic cell.
In some embodiments of the present invention, the host cell is selected from metazoan-derived primary cells and immortalized cell lines.
In some embodiments of the present invention, the host cell is selected from hematopoietic cell lines.
In some embodiments of the present invention, the host cell is a human K562 cell.
Another aspect of the present invention provides uses of the DNA cyclization molecule, the polynucleotide, the looping system, and the expression system in gene expression.
In some embodiments of the present invention, the use in gene expression is use in gene expression in a eukaryote.
In some embodiments of the present invention, the eukaryote is a metazoan.
In some embodiments of the present invention, the eukaryote is selected from one or more of human, mouse, nematode, and fruit fly.
Another aspect of the present invention provides a gene expression method comprising: bringing the targeted sites closer together in three-dimensional space by means of the fusion protein or the looping system, and thereby expressing the gene.
In some embodiments of the present invention, the gene expression method comprises: culturing, under appropriate conditions and in the presence of the looping system, a host cell capable of expressing the target gene.
In some embodiments of the present invention, the gene expression method is an in vitro gene expression method.
In some embodiments of the present invention, the gene expression method comprises: culturing the expression system under appropriate conditions.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of the reprogramming of the spatial position of a target gene by LDB1-dCas9-mediated DNA looping according to the present invention.
FIG. 2 shows the changes in expression of each gene in the β-globin gene cluster after DNA looping mediated by LDB1-dCas9 and dCas9-LDB1 according to the present invention.
FIG. 3 shows the expression of the other globin genes in the gene cluster according to the present invention.
FIG. 4 compares the efficiency with which LDB1-dCas9, dCas9-LDB1, dCas9-DD, and DD-dCas9 of the present invention activate the HBB gene.
Detailed Description
After extensive exploratory research, the inventors provide a novel DNA cyclization molecule comprising a fusion protein formed from LDB1 and dCas9, which can regulate gene expression by reprogramming the spatial position of a gene. The present invention was completed on this basis.
A first aspect of the present invention provides a fusion protein comprising a homodimer fragment and a dCas9 protein fragment, wherein the homodimer fragment comprises an LDB1 protein fragment or a dimer domain (DD) fragment of the LDB1 protein. The cyclization molecule provided by the present invention is generally a fusion protein that regulates gene expression by reprogramming the spatial position of a gene. Because of the homodimer fragment, only one CRISPR-Cas9 system is needed to target two sites simultaneously and form a loop. This brings the promoter, the enhancer, and the target gene closer together in space and improves the efficiency of DNA looping, so as to regulate expression of the target gene.
In the fusion protein provided by the present invention, the amino acid sequence of the LDB1 protein fragment may include: a) the amino acid sequence shown in SEQ ID NO.1; or b) an amino acid sequence that has 80% or more sequence similarity to SEQ ID NO.1 and has the function of the amino acid sequence defined in a). Specifically, the amino acid sequence in b) refers to a polypeptide fragment that is obtained from the amino acid sequence shown in SEQ ID NO.1 by substituting, deleting, or adding one or more (specifically 1-50, 1-30, 1-20, 1-10, 1-5, or 1-3) amino acids, or by adding one or more (specifically 1-50, 1-30, 1-20, 1-10, 1-5, or 1-3) amino acids at the N-terminus and/or C-terminus, and that retains the function of the polypeptide fragment shown in SEQ ID NO.1, for example, the ability to form a homodimer. The amino acid sequence in b) may have 80%, 85%, 90%, 93%, 95%, 97%, or 99% or more similarity to SEQ ID NO.1.
MLDRDVGPTPMYPPTYLEPGIGRHTPYGNQTDYRIFELNKRLQNWTEECDNLWWDAFTTEFFEDDAMLTITFCLEDGPKRYTIGRTLIPRYFRSIFEGGATELYYVLKHPKEAFHSNFVSLDCDQGSMVTQHGKPMFTQVCVEGRLYLEFMFDDMMRIKTWHFSIRQHRELIPRSILAMHAQDPQMLDQLSKNITRCGLSNSTLNYLRLCVILEPMQELMSRHKTYSLSPRDCLKTCLFQKWQRMVAPPAEPTRQQPSKRRKRKMSGGSTMSSGGGNTNNSNSKKKSPASTFALSSQVPDVMVVGEPTLMGGEFGDEDERLITRLENTQFDAANGIDDEDSFNNSPALGANSPWNSKPPSSQESKSENPTSQASQ (SEQ ID NO.1)
In the fusion protein provided by the present invention, the Dimer domain fragment of the LDB1 protein is the portion of LDB1 that forms the homodimer. The amino acid sequence of the Dimer domain fragment may include: c) the amino acid sequence shown in SEQ ID NO.2; or d) an amino acid sequence that has 80% or more sequence similarity to SEQ ID NO.2 and has the function of the amino acid sequence defined in c). Specifically, the amino acid sequence in d) refers to a polypeptide fragment that is obtained from the amino acid sequence shown in SEQ ID NO.2 by substituting, deleting, or adding one or more (specifically 1-50, 1-30, 1-20, 1-10, 1-5, or 1-3) amino acids, or by adding one or more (specifically 1-50, 1-30, 1-20, 1-10, 1-5, or 1-3) amino acids at the N-terminus and/or C-terminus, and that retains the function of the polypeptide fragment shown in SEQ ID NO.2, for example, the ability to form a homodimer. The amino acid sequence in d) may have 80%, 85%, 90%, 93%, 95%, 97%, or 99% or more similarity to SEQ ID NO.2.
MLDRDVGPTPMYPPTYLEPGIGRHTPYGNQTDYRIFELNKRLQNWTEECDNLWWDAFTTEFFEDDAMLTITFCLEDGPKRYTIGRTLIPRYFRSIFEGGATELYYVLKHPKEAFHSNFVSLDCDQGSMVTQHGKPMFTQVCVEGRLYLEFMFDDMMRIKTWHFSIRQHRELIPRSILAMHAQDPQMLDQLSKNITRCGLS (SEQ ID NO.2)
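The 80% threshold described for variants of SEQ ID NO.1 and SEQ ID NO.2 can be screened computationally before any functional test. The following is a minimal sketch, not the method of the patent: it assumes that percent identity over a simple Needleman-Wunsch global alignment (match +1, mismatch -1, gap -1) is an acceptable stand-in for "sequence similarity", which the text does not define. The function names, scoring values, and the example variant are illustrative assumptions.

```python
def percent_identity(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1) -> float:
    """Percent identity over a Needleman-Wunsch global alignment.

    Identity = identical aligned columns / total alignment columns.
    The scoring scheme is an assumption chosen for simplicity.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back one optimal alignment, counting identical columns.
    i, j = n, m
    identical = columns = 0
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # residue of a aligned to a gap
        else:
            j -= 1  # residue of b aligned to a gap
        columns += 1
    return 100.0 * identical / columns if columns else 100.0


def meets_threshold(reference: str, candidate: str,
                    threshold: float = 80.0) -> bool:
    """True if the candidate reaches the identity threshold (default 80%)."""
    return percent_identity(reference, candidate) >= threshold


if __name__ == "__main__":
    reference = "MLDRDVGPTPMYPPTYLEPG"  # first 20 residues of SEQ ID NO.2, for brevity
    variant = "MLDRDVGPTPMYAPTYLEPG"    # hypothetical single substitution
    print(percent_identity(reference, variant), meets_threshold(reference, variant))
```

Identity is stricter than similarity, so a candidate that passes this check would also pass a similarity-based one under the same alignment. Passing the threshold says nothing about function: a variant would still need to be shown to form a homodimer.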
In the fusion protein provided by the present invention, the amino acid sequence of the dCas9 protein fragment may include: e) the amino acid sequence shown in SEQ ID NO.3; or f) an amino acid sequence that has 80% or more sequence similarity to SEQ ID NO.3 and has the function of the amino acid sequence defined in e). Specifically, the amino acid sequence in f) refers to a polypeptide fragment that is obtained from the amino acid sequence shown in SEQ ID NO.3 by substituting, deleting, or adding one or more (specifically 1-50, 1-30, 1-20, 1-10, 1-5, or 1-3) amino acids, or by adding one or more (specifically 1-50, 1-30, 1-20, 1-10, 1-5, or 1-3) amino acids at the N-terminus and/or C-terminus, and that retains the function of the polypeptide fragment shown in SEQ ID NO.3, for example, cooperating with an sgRNA specific for a targeted site (e.g., a promoter or an enhancer) to recognize that site, so that the three-dimensional spatial distance between the targeted sites can be shortened. The amino acid sequence in f) may have 80%, 85%, 90%, 93%, 95%, 97%, or 99% or more similarity to SEQ ID NO.3.
DKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKARGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGD (SEQ ID NO.3)
In the fusion protein provided by the present invention, the fusion protein may include, in order from the N-terminus to the C-terminus (corresponding to the 5'-to-3' direction of the coding sequence), the homodimer fragment and the dCas9 protein fragment; for example, the homodimer fragment may be linked to the amino terminus of the dCas9 protein. Alternatively, the fusion protein may include, in the same direction, the dCas9 protein fragment and then the homodimer fragment; for example, the homodimer fragment may be linked to the carboxyl terminus of the dCas9 protein.
In the fusion protein provided by the present invention, the fusion protein further includes a flexible linker peptide, which is generally located between the homodimer fragment and the dCas9 protein fragment. Those skilled in the art can select a suitable flexible linker peptide to connect the two fragments. For example, when the fusion protein includes the homodimer fragment followed by the dCas9 protein fragment, the amino acid sequence of the linker may be SGSETPGTSESATPES (SEQ ID NO.41). As another example, when the fusion protein includes the dCas9 protein fragment followed by the homodimer fragment, the amino acid sequence of the linker may be GRAGGGSGGGSGGGS (SEQ ID NO.42).
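The two fragment orders and their linkers described above can be summarized as a small assembly routine. This is an illustrative sketch only: the function name, the boolean flag, and the short placeholder fragments in the usage lines are assumptions, and the sketch models nothing beyond fragment, linker, fragment (it does not reproduce any other residues that appear in SEQ ID NO.37 to 40). Only the two linker sequences are taken from the text.

```python
# Linker sequences quoted in the description.
LINKER_DIMER_FIRST = "SGSETPGTSESATPES"  # SEQ ID NO.41, homodimer fragment before dCas9
LINKER_DCAS9_FIRST = "GRAGGGSGGGSGGGS"   # SEQ ID NO.42, dCas9 before homodimer fragment


def assemble_fusion(dimer: str, dcas9: str, dimer_at_n_terminus: bool) -> str:
    """Join a homodimer fragment and a dCas9 fragment with the matching linker.

    Sequences are written N-terminus to C-terminus. The pairing of linker
    and orientation follows the examples given in the description.
    """
    if dimer_at_n_terminus:
        return dimer + LINKER_DIMER_FIRST + dcas9
    return dcas9 + LINKER_DCAS9_FIRST + dimer


if __name__ == "__main__":
    # Hypothetical short placeholders standing in for SEQ ID NO.2 and SEQ ID NO.3.
    dimer_fragment = "MLDRDV"
    dcas9_fragment = "DKKYSI"
    print(assemble_fusion(dimer_fragment, dcas9_fragment, dimer_at_n_terminus=True))
    print(assemble_fusion(dimer_fragment, dcas9_fragment, dimer_at_n_terminus=False))
```

Keeping the orientation as an explicit argument makes it easy to build both variants of the same fragment pair for a side-by-side comparison.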
In a specific embodiment of the present invention, the amino acid sequence of the fusion protein may be as shown in any one of SEQ ID NO.37 to 40.
DKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKARGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDGSPKKKRKVGRAGGGSGGGSGGGSMLDRDVGPTPMYPPTYLEPGIGRHTPYGNQTDYRIFELNKRLQNWTEECDNLWWDAFTTEFFEDDAMLTITFCLEDGPKRYTIGRTLIPRYFRSIFEGGATELYYVLKHPKEAFHSNFVSLDCDQGSMVTQHGKPMFTQVCVEGRLYLEFMFDDMMRIKTWHFSIRQHRELIPRSILAMHAQDPQMLDQLSKNITRCGLSNSTLNYLRLCVILEPMQELMSRHKTYSLSPRDCLKTCLFQKWQRMVAPPAEPTRQQPSKRRKRKMSGGSTMSSGGGNTNNSNSKKKSPASTFALSSQVPDVMVVGEPTLMGGEFGDEDERLITRLENTQFDAANGIDDEDSFNNSPALGANSPWNSKPPSSQESKSENPTSQASQG (SEQ ID NO.37)
MLDRDVGPTPMYPPTYLEPGIGRHTPYGNQTDYRIFELNKRLQNWTEECDNLWWDAFTTEFFEDDAMLTITFCLEDGPKRYTIGRTLIPRYFRSIFEGGATELYYVLKHPKEAFHSNFVSLDCDQGSMVTQHGKPMFTQVCVEGRLYLEFMFDDMMRIKTWHFSIRQHRELIPRSILAMHAQDPQMLDQLSKNITRCGLSNSTLNYLRLCVILEPMQELMSRHKTYSLSPRDCLKTCLFQKWQRMVAPPAEPTRQQPSKRRKRKMSGGSTMSSGGGNTNNSNSKKKSPASTFALSSQVPDVMVVGEPTLMGGEFGDEDERLITRLENTQFDAANGIDDEDSFNNSPALGANSPWNSKPPSSQESKSENPTSQASQSGSETPGTSESATPESDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKARGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDGSPKKKRKV (SEQ ID NO.38)
MLDRDVGPTPMYPPTYLEPGIGRHTPYGNQTDYRIFELNKRLQNWTEECDNLWWDAFTTEFFEDDAMLTITFCLEDGPKRYTIGRTLIPRYFRSIFEGGATELYYVLKHPKEAFHSNFVSLDCDQGSMVTQHGKPMFTQVCVEGRLYLEFMFDDMMRIKTWHFSIRQHRELIPRSILAMHAQDPQMLDQLSKNITRCGLSSGSETPGTSESATPESDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKARGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDGSPKKKRKV (SEQ ID NO.39)
DKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKARGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDGSPKKKRKVGRAGGGSGGGSGGGSMLDRDVGPTPMYPPTYLEPGIGRHTPYGNQTDYRIFELNKRLQNWTEECDNLWWDAFTTEFFEDDAMLTITFCLEDGPKRYTIGRTLIPRYFRSIFEGGATELYYVLKHPKEAFHSNFVSLDCDQGSMVTQHGKPMFTQVCVEGRLYLEFMFDDMMRIKTWHFSIRQHRELIPRSILAMHAQDPQMLDQLSKNITRCGLSG (SEQ ID NO.40)
The second aspect of the present invention provides an isolated polynucleotide encoding the fusion protein provided by the first aspect of the present invention.
The third aspect of the present invention provides a DNA looping system, which includes the fusion protein provided by the first aspect of the present invention, an sgRNA targeting a promoter, and an sgRNA targeting an enhancer. Those skilled in the art can select a suitable promoter-targeting sgRNA and/or enhancer-targeting sgRNA according to the gene to be expressed. For example, the sequence of the promoter-targeting sgRNA is generally at least partially complementary to the promoter of the target gene, and the sequence of the enhancer-targeting sgRNA is generally at least partially complementary to the enhancer of the target gene, so that the dimer formed by the cyclization molecule shortens the three-dimensional spatial distance between the targeted sites.
In the DNA looping system provided by the present invention, the promoter-targeting sgRNA generally targets the region from -100 to -200 bp upstream of the transcription start site (TSS) of the gene. Its sequence is generally designed to have the feature gnnnnnnnnnnnnnnnnnnnNGG (SEQ ID NO.43), and its GC content is generally between 40% and 60%. In a specific embodiment of the present invention, the promoter-targeting sgRNA targets the promoter region of the HBB gene; specifically, its sequence may be as shown in SEQ ID NO.4 to 6.
In the DNA looping system provided by the present invention, the enhancer-targeting sgRNA may target the DHS (DNase Hypersensitive Site) region of the enhancer, specifically a position near or within the DHS. Its sequence is generally designed to have the feature gnnnnnnnnnnnnnnnnnnnNGG (SEQ ID NO.43), and its GC content is generally between 40% and 60%. In a specific embodiment of the present invention, the enhancer-targeting sgRNA targets an LCR region, preferably the LCR region of the β-globin gene cluster and more preferably the hypersensitive site DHS2 of that LCR region; specifically, its sequence may be as shown in SEQ ID NO.7 to 9.
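The design rules stated for both sgRNAs (the gnnnnnnnnnnnnnnnnnnnNGG feature and a GC content of 40% to 60%) lend themselves to a simple candidate scan. The sketch below is an assumption-laden illustration rather than the procedure used in the patent: it reads the feature as a 20-nt protospacer that starts with G, followed by an NGG PAM, scans only the strand it is given, and computes GC content over the 20-nt protospacer. The function names and the toy input sequence are hypothetical.

```python
import re
from typing import List, Tuple

# Lookahead so that overlapping candidates are all reported:
# a G plus 19 arbitrary bases (protospacer), then an NGG PAM.
_SITE = re.compile(r"(?=(G[ACGT]{19})([ACGT]GG))")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in an uppercase DNA string."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_sgrna_candidates(dna: str, gc_min: float = 0.40,
                          gc_max: float = 0.60) -> List[Tuple[int, str, str]]:
    """Return (start index, protospacer, PAM) for sites passing the GC filter.

    Only the given strand is scanned; to cover the opposite strand, pass
    the reverse complement of the region as a second call.
    """
    dna = dna.upper()
    hits = []
    for m in _SITE.finditer(dna):
        protospacer, pam = m.group(1), m.group(2)
        if gc_min <= gc_fraction(protospacer) <= gc_max:
            hits.append((m.start(), protospacer, pam))
    return hits


if __name__ == "__main__":
    # Toy sequence, not taken from the HBB promoter or the LCR.
    region = "TT" + "GACGTACGTACGTACGTACG" + "TGG" + "AA"
    for start, protospacer, pam in find_sgrna_candidates(region):
        print(start, protospacer, pam, round(gc_fraction(protospacer), 2))
```

In practice the input would be the window of interest, for example the -200 to -100 bp region upstream of the TSS for a promoter-targeting sgRNA, or the sequence in or near a DHS for an enhancer-targeting sgRNA. Candidates would still need an off-target check, which this sketch does not attempt.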
The fourth aspect of the present invention provides an expression system, which includes a host cell capable of expressing the fusion protein, the promoter-targeting sgRNA, and the enhancer-targeting sgRNA. The dimer formed by the fusion protein can thus shorten the three-dimensional spatial distance between the targeted sites so that the target gene is expressed. Methods for enabling the expression system to express the fusion protein and the two sgRNAs are known to those skilled in the art. For example, the expression system may include a host cell that contains an expression vector carrying a polynucleotide encoding the fusion protein, or a host cell with a polynucleotide encoding the fusion protein integrated into its chromosome. Likewise, the expression system may include a host cell that contains an expression vector carrying a polynucleotide encoding the promoter-targeting sgRNA, or a host cell with that polynucleotide integrated into its chromosome, and a host cell that contains an expression vector carrying a polynucleotide encoding the enhancer-targeting sgRNA, or a host cell with that polynucleotide integrated into its chromosome. The expression system may also include a host cell capable of expressing the target gene. In a specific embodiment of the present invention, the target gene may be a gene in the β-globin gene cluster, more specifically a silenced gene, and more specifically the HBB gene. In another specific embodiment of the present invention, the host cell may be a eukaryotic cell, more specifically a metazoan cell, and more specifically a primary cell or an immortalized cell line derived from a metazoan (including but not limited to human and mouse), for example a hematopoietic-lineage cell line, and more specifically human K562 cells. Those skilled in the art can select a suitable expression vector according to the type of host cell; for example, the expression vector may be, without limitation, a transient transfection vector such as pcDNA3.1 or pST1374, or a lentiviral vector. In the expression system, a given host cell may express one or more of the target gene, the fusion protein, the promoter-targeting sgRNA, and the enhancer-targeting sgRNA, so that the DNA looping system is formed within the expression system.
The fifth aspect of the present invention provides the use, in gene expression, of the DNA looping molecule provided in the first aspect, the looping system provided in the second aspect, or the expression system provided in the third aspect, preferably in gene expression in eukaryotes. The eukaryote may be a metazoan, including but not limited to human, mouse, fruit fly, and nematode. In a specific embodiment of the present invention, the target gene to be expressed may be a gene in the β-globin gene cluster, more specifically a silent gene, and more specifically the HBB gene.
The sixth aspect of the present invention provides a gene expression method, which may be an in vitro gene expression method, comprising: shortening the three-dimensional spatial distance between the targeted sites by means of the fusion protein provided in the first aspect or the looping system provided in the second aspect, and carrying out gene expression. For example, the gene expression method may comprise culturing, under appropriate conditions and in the presence of the looping system, a host cell capable of expressing the target gene. As another example, the gene expression method may comprise culturing the expression system provided in the third aspect under appropriate conditions.
To address the shortcomings of existing DNA looping systems, the present invention fuses LDB1 with spdCas9 to form a new fusion protein, LDB1-dCas9. Compared with the prior art, the present invention acts by changing the three-dimensional structure of chromatin, without altering genomic sequence information or epigenetic modifications. It is simple to operate and has a short experimental preparation cycle, which greatly reduces time and labor costs. No small molecule needs to be added to induce dimer formation, and because the monomers form homodimers, a single CRISPR-Cas9 system is sufficient to target two sites simultaneously for looping. In addition, increasing the number of targeting sites on the gene improves the efficiency of DNA looping.
The embodiments of the present invention are described below through specific examples, and those skilled in the art can readily understand other advantages and effects of the present invention from the contents disclosed in this specification. The present invention can also be implemented or applied through other, different embodiments, and the details in this specification can be modified or changed in various ways based on different viewpoints and applications without departing from the spirit of the present invention.
Before the specific embodiments of the present invention are described further, it should be understood that the scope of protection of the present invention is not limited to the specific embodiments described below. It should also be understood that the terms used in the examples are intended to describe particular embodiments and not to limit the scope of protection of the present invention. In the specification and claims, unless the context clearly indicates otherwise, the singular forms "a", "an", and "the" include plural forms.
Where the examples give numerical ranges, it should be understood that, unless otherwise specified, both endpoints of each range and any value between them may be selected. Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the art. In addition to the specific methods, equipment, and materials used in the examples, any prior-art methods, equipment, and materials similar or equivalent to those described in the examples may also be used to practice the present invention, based on the skilled person's knowledge of the prior art and the disclosure herein.
Unless otherwise stated, the experimental methods, detection methods, and preparation methods disclosed in the present invention all use conventional techniques of molecular biology, biochemistry, chromatin structure and analysis, analytical chemistry, cell culture, recombinant DNA technology, and related fields. These techniques are well described in the literature; see Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, Second edition, Cold Spring Harbor Laboratory Press, 1989 and Third edition, 2001; Ausubel et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, New York, 1987 and periodic updates; the series METHODS IN ENZYMOLOGY, Academic Press, San Diego; Wolffe, CHROMATIN STRUCTURE AND FUNCTION, Third edition, Academic Press, San Diego, 1998; METHODS IN ENZYMOLOGY, Vol. 304, Chromatin (P. M. Wassarman and A. P. Wolffe, eds.), Academic Press, San Diego, 1999; and METHODS IN MOLECULAR BIOLOGY, Vol. 119, Chromatin Protocols (P. B. Becker, ed.), Humana Press, Totowa, 1999.
Example 1
Construction of the LDB1-dCas9 plasmid
The cDNA of human LDB1, whose nucleotide sequence (NM_001113407.2) is atgctggatagggatgtgggtccaactcccatgtatccgcctacatacctggagccagggattgggaggcacacaccatatggcaaccaaactgactacagaatatttgagcttaacaaacggcttcagaactggacagaggagtgtgacaatctctggtgggatgcattcacgactgagttctttgaggatgatgccatgttgaccatcactttctgcctggaggatggaccaaagagatataccattggccggaccctgatcccacgctacttccgcagcatctttgaggggggtgctacggagctgtactatgttcttaagcaccccaaggaggcattccacagcaactttgtgtccctcgactgtgaccagggcagcatggtgacccagcatggcaagcccatgttcacccaggtgtgtgtggagggccggttgtacctggagttcatgtttgacgacatgatgcggataaagacgtggcacttcagcatccggcagcaccgagagctcatcccccgcagcatccttgccatgcatgcccaagacccccagatgttggatcagctctccaaaaacatcactcggtgtgggctgtccaattccactctcaactacctccgactctgtgtgatactcgagcccatgcaagagctcatgtcacgccacaagacctacagcctcagcccccgcgactgcctcaagacctgccttttccagaagtggcagcgcatggtagcaccccctgcggagcccacacgtcagcagcccagcaaacggcggaaacggaagatgtcagggggcagcaccatgagctctggtggtggcaacaccaacaacagcaacagcaagaagaagagcccagctagcaccttcgccctctccagccaggtacctgatgtgatggtggtgggggagcccaccctgatgggcggggagttcggggacgaggacgagaggctcatcacccggctggagaacacccagtttgacgcagccaacggcattgacgacgaggacagctttaacaactcccctgcactgggcgccaacagcccctggaacagcaagcctccgtccagccaagaaagcaaatcggagaaccccacgtcacaggcctcccag (SEQ ID NO.37), was diluted to 10 μL and used as the PCR template. A forward primer carrying a NotI restriction site, gggacctaagaaaaagaggaaggtggcggccgctggcggcagcatgctggatagggatgtgggtccaactcccatgtatccg (SEQ ID NO.29), and a reverse primer carrying a KpnI restriction site, ctctcgggggtggcgctctcgctggtaccgggggtctcgctgccgctctgggaggcctgtgacgt (SEQ ID NO.30), were designed and dissolved in water to 10 μM. The LDB1 cDNA fragment was amplified using the Vazyme high-fidelity polymerase kit (Vazyme, p501-d2). The amplification system and PCR reaction conditions are as follows:
The PCR product was purified and recovered with the AxyPrep PCR Clean-up Kit (Axygen, AP-PCR-500G). Separately, 1 μg of the pST1374-N-NLS-flag-linker-dCas9 vector was digested with NotI-HF (NEB, R3189S) and KpnI-HF (NEB, R3142S) at 37°C for 2 h. The digestion system is as follows:
The digested product was recovered by gel extraction with the AxyPrep DNA Gel Extraction Kit (Axygen, AP-GX-250G). The PCR fragment and the digested vector fragment were joined by recombination using the Vazyme recombination kit (Vazyme, C112-01). The ligation system is as follows:
The ligation reaction was incubated at 37°C for 0.5 h, then transformed and plated. The correct LDB1-dCas9 plasmid was confirmed by Sanger sequencing; its sequence is given in SEQ ID NO.11.
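As a quick consistency check on the PCR template described above, the 5' end of the LDB1 cDNA (SEQ ID NO.37) can be translated and compared with the first residues of the protein given as SEQ ID NO.1 in the sequence listing. The sketch below is only illustrative: it uses the standard genetic code and just the first 48 nucleotides, and the function names `translate` and `to_three_letter` are hypothetical helpers, not part of the patent.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 0; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def to_three_letter(protein: str) -> str:
    """Convert one-letter residues to space-separated three-letter codes."""
    return " ".join(THREE_LETTER[aa] for aa in protein)


# First 48 nt of SEQ ID NO.37 as printed in the text.
LDB1_5PRIME = "atgctggatagggatgtgggtccaactcccatgtatccgcctacatac"
# First 16 residues of SEQ ID NO.1 as printed in the sequence listing.
SEQ1_START = "Met Leu Asp Arg Asp Val Gly Pro Thr Pro Met Tyr Pro Pro Thr Tyr"

if __name__ == "__main__":
    print(to_three_letter(translate(LDB1_5PRIME)) == SEQ1_START)
```

The same helper could be applied to the full cDNA to compare it against all 375 residues of SEQ ID NO.1.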
Construction of the dCas9-LDB1 plasmid
Using the LDB1 cDNA as the PCR template, a forward primer carrying a BssHII restriction site, gggcgcgctggaggaggatccggaggaggatccggaggaggatccatgctggatagggatgtgggtccaactcccatgtatccg (SEQ ID NO.31), and a reverse primer carrying an ApaI restriction site, gaagggcccctgggaggcctgtgacgt (SEQ ID NO.32), were designed and dissolved in water to 10 μM. The LDB1 cDNA fragment was amplified using the Vazyme high-fidelity polymerase kit (Vazyme, p501-d2). The amplification system and PCR reaction conditions are as follows:
The PCR product was purified and recovered with the AxyPrep PCR Clean-up Kit (Axygen, AP-PCR-500G), and 1 μg was taken. Separately, 1 μg of the pST1374-N-NLS-flag-linker-dCas9 vector was taken. The PCR fragment and the vector were each digested with ApaI (NEB, R0114S) and BssHII (NEB, R0119S) at 25°C for 2 h. The digestion system is as follows:
The digested products were recovered by gel extraction with the AxyPrep DNA Gel Extraction Kit (Axygen, AP-GX-250G). The digested PCR fragment and vector fragment were ligated with T4 ligase (NEB, M0202S). The ligation system is as follows:
The ligation reaction was incubated at 16°C for 2 h, then transformed and plated. The correct dCas9-LDB1 plasmid was confirmed by Sanger sequencing; its sequence is given in SEQ ID NO.12.
Construction of the DD-dCas9 plasmid
Using the LDB1 cDNA as the PCR template, a forward primer carrying a NotI restriction site, gtggcggccgctggcggcagcatgctggatagggatgtgggtccaactcccatgtatccg (SEQ ID NO.33), and a reverse primer carrying a KpnI restriction site, cgctggtaccgggggtctcgctgccgctggacagcccacaccgagtgatgtttttgg (SEQ ID NO.34), were designed and dissolved in water to 10 μM. The dimerization domain (DD) fragment of LDB1 was amplified using the Vazyme high-fidelity polymerase kit (Vazyme, p501-d2). The amplification system and PCR reaction conditions are as follows:
The PCR product was purified and recovered with the AxyPrep PCR Clean-up Kit (Axygen, AP-PCR-500G), and 1 μg was taken. Separately, 1 μg of the pST1374-N-NLS-flag-linker-dCas9 vector was taken. Both were digested with NotI-HF (NEB, R3189S) and KpnI-HF (NEB, R3142S) at 37°C for 2 h. The digestion system is as follows:
The digested products were recovered by gel extraction with the AxyPrep DNA Gel Extraction Kit (Axygen, AP-GX-250G). The digested PCR fragment and vector fragment were ligated with T4 ligase (NEB, M0202S). The ligation system is as follows:
The ligation reaction was incubated at 16°C for 2 h, then transformed and plated. The correct DD-dCas9 plasmid was confirmed by Sanger sequencing; its sequence is given in SEQ ID NO.13.
Construction of the dCas9-DD plasmid
Using the LDB1 cDNA as the PCR template, a forward primer carrying a BssHII restriction site, gggcgcgctggaggaggatccggaggaggatccggaggaggatccatgctggatagggatgtgggtccaactcccatgtatccg (SEQ ID NO.35), and a reverse primer carrying an ApaI restriction site, tcgaagggcccggacagcccacaccgagtgatgtt (SEQ ID NO.36), were designed and dissolved in water to 10 μM. The DD fragment of LDB1 was amplified using the Vazyme high-fidelity polymerase kit (Vazyme, p501-d2). The amplification system and PCR reaction conditions are as follows:
The PCR product was purified and recovered with the AxyPrep PCR Clean-up Kit (Axygen, AP-PCR-500G), and 1 μg was taken. Separately, 1 μg of the pST1374-N-NLS-flag-linker-dCas9 vector was taken. The PCR fragment and the vector were each digested with ApaI (NEB, R0114S) and BssHII (NEB, R0119S) at 25°C for 2 h. The digestion system is as follows:
The digested products were recovered by gel extraction with the AxyPrep DNA Gel Extraction Kit (Axygen, AP-GX-250G). The digested PCR fragment and vector fragment were ligated with T4 ligase (NEB, M0202S). The ligation system is as follows:
The ligation reaction was incubated at 16°C for 2 h, then transformed and plated. The correct dCas9-DD plasmid was confirmed by Sanger sequencing; its sequence is given in SEQ ID NO.14.
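Each cloning primer in the four constructions above is stated to carry a specific restriction site. A minimal sketch of how that claim can be checked programmatically is shown below. The recognition sequences are the standard ones for NotI, KpnI, BssHII, and ApaI; the primer sequences are copied from the text; the dictionary layout and the function names `has_site` and `check_all` are illustrative assumptions.

```python
# Standard recognition sequences (all four are palindromic, so checking
# one strand of the primer is sufficient).
SITES = {
    "NotI": "GCGGCCGC",
    "KpnI": "GGTACC",
    "BssHII": "GCGCGC",
    "ApaI": "GGGCCC",
}

# Primer label -> (enzyme stated in the text, primer sequence from the text).
PRIMERS = {
    "SEQ ID NO.29": ("NotI", "gggacctaagaaaaagaggaaggtggcggccgctggcggcagcatgctggatagggatgtgggtccaactcccatgtatccg"),
    "SEQ ID NO.30": ("KpnI", "ctctcgggggtggcgctctcgctggtaccgggggtctcgctgccgctctgggaggcctgtgacgt"),
    "SEQ ID NO.31": ("BssHII", "gggcgcgctggaggaggatccggaggaggatccggaggaggatccatgctggatagggatgtgggtccaactcccatgtatccg"),
    "SEQ ID NO.32": ("ApaI", "gaagggcccctgggaggcctgtgacgt"),
    "SEQ ID NO.33": ("NotI", "gtggcggccgctggcggcagcatgctggatagggatgtgggtccaactcccatgtatccg"),
    "SEQ ID NO.34": ("KpnI", "cgctggtaccgggggtctcgctgccgctggacagcccacaccgagtgatgtttttgg"),
    "SEQ ID NO.35": ("BssHII", "gggcgcgctggaggaggatccggaggaggatccggaggaggatccatgctggatagggatgtgggtccaactcccatgtatccg"),
    "SEQ ID NO.36": ("ApaI", "tcgaagggcccggacagcccacaccgagtgatgtt"),
}


def has_site(primer: str, enzyme: str) -> bool:
    """Return True if the primer contains the enzyme's recognition sequence."""
    return SITES[enzyme] in primer.upper()


def check_all() -> dict:
    """Map each primer label to whether its stated site is present."""
    return {name: has_site(seq, enzyme) for name, (enzyme, seq) in PRIMERS.items()}


if __name__ == "__main__":
    for name, ok in check_all().items():
        print(name, PRIMERS[name][0], "site present" if ok else "site MISSING")
```

This only confirms that the site is present; it does not check reading frame or annealing to the template.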
Construction of the targeting-site sgRNA plasmids
Three targeting sgRNAs were designed against DHS2 in the LCR region of the β-globin gene cluster in K562 cells: L-sg1, with the sequence aatatgtcacattctgtctc (SEQ ID NO.7); L-sg3, with the sequence ggactatgggaggtcactaa (SEQ ID NO.8); and L-sg4, with the sequence gaaggttacacagaaccaga (SEQ ID NO.9). Three sgRNAs were designed against the promoter region of the HBB gene: P-sg1, with the sequence ggccaagagatatatcttag (SEQ ID NO.4); P-sg3, with the sequence gtgccagaagagccaaggac (SEQ ID NO.5); and P-sg4, with the sequence gtggagccacaccctagggt (SEQ ID NO.6). The negative control sgRNA targets EGFP and is named sg-egfp, with the sequence ggagcgcaccatcttcttca (SEQ ID NO.10). Complementary positive-strand and negative-strand primers were designed from each sgRNA sequence, with ACCG added to the 5' end of the positive strand and AAAC added to the 5' end of the negative strand, and dissolved in sterile water to 100 μM. Annealing produced double-stranded DNA fragments with overhangs, which were ligated into the pGL3-U6-sgRNA (Addgene #51133) vector linearized with BsaI (NEB, R0535S) to construct the target-specific sgRNA plasmids. The primer sequences for all targeting-site sgRNAs are given in SEQ ID NO.15 to SEQ ID NO.28, as follows:
L-sg1 positive-strand primer: ACCG AATATGTCACATTCTGTCTC (SEQ ID NO.15)
L-sg1 negative-strand primer: AAAC GAGACAGAATGTGACATATT (SEQ ID NO.16)
L-sg3 positive-strand primer: ACCG GGACTATGGGAGGTCACTAA (SEQ ID NO.17)
L-sg3 negative-strand primer: AAAC TTAGTGACCTCCCATAGTCC (SEQ ID NO.18)
L-sg4 positive-strand primer: ACCG GAAGGTTACACAGAACCAGA (SEQ ID NO.19)
L-sg4 negative-strand primer: AAAC TCTGGTTCTGTGTAACCTTC (SEQ ID NO.20)
P-sg1 positive-strand primer: ACCG GGCCAAGAGATATATCTTAG (SEQ ID NO.21)
P-sg1 negative-strand primer: AAAC CTAAGATATATCTCTTGGCC (SEQ ID NO.22)
P-sg3 positive-strand primer: ACCG GTGCCAGAAGAGCCAAGGAC (SEQ ID NO.23)
P-sg3 negative-strand primer: AAAC GTCCTTGGCTCTTCTGGCAC (SEQ ID NO.24)
P-sg4 positive-strand primer: ACCG GTGGAGCCACACCCTAGGGT (SEQ ID NO.25)
P-sg4 negative-strand primer: AAAC ACCCTAGGGTGTGGCTCCAC (SEQ ID NO.26)
sg-egfp positive-strand primer: ACCG GGAGCGCACCATCTTCTTCA (SEQ ID NO.27)
sg-egfp negative-strand primer: AAAC TGAAGAAGATGGTGCGCTCC (SEQ ID NO.28)
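The primer design rule described above (ACCG prepended to the sgRNA sequence for the positive strand, AAAC prepended to its reverse complement for the negative strand) can be sketched as a small helper. The function names `reverse_complement` and `design_oligos` are illustrative, not taken from the patent.

```python
# Translation table for complementing uppercase DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, in uppercase."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_oligos(spacer: str) -> tuple:
    """Return (positive_strand, negative_strand) oligos for one sgRNA spacer.

    Positive strand: ACCG + spacer.
    Negative strand: AAAC + reverse complement of the spacer.
    """
    spacer = spacer.upper()
    return "ACCG" + spacer, "AAAC" + reverse_complement(spacer)


if __name__ == "__main__":
    # L-sg1 spacer (SEQ ID NO.7) as given in the text.
    plus, minus = design_oligos("aatatgtcacattctgtctc")
    print(plus)
    print(minus)
```

Applied to the L-sg1 spacer, this reproduces the listed primers SEQ ID NO.15 and SEQ ID NO.16 (printed here without the space after the 4-nt prefix).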
The annealing system and annealing program are as follows:
The pGL3-U6-sgRNA (Addgene #51133) plasmid was digested with BsaI (NEB, R0535S) to obtain the linearized sgRNA vector. The digestion system is as follows:
The digested product was recovered by gel extraction with the AxyPrep DNA Gel Extraction Kit (Axygen, AP-GX-250G) to obtain the linearized vector. 50 ng of linearized vector was ligated with 3 μl of annealed product using T4 ligase (NEB, M0202S), incubated at 16°C for 2 h, then transformed and plated. The correct target-specific sgRNA plasmids were confirmed by Sanger sequencing.
The ligation system is as follows:
A schematic diagram of LDB1-dCas9 reprogramming the spatial position of the target gene through artificial DNA looping is shown in Figure 1.
Example 2
LDB1-dCas9 and dCas9-LDB1 activate HBB expression by reprogramming the spatial position of the gene through DNA looping:
K562 cells were transfected by electroporation with the LDB1-dCas9 and dCas9-LDB1 systems described above, as follows:
1) K562 cells (from ATCC) were thawed and cultured in 10 cm dishes (Corning, 430167) in RPMI 1640 medium (Gibco, 11875093) supplemented with 10% fetal bovine serum (HyClone, SV30087), at 37°C and 5% CO2.
2) When the cell density reached 1×10^6 cells/ml, cells were collected at 1×10^6 cells per tube by centrifugation at 1000 r/min. Using the Lonza Amaxa Cell Line Nucleofector Kit V (Lonza, VCA-1003), each well was transfected with 1 μg of LDB1-dCas9 or dCas9-LDB1 plasmid, 0.5 μg of sgRNA plasmid targeting DHS2 of the LCR region, and 0.5 μg of sgRNA targeting the HBB gene promoter region; the electroporation program was T-016 (Lonza 2b). The three LCR-targeting sgRNAs and the three HBB promoter-targeting sgRNAs gave nine combinations in total. In the negative control group, the electroporated sgRNA was the egfp-targeting sgRNA.
3) After electroporation, the cells were gently rinsed out with 500 μl of medium and transferred to a 12-well plate, with 1.5 ml of RPMI 1640 medium added to each well.
4) Twenty-four hours after transfection, the cells were selected with Puromycin (InvivoGen, ant-pr-1) at a final concentration of 2 ng/ml and Blasticidin (InvivoGen, ant-bl-1) at a final concentration of 10 ng/ml.
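The transfection design in step 2 (two fusion plasmids, each paired with all nine LCR and promoter sgRNA combinations plus an egfp control) can be enumerated as follows. This is a bookkeeping sketch only: the dictionary layout and the function name `build_conditions` are hypothetical, running one negative control per fusion plasmid is an assumption, and the control sgRNA amount is left unset because the text does not state it.

```python
from itertools import product

LCR_SGRNAS = ["L-sg1", "L-sg3", "L-sg4"]        # target DHS2 of the LCR
PROMOTER_SGRNAS = ["P-sg1", "P-sg3", "P-sg4"]   # target the HBB promoter
FUSION_PLASMIDS = ["LDB1-dCas9", "dCas9-LDB1"]


def build_conditions() -> list:
    """Return one dict per electroporation condition (amounts in micrograms)."""
    conditions = []
    for fusion in FUSION_PLASMIDS:
        # 3 x 3 = 9 pairings of one LCR sgRNA with one promoter sgRNA.
        for lcr, promoter in product(LCR_SGRNAS, PROMOTER_SGRNAS):
            conditions.append({
                "fusion": fusion,
                "fusion_ug": 1.0,
                "sgRNA_ug": {lcr: 0.5, promoter: 0.5},
            })
        # Negative control with the egfp-targeting sgRNA. One control per
        # fusion plasmid is an assumption; the amount is not given in the text.
        conditions.append({
            "fusion": fusion,
            "fusion_ug": 1.0,
            "sgRNA_ug": {"sg-egfp": None},
        })
    return conditions


if __name__ == "__main__":
    for cond in build_conditions():
        print(cond["fusion"], sorted(cond["sgRNA_ug"]))
```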
Seventy-two hours after transfection, the cells were harvested and lysed with 500 μl of Trizol (Invitrogen, 15596018), and RNA was extracted with the TransZol Up Plus RNA Kit (ER501-01). 500 ng of RNA from each sample was reverse-transcribed into cDNA with the TOYOBO reverse transcription kit (Toyobo, FSQ-301), and the cDNA was diluted 10-fold before use. qPCR was performed with Biotool SYBR Green qPCR Master Mix (Biotool, B21703); the reaction system is as follows:
The regulation of HBB gene expression by LDB1-dCas9 and dCas9-LDB1 under each sgRNA combination was measured, as shown in Figure 2. Compared with the negative control group, both LDB1-dCas9 and dCas9-LDB1 specifically upregulated HBB expression by reprogramming the spatial position of the HBB gene through DNA looping. LDB1-dCas9 activated HBB more strongly than dCas9-LDB1, and HBB upregulation was highest with the L-sg3 and P-sg3 combination: 12-fold for LDB1-dCas9 and nearly 8-fold for dCas9-LDB1.
To verify that the increase in HBB expression results specifically from bringing HBB spatially close to the LCR, we also measured the expression of the other globin genes in the cluster under the sgRNA combinations L-sg3 and P-sg1, L-sg3 and P-sg3, and L-sg1 and P-sg3, as shown in Figure 3. Compared with the blank control group, LDB1-dCas9 increased HBB expression 27-fold with the L-sg3 and P-sg1 combination and 25-fold with the L-sg3 and P-sg3 combination; dCas9-LDB1 increased HBB expression 15-fold with the L-sg1 and P-sg3 combination and 14-fold with the L-sg3 and P-sg3 combination. We also found that HBD expression increased 4- to 5-fold. We speculate that, because the HBD gene lies close to the HBB gene, DNA looping also brings the LCR spatially closer to HBD and thereby raises its expression.
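The text reports fold changes from qPCR but does not state how they were calculated. One common approach is the 2^-ΔΔCt method, sketched below purely as an illustration. The use of a reference gene, the function name `fold_change`, and the Ct values in the usage example are all assumptions; none of them come from the patent.

```python
def fold_change(ct_target_treated: float, ct_ref_treated: float,
                ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression by the 2^-ddCt method.

    dCt  = Ct(target) - Ct(reference), computed per sample.
    ddCt = dCt(treated) - dCt(control).
    """
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_treated - d_ct_control
    return 2.0 ** (-dd_ct)


if __name__ == "__main__":
    # Hypothetical Ct values, chosen only to illustrate the arithmetic:
    # the target crosses threshold 3 cycles earlier in the treated sample.
    print(fold_change(22.0, 18.0, 25.0, 18.0))
```

With these made-up values the target crosses threshold three cycles earlier after treatment while the reference gene is unchanged, which corresponds to an 8-fold increase.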
The full-length LDB1 protein activates HBB gene expression more strongly:
To compare the efficiency of full-length LDB1 and its DD domain in regulating the HBB gene, the spatial position of the HBB gene was reprogrammed with LDB1-dCas9, DD-dCas9, dCas9-LDB1, and dCas9-DD. In addition, to test whether targeting multiple sites has a synergistic effect on DNA looping efficiency, two targeting sites were selected in the LCR or in the HBB promoter region. 1 μg of each fusion protein plasmid vector was electroporated together with each of the following mixed sgRNA plasmid groups: P-sg3, P-sg1+3, L-sg3&P-sg1+3, and L-sg1+3&P-sg1+3. Each sgRNA mixture totaled 0.5 μg; when a mixture contained several plasmid vectors, each was used in equal amount. The electroporation program was T-016, as described above.
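Because each sgRNA mixture totals 0.5 μg and its plasmids are used in equal amounts, the per-plasmid input follows directly from the group composition. The sketch below makes that arithmetic explicit. Reading "P-sg1+3" as P-sg1 plus P-sg3 (and "L-sg1+3" likewise) is an interpretation of the group names, and `per_plasmid_ug` is a hypothetical helper.

```python
TOTAL_SGRNA_UG = 0.5  # total sgRNA plasmid per electroporation, from the text

# Group name -> member plasmids (interpretation of the names in the text).
SGRNA_GROUPS = {
    "P-sg3": ["P-sg3"],
    "P-sg1+3": ["P-sg1", "P-sg3"],
    "L-sg3&P-sg1+3": ["L-sg3", "P-sg1", "P-sg3"],
    "L-sg1+3&P-sg1+3": ["L-sg1", "L-sg3", "P-sg1", "P-sg3"],
}


def per_plasmid_ug(group: str) -> dict:
    """Split the total sgRNA amount equally among the plasmids in a group."""
    members = SGRNA_GROUPS[group]
    share = TOTAL_SGRNA_UG / len(members)
    return {name: share for name in members}


if __name__ == "__main__":
    for group in SGRNA_GROUPS:
        print(group, per_plasmid_ug(group))
```

Under this reading, the four-plasmid group receives 0.125 μg of each sgRNA plasmid.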
RNA extraction and qPCR were performed as described above. The changes in HBB expression are shown in Figure 4. With every sgRNA group, full-length LDB1 activated the HBB gene more efficiently than the DD domain. Notably, when the number of targeting sites in the promoter region was increased, the activation of HBB by LDB1-dCas9 and dCas9-LDB1 showed a synergistic effect. Increasing the number of sgRNAs targeting the promoter region of the target gene can therefore significantly improve DNA looping efficiency and, in turn, gene expression.
In summary, the present invention effectively overcomes various shortcomings of the prior art and has high industrial value.
The above examples merely illustrate the principles and effects of the present invention and are not intended to limit it. Anyone familiar with the art may modify or alter the above examples without departing from the spirit and scope of the present invention. Accordingly, all equivalent modifications or alterations made by a person of ordinary skill in the art without departing from the spirit and technical ideas disclosed herein shall still be covered by the claims of the present invention.
Sequence Listing
<110> ShanghaiTech University
<120> A DNA cyclization molecule and its use
<160> 43
<170> SIPOSequenceListing 1.0
<210> 1
<211> 375
<212> PRT
<213> Artificial Sequence
<400> 1
Met Leu Asp Arg Asp Val Gly Pro Thr Pro Met Tyr Pro Pro Thr Tyr
1 5 10 15
Leu Glu Pro Gly Ile Gly Arg His Thr Pro Tyr Gly Asn Gln Thr Asp
20 25 30
Tyr Arg Ile Phe Glu Leu Asn Lys Arg Leu Gln Asn Trp Thr Glu Glu
35 40 45
Cys Asp Asn Leu Trp Trp Asp Ala Phe Thr Thr Glu Phe Phe Glu Asp
50 55 60
Asp Ala Met Leu Thr Ile Thr Phe Cys Leu Glu Asp Gly Pro Lys Arg
65 70 75 80
Tyr Thr Ile Gly Arg Thr Leu Ile Pro Arg Tyr Phe Arg Ser Ile Phe
85 90 95
Glu Gly Gly Ala Thr Glu Leu Tyr Tyr Val Leu Lys His Pro Lys Glu
100 105 110
Ala Phe His Ser Asn Phe Val Ser Leu Asp Cys Asp Gln Gly Ser Met
115 120 125
Val Thr Gln His Gly Lys Pro Met Phe Thr Gln Val Cys Val Glu Gly
130 135 140
Arg Leu Tyr Leu Glu Phe Met Phe Asp Asp Met Met Arg Ile Lys Thr
145 150 155 160
Trp His Phe Ser Ile Arg Gln His Arg Glu Leu Ile Pro Arg Ser Ile
165 170 175
Leu Ala Met His Ala Gln Asp Pro Gln Met Leu Asp Gln Leu Ser Lys
180 185 190
Asn Ile Thr Arg Cys Gly Leu Ser Asn Ser Thr Leu Asn Tyr Leu Arg
195 200 205
Leu Cys Val Ile Leu Glu Pro Met Gln Glu Leu Met Ser Arg His Lys
210 215 220
Thr Tyr Ser Leu Ser Pro Arg Asp Cys Leu Lys Thr Cys Leu Phe Gln
225 230 235 240
Lys Trp Gln Arg Met Val Ala Pro Pro Ala Glu Pro Thr Arg Gln Gln
245 250 255
Pro Ser Lys Arg Arg Lys Arg Lys Met Ser Gly Gly Ser Thr Met Ser
260 265 270
Ser Gly Gly Gly Asn Thr Asn Asn Ser Asn Ser Lys Lys Lys Ser Pro
275 280 285
Ala Ser Thr Phe Ala Leu Ser Ser Gln Val Pro Asp Val Met Val Val
290 295 300
Gly Glu Pro Thr Leu Met Gly Gly Glu Phe Gly Asp Glu Asp Glu Arg
305 310 315 320
Leu Ile Thr Arg Leu Glu Asn Thr Gln Phe Asp Ala Ala Asn Gly Ile
325 330 335
Asp Asp Glu Asp Ser Phe Asn Asn Ser Pro Ala Leu Gly Ala Asn Ser
340 345 350
Pro Trp Asn Ser Lys Pro Pro Ser Ser Gln Glu Ser Lys Ser Glu Asn
355 360 365
Pro Thr Ser Gln Ala Ser Gln
370 375
<210> 2<210> 2
<211> 200<211> 200
<212> PRT<212> PRT
<213> 人工序列(Artificial Sequence)<213> Artificial Sequence
<400> 2<400> 2
Met Leu Asp Arg Asp Val Gly Pro Thr Pro Met Tyr Pro Pro Thr Tyr
1 5 10 15
Leu Glu Pro Gly Ile Gly Arg His Thr Pro Tyr Gly Asn Gln Thr Asp
20 25 30
Tyr Arg Ile Phe Glu Leu Asn Lys Arg Leu Gln Asn Trp Thr Glu Glu
35 40 45
Cys Asp Asn Leu Trp Trp Asp Ala Phe Thr Thr Glu Phe Phe Glu Asp
50 55 60
Asp Ala Met Leu Thr Ile Thr Phe Cys Leu Glu Asp Gly Pro Lys Arg
65 70 75 80
Tyr Thr Ile Gly Arg Thr Leu Ile Pro Arg Tyr Phe Arg Ser Ile Phe
85 90 95
Glu Gly Gly Ala Thr Glu Leu Tyr Tyr Val Leu Lys His Pro Lys Glu
100 105 110
Ala Phe His Ser Asn Phe Val Ser Leu Asp Cys Asp Gln Gly Ser Met
115 120 125
Val Thr Gln His Gly Lys Pro Met Phe Thr Gln Val Cys Val Glu Gly
130 135 140
Arg Leu Tyr Leu Glu Phe Met Phe Asp Asp Met Met Arg Ile Lys Thr
145 150 155 160
Trp His Phe Ser Ile Arg Gln His Arg Glu Leu Ile Pro Arg Ser Ile
165 170 175
Leu Ala Met His Ala Gln Asp Pro Gln Met Leu Asp Gln Leu Ser Lys
180 185 190
Asn Ile Thr Arg Cys Gly Leu Ser
195 200
<210> 3
<211> 1367
<212> PRT
<213> Artificial Sequence
<400> 3
Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly
1 5 10 15
Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys
20 25 30
Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly
35 40 45
Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys
50 55 60
Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr
65 70 75 80
Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe
85 90 95
Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His
100 105 110
Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His
115 120 125
Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser
130 135 140
Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met
145 150 155 160
Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp
165 170 175
Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn
180 185 190
Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys
195 200 205
Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu
210 215 220
Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu
225 230 235 240
Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp
245 250 255
Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp
260 265 270
Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu
275 280 285
Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile
290 295 300
Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met
305 310 315 320
Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala
325 330 335
Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp
340 345 350
Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln
355 360 365
Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly
370 375 380
Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys
385 390 395 400
Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu Gly
405 410 415
Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu
420 425 430
Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro
435 440 445
Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met
450 455 460
Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val
465 470 475 480
Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn
485 490 495
Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu
500 505 510
Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr
515 520 525
Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys
530 535 540
Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val
545 550 555 560
Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser
565 570 575
Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr
580 585 590
Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn
595 600 605
Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu
610 615 620
Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His
625 630 635 640
Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr
645 650 655
Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys
660 665 670
Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala
675 680 685
Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys
690 695 700
Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His
705 710 715 720
Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile
725 730 735
Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg
740 745 750
His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr
755 760 765
Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu
770 775 780
Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro Val
785 790 795 800
Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln
805 810 815
Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu
820 825 830
Ser Asp Tyr Asp Val Asp Ala Ile Val Pro Gln Ser Phe Leu Lys Asp
835 840 845
Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Ala Arg Gly
850 855 860
Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn
865 870 875 880
Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe
885 890 895
Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys
900 905 910
Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys
915 920 925
His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu
930 935 940
Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys
945 950 955 960
Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu
965 970 975
Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val
980 985 990
Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val
995 1000 1005
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys Ser
1010 1015 1020
Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr Ser Asn
1025 1030 1035 1040
Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn Gly Glu Ile
1045 1050 1055
Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr Gly Glu Ile Val
1060 1065 1070
Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg Lys Val Leu Ser Met
1075 1080 1085
Pro Gln Val Asn Ile Val Lys Lys Thr Glu Val Gln Thr Gly Gly Phe
1090 1095 1100
Ser Lys Glu Ser Ile Leu Pro Lys Arg Asn Ser Asp Lys Leu Ile Ala
1105 1110 1115 1120
Arg Lys Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro
1125 1130 1135
Thr Val Ala Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys Gly Lys
1140 1145 1150
Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met
1155 1160 1165
Glu Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys
1170 1175 1180
Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr
1185 1190 1195 1200
Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala
1205 1210 1215
Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro
1235 1240 1245
Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His Tyr
1250 1255 1260
Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg Val Ile
1265 1270 1275 1280
Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr Asn Lys His
1285 1290 1295
Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile Ile His Leu Phe
1300 1305 1310
Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe Lys Tyr Phe Asp Thr
1315 1320 1325
Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr Lys Glu Val Leu Asp Ala
1330 1335 1340
Thr Leu Ile His Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp
1345 1350 1355 1360
Leu Ser Gln Leu Gly Gly Asp
1365
<210> 4
<211> 20
<212> DNA
<213> Artificial Sequence
<400> 4
ggccaagaga tatatcttag 20
<210> 5
<211> 20
<212> DNA
<213> Artificial Sequence
<400> 5
gtgccagaag agccaaggac 20
<210> 6
<211> 20
<212> DNA
<213> Artificial Sequence
<400> 6
gtggagccac accctagggt 20
<210> 7
<211> 20
<212> DNA
<213> Artificial Sequence
<400> 7
aatatgtcac attctgtctc 20
<210> 8
<211> 20
<212> DNA
<213> Artificial Sequence
<400> 8
ggactatggg aggtcactaa 20
<210> 9
<211> 20
<212> DNA
<213> Artificial Sequence
<400> 9
gaaggttaca cagaaccaga 20
<210> 10
<211> 20
<212> DNA
<213> Artificial Sequence
<400> 10
ggagcgcacc atcttcttca 20
<210> 11
<211> 10428
<212> DNA
<213> Artificial Sequence
<400> 11
gacggatcgg gagatctccc gatcccctat ggtcgactct cagtacaatc tgctctgatg 60
ccgcatagtt aagccagtat ctgctccctg cttgtgtgtt ggaggtcgct gagtagtgcg 120
cgagcaaaat ttaagctaca acaaggcaag gcttgaccga caattgcatg aagaatctgc 180
ttagggttag gcgttttgcg ctgcttcgcg atgtacgggc cagatatacg cgttgacatt 240
gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 300
tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 360
cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 420
attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt 480
atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 540
atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 600
tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 660
actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc 720
aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 780
gtaggcgtgt acggtgggag gtctatataa gcagagctct ctggctaact agagaaccca 840
ctgcttactg gcttatcgaa attaatacga ctcactatag ggagacccaa gctggctagc 900
accatgggac ctaagaaaaa gaggaaggtg gcggccgctg gcggcagcat gctggatagg 960
gatgtgggtc caactcccat gtatccgcct acatacctgg agccagggat tgggaggcac 1020
acaccatatg gcaaccaaac tgactacaga atatttgagc ttaacaaacg gcttcagaac 1080
tggacagagg agtgtgacaa tctctggtgg gatgcattca cgactgagtt ctttgaggat 1140
gatgccatgt tgaccatcac tttctgcctg gaggatggac caaagagata taccattggc 1200
cggaccctga tcccacgcta cttccgcagc atctttgagg ggggtgctac ggagctgtac 1260
tatgttctta agcaccccaa ggaggcattc cacagcaact ttgtgtccct cgactgtgac 1320
cagggcagca tggtgaccca gcatggcaag cccatgttca cccaggtgtg tgtggagggc 1380
cggttgtacc tggagttcat gtttgacgac atgatgcgga taaagacgtg gcacttcagc 1440
atccggcagc accgagagct catcccccgc agcatccttg ccatgcatgc ccaagacccc 1500
cagatgttgg atcagctctc caaaaacatc actcggtgtg ggctgtccaa ttccactctc 1560
aactacctcc gactctgtgt gatactcgag cccatgcaag agctcatgtc acgccacaag 1620
acctacagcc tcagcccccg cgactgcctc aagacctgcc ttttccagaa gtggcagcgc 1680
atggtagcac cccctgcgga gcccacacgt cagcagccca gcaaacggcg gaaacggaag 1740
atgtcagggg gcagcaccat gagctctggt ggtggcaaca ccaacaacag caacagcaag 1800
aagaagagcc cagctagcac cttcgccctc tccagccagg tacctgatgt gatggtggtg 1860
ggggagccca ccctgatggg cggggagttc ggggacgagg acgagaggct catcacccgg 1920
ctggagaaca cccagtttga cgcagccaac ggcattgacg acgaggacag ctttaacaac 1980
tcccctgcac tgggcgccaa cagcccctgg aacagcaagc ctccgtccag ccaagaaagc 2040
aaatcggaga accccacgtc acaggcctcc cagagcggca gcgagacccc cggtaccagc 2100
gagagcgcca cccccgagag cgacaagaaa tactctattg gactggctat cgggacaaac 2160
tccgttggct gggccgtcat aaccgacgag tataaggtgc caagcaagaa attcaaggtg 2220
ctgggtaata ctgaccgcca ttcaatcaag aagaacctga tcggagcact cctcttcgac 2280
tccggtgaaa ccgctgaagc tactcggctg aagcggaccg caaggcggag atacacccgc 2340
cgcaagaatc ggatatgtta tctgcaagag atctttagca acgaaatggc taaggtggac 2400
gactccttct ttcaccgcct ggaagagagc tttctggtgg aggaggataa gaaacacgag 2460
aggcacccta tattcggaaa tatcgtggat gaggtggctt accatgaaaa gtatcctaca 2520
atctaccatc tgaggaagaa gctggtggac agcaccgata aagcagacct gaggctcatc 2580
tatctggccc tggctcatat gataaagttt agaggacact ttctgatcga gggcgacctg 2640
aatcccgata attccgatgt ggataaactc ttcattcaac tggtgcagac atataaccaa 2700
ctgttcgagg agaatcccat aaacgcttct ggtgtggatg ccaaggctat tctgtccgct 2760
cggctgtcca agtcacgcag actggagaat ctgattgccc aactgccagg agaaaagaag 2820
aacggcctgt ttgggaacct catcgccctg agcctgggcc tgacacctaa cttcaagtcc 2880
aattttgatc tggccgaaga tgctaaactc cagctctcca aggacaccta tgacgatgat 2940
ctggacaacc tgctcgcaca gataggcgac cagtacgccg atctctttct ggctgctaag 3000
aatctctccg acgccattct gctgagcgac atactccggg tcaacactga gatcaccaaa 3060
gcacctctga gcgcctccat gataaaacgc tatgatgaac accatcaaga cctgactctg 3120
ctcaaagccc tcgtgaggca acagctgcca gagaagtaca aagagatatt cttcgaccag 3180
agcaagaatg gatatgccgg atacatcgat ggcggagcat cacaggaaga attttacaag 3240
ttcatcaaac caatcctcga gaagatggac ggtactgaag agctgctggt gaagctgaac 3300
agggaggacc tgctgaggaa gcagaggacc tttgataatg gctccattcc acatcagata 3360
cacctgggag agctgcatgc aatcctccgc aggcaggagg atttctatcc tttcctgaag 3420
gataaccggg agaagataga gaagatcctg accttcagga tcccttatta cgtcggccct 3480
ctggctagag gcaactcccg cttcgcttgg atgaccagga aatctgagga gacaattact 3540
ccttggaact tcgaagaggt cgtggataag ggcgcaagcg cccagtcatt catcgaacgg 3600
atgaccaatt tcgataagaa cctgccaaac gagaaggtcc tgcccaaaca ttcactcctg 3660
tacgagtatt tcaccgtcta taacgagctg actaaagtga agtacgtgac cgagggcatg 3720
aggaagcctg ccttcctgtc cggagagcag aagaaggcta tcgttgatct gctcttcaag 3780
actaatagaa aggtgacagt gaagcagctc aaggaggatt actttaagaa gatcgaatgc 3840
tttgactcag tggaaatctc tggcgtggag gaccgcttta atgccagcct gggcacttac 3900
catgatctgc tgaagataat caaagacaaa gatttcctcg ataatgagga gaacgaggac 3960
atcctggaag atatcgtgct gaccctgact ctgttcgagg atagagagat gatcgaagag 4020
cgcctgaaga cctatgccca tctgtttgac gataaagtca tgaaacagct caagcggcgg 4080
cgctacactg ggtggggtag actctccagg aaactcataa acggcatccg cgacaaacag 4140
agcggaaaga ccatcctgga tttcctgaaa tccgacggat tcgctaacag gaacttcatg 4200
caactgattc acgatgactc tctgacattt aaagaggaca tccagaaggc acaggtgagc 4260
ggtcaaggcg acagcctgca cgagcacatc gccaacctcg ctggatcacc cgccataaag 4320
aagggaatac tgcagacagt caaggtcgtg gacgaactcg tcaaagtgat gggtcggcac 4380
aagccagaga atatcgttat cgaaatggca agggagaacc aaaccaccca gaagggccag 4440
aagaactctc gggaacggat gaaaagaatc gaagagggaa ttaaggagct gggatctcag 4500
atactgaagg agcaccctgt ggagaataca cagctccaga acgagaaact ctacctgtac 4560
tacctccaga acgggcggga catgtacgtt gaccaggaac tcgacatcaa ccggctgtcc 4620
gattatgacg tggacgctat tgttccacag tccttcctca aagatgactc cattgacaac 4680
aaggtgctga ccagatccga taaggcccgc ggtaagtctg acaatgttcc atcagaagag 4740
gtggtcaaga agatgaagaa ttactggcgg cagctcctca acgccaaact gatcacccag 4800
cggaagtttg acaatctgac taaggcagaa agaggaggtc tgagcgaact cgacaaggcc 4860
ggctttatta agaggcaact ggtcgaaaca cgccagatta ccaaacacgt ggcacaaatc 4920
ctcgactcta ggatgaacac taagtacgat gagaacgata agctgatcag ggaagtgaaa 4980
gtgataactc tgaagagcaa gctggtgtct gacttccgga aggactttca attctacaaa 5040
gttcgcgaaa taaacaatta ccatcatgct cacgatgcct atctcaatgc tgtcgttggc 5100
accgccctga tcaagaaata ccctaaactg gagtctgagt tcgtgtacgg tgactataaa 5160
gtctacgatg tgaggaagat gatagcaaag tctgagcaag agattggcaa agccaccgcc 5220
aagtacttct tctactctaa tatcatgaat ttctttaaga ctgagataac cctggctaac 5280
ggcgaaatcc ggaagcgccc actgatcgaa acaaacggag aaacaggaga aatcgtgtgg 5340
gataaaggca gggacttcgc aactgtgcgg aaggtgctgt ccatgccaca agtcaatatc 5400
gtgaagaaga ccgaagtgca gaccggcgga ttctcaaagg agagcatcct gccaaagcgg 5460
aactctgaca agctgatcgc caggaagaaa gattgggacc caaagaagta tggcggtttc 5520
gattccccta cagtggctta ttccgttctg gtcgtggcaa aagtggagaa aggcaagtcc 5580
aagaaactca agtctgttaa ggagctgctc ggaattacta ttatggagag atccagcttc 5640
gagaagaatc caatcgattt cctggaagct aagggctata aagaagtgaa gaaagatctc 5700
atcatcaaac tgcccaagta ctctctcttt gagctggaga atggtaggaa gcggatgctg 5760
gcctccgccg gagagctgca gaaaggaaac gagctggctc tgccctccaa atacgtgaac 5820
ttcctgtatc tggcctccca ctacgagaaa ctcaaaggta gccctgaaga caatgagcag 5880
aagcaactct ttgttgagca acataaacac tacctggacg aaatcattga acagattagc 5940
gagttcagca agcgggttat tctggccgat gcaaacctcg ataaagtgct gagcgcatat 6000gagttcagca agcgggttat tctggccgat gcaaacctcg ataaagtgct gagcgcatat 6000
aataagcaca gggacaagcc aattcgcgaa caagcagaga atattatcca cctctttact 6060aataagcaca gggacaagcc aattcgcgaa caagcagaga atattatcca cctctttact 6060
ctgactaatc tgggcgctcc tgctgccttc aagtatttcg atacaactat tgacaggaag 6120ctgactaatc tgggcgctcc tgctgccttc aagtatttcg atacaactat tgacaggaag 6120
cggtacacct ctaccaaaga agttctcgat gccaccctga tacaccagtc aattaccgga 6180cggtacacct ctaccaaaga agttctcgat gccaccctga tacaccagtc aattaccgga 6180
ctgtacgaga ctcgcatcga cctgtctcag ctcggcggcg acggttctcc caagaagaag 6240ctgtacgaga ctcgcatcga cctgtctcag ctcggcggcg acggttctcc caagaagaag 6240
aggaaagtct cgagcggtgg agctgcagga taggaattcg ggcccttcga aggtaagcct 6300aggaaagtct cgagcggtgg agctgcagga taggaattcg ggcccttcga aggtaagcct 6300
atccctaacc ctctcctcgg tctcgattct acgcgtaccg gtcatcatca ccatcaccat 6360atccctaacc ctctcctcgg tctcgattct acgcgtaccg gtcatcatca ccatcaccat 6360
tgagtttaaa cccgctgatc agcctcgact gtgccttcta gttgccagcc atctgttgtt 6420tgagtttaaa cccgctgatc agcctcgact gtgccttcta gttgccagcc atctgttgtt 6420
tgcccctccc ccgtgccttc cttgaccctg gaaggtgcca ctcccactgt cctttcctaa 6480tgcccctccc ccgtgccttc cttgaccctg gaaggtgcca ctcccactgt cctttcctaa 6480
taaaatgagg aaattgcatc gcattgtctg agtaggtgtc attctattct ggggggtggg 6540taaaatgagg aaattgcatc gcattgtctg agtaggtgtc attctattct ggggggtggg 6540
gtggggcagg acagcaaggg ggaggattgg gaagacaata gcaggcatgc tggggatgcg 6600gtggggcagg acagcaaggg ggaggattgg gaagacaata gcaggcatgc tggggatgcg 6600
gtgggctcta tggcttctga ggcggaaaga accagctggg gctctagggg gtatccccac 6660gtgggctcta tggcttctga ggcggaaaga accagctggg gctctagggg gtatccccac 6660
gcgccctgta gcggcgcatt aagcgcggcg ggtgtggtgg ttacgcgcag cgtgaccgct 6720gcgccctgta gcggcgcatt aagcgcggcg ggtgtggtgg ttacgcgcag cgtgaccgct 6720
acacttgcca gcgccctagc gcccgctcct ttcgctttct tcccttcctt tctcgccacg 6780acacttgcca gcgccctagc gcccgctcct ttcgctttct tcccttcctt tctcgccacg 6780
ttcgccggct ttccccgtca agctctaaat cggggcatcc ctttagggtt ccgatttagt 6840ttcgccggct ttccccgtca agctctaaat cggggcatcc ctttagggtt ccgatttagt 6840
gctttacggc acctcgaccc caaaaaactt gattagggtg atggttcacg tagtgggcca 6900
tcgccctgat agacggtttt tcgccctttg acgttggagt ccacgttctt taatagtgga 6960
ctcttgttcc aaactggaac aacactcaac cctatctcgg tctattcttt tgatttataa 7020
gggattttgg ggatttcggc ctattggtta aaaaatgagc tgatttaaca aaaatttaac 7080
gcgaattaat tctgtggaat gtgtgtcagt tagggtgtgg aaagtcccca ggctccccag 7140
gcaggcagaa gtatgcaaag catgcatctc aattagtcag caaccaggtg tggaaagtcc 7200
ccaggctccc cagcaggcag aagtatgcaa agcatgcatc tcaattagtc agcaaccata 7260
gtcccgcccc taactccgcc catcccgccc ctaactccgc ccagttccgc ccattctccg 7320
ccccatggct gactaatttt ttttatttat gcagaggccg aggccgcctc tgcctctgag 7380
ctattccaga agtagtgagg aggctttttt ggaggcctag gcttttgcaa aaagctcccg 7440
ggagcttgta tatccatttt cggatctgat cagcacgtgt tgacaattaa tcatcggcat 7500
agtatatcgg catagtataa tacgacaagg tgaggaacta aaccatggcc aagcctttgt 7560
ctcaagaaga atccaccctc attgaaagag caacggctac aatcaacagc atccccatct 7620
ctgaagacta cagcgtcgcc agcgcagctc tctctagcga cggccgcatc ttcactggtg 7680
tcaatgtata tcattttact gggggacctt gtgcagaact cgtggtgctg ggcactgctg 7740
ctgctgcggc agctggcaac ctgacttgta tcgtcgcgat cggaaatgag aacaggggca 7800
tcttgagccc ctgcggacgg tgtcgacagg tgcttctcga tctgcatcct gggatcaaag 7860
cgatagtgaa ggacagtgat ggacagccga cggcagttgg gattcgtgaa ttgctgccct 7920
ctggttatgt gtgggagggc taagcacttc gtggccgagg agcaggactg acacgtgcta 7980
cgagatttcg attccaccgc cgccttctat gaaaggttgg gcttcggaat cgttttccgg 8040
gacgccggct ggatgatcct ccagcgcggg gatctcatgc tggagttctt cgcccacccc 8100
aacttgttta ttgcagctta taatggttac aaataaagca atagcatcac aaatttcaca 8160
aataaagcat ttttttcact gcattctagt tgtggtttgt ccaaactcat caatgtatct 8220
tatcatgtct gtataccgtc gacctctagc tagagcttgg cgtaatcatg gtcatagctg 8280
tttcctgtgt gaaattgtta tccgctcaca attccacaca acatacgagc cggaagcata 8340
aagtgtaaag cctggggtgc ctaatgagtg agctaactca cattaattgc gttgcgctca 8400
ctgcccgctt tccagtcggg aaacctgtcg tgccagctgc attaatgaat cggccaacgc 8460
gcggggagag gcggtttgcg tattgggcgc tcttccgctt cctcgctcac tgactcgctg 8520
cgctcggtcg ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta 8580
tccacagaat caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc 8640
aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag 8700
catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 8760
caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 8820
ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcaatg ctcacgctgt 8880
aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 8940
gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 9000
cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 9060
ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta 9120
tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga 9180
tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 9240
cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag 9300
tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc 9360
tagatccttt taaattaaaa atgaagtttt aaatcaatct aaagtatata tgagtaaact 9420
tggtctgaca gttaccaatg cttaatcagt gaggcaccta tctcagcgat ctgtctattt 9480
cgttcatcca tagttgcctg actccccgtc gtgtagataa ctacgatacg ggagggctta 9540
ccatctggcc ccagtgctgc aatgataccg cgagacccac gctcaccggc tccagattta 9600
tcagcaataa accagccagc cggaagggcc gagcgcagaa gtggtcctgc aactttatcc 9660
gcctccatcc agtctattaa ttgttgccgg gaagctagag taagtagttc gccagttaat 9720
agtttgcgca acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc gtcgtttggt 9780
atggcttcat tcagctccgg ttcccaacga tcaaggcgag ttacatgatc ccccatgttg 9840
tgcaaaaaag cggttagctc cttcggtcct ccgatcgttg tcagaagtaa gttggccgca 9900
gtgttatcac tcatggttat ggcagcactg cataattctc ttactgtcat gccatccgta 9960
agatgctttt ctgtgactgg tgagtactca accaagtcat tctgagaata gtgtatgcgg 10020
cgaccgagtt gctcttgccc ggcgtcaata cgggataata ccgcgccaca tagcagaact 10080
ttaaaagtgc tcatcattgg aaaacgttct tcggggcgaa aactctcaag gatcttaccg 10140
ctgttgagat ccagttcgat gtaacccact cgtgcaccca actgatcttc agcatctttt 10200
actttcacca gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc aaaaaaggga 10260
ataagggcga cacggaaatg ttgaatactc atactcttcc tttttcaata ttattgaagc 10320
atttatcagg gttattgtct catgagcgga tacatatttg aatgtattta gaaaaataaa 10380
caaatagggg ttccgcgcac atttccccga aaagtgccac ctgacgtc 10428
<210> 12
<211> 10458
<212> DNA
<213> Artificial Sequence
<400> 12
gacggatcgg gagatctccc gatcccctat ggtcgactct cagtacaatc tgctctgatg 60
ccgcatagtt aagccagtat ctgctccctg cttgtgtgtt ggaggtcgct gagtagtgcg 120
cgagcaaaat ttaagctaca acaaggcaag gcttgaccga caattgcatg aagaatctgc 180
ttagggttag gcgttttgcg ctgcttcgcg atgtacgggc cagatatacg cgttgacatt 240
gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 300
tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 360
cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 420
attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt 480
atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 540
atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 600
tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 660
actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc 720
aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 780
gtaggcgtgt acggtgggag gtctatataa gcagagctct ctggctaact agagaaccca 840
ctgcttactg gcttatcgaa attaatacga ctcactatag ggagacccaa gctggctagc 900
accatgggac ctaagaaaaa gaggaaggtg gcggccgctg actacaaaga ccatgacggt 960
gattataaag atcatgacat cgactacaag gatgacgatg acaagtctag agacaagaaa 1020
tactctattg gactggctat cgggacaaac tccgttggct gggccgtcat aaccgacgag 1080
tataaggtgc caagcaagaa attcaaggtg ctgggtaata ctgaccgcca ttcaatcaag 1140
aagaacctga tcggagcact cctcttcgac tccggtgaaa ccgctgaagc tactcggctg 1200
aagcggaccg caaggcggag atacacccgc cgcaagaatc ggatatgtta tctgcaagag 1260
atctttagca acgaaatggc taaggtggac gactccttct ttcaccgcct ggaagagagc 1320
tttctggtgg aggaggataa gaaacacgag aggcacccta tattcggaaa tatcgtggat 1380
gaggtggctt accatgaaaa gtatcctaca atctaccatc tgaggaagaa gctggtggac 1440
agcaccgata aagcagacct gaggctcatc tatctggccc tggctcatat gataaagttt 1500
agaggacact ttctgatcga gggcgacctg aatcccgata attccgatgt ggataaactc 1560
ttcattcaac tggtgcagac atataaccaa ctgttcgagg agaatcccat aaacgcttct 1620
ggtgtggatg ccaaggctat tctgtccgct cggctgtcca agtcacgcag actggagaat 1680
ctgattgccc aactgccagg agaaaagaag aacggcctgt ttgggaacct catcgccctg 1740
agcctgggcc tgacacctaa cttcaagtcc aattttgatc tggccgaaga tgctaaactc 1800
cagctctcca aggacaccta tgacgatgat ctggacaacc tgctcgcaca gataggcgac 1860
cagtacgccg atctctttct ggctgctaag aatctctccg acgccattct gctgagcgac 1920
atactccggg tcaacactga gatcaccaaa gcacctctga gcgcctccat gataaaacgc 1980
tatgatgaac accatcaaga cctgactctg ctcaaagccc tcgtgaggca acagctgcca 2040
gagaagtaca aagagatatt cttcgaccag agcaagaatg gatatgccgg atacatcgat 2100
ggcggagcat cacaggaaga attttacaag ttcatcaaac caatcctcga gaagatggac 2160
ggtactgaag agctgctggt gaagctgaac agggaggacc tgctgaggaa gcagaggacc 2220
tttgataatg gctccattcc acatcagata cacctgggag agctgcatgc aatcctccgc 2280
aggcaggagg atttctatcc tttcctgaag gataaccggg agaagataga gaagatcctg 2340
accttcagga tcccttatta cgtcggccct ctggctagag gcaactcccg cttcgcttgg 2400
atgaccagga aatctgagga gacaattact ccttggaact tcgaagaggt cgtggataag 2460
ggcgcaagcg cccagtcatt catcgaacgg atgaccaatt tcgataagaa cctgccaaac 2520
gagaaggtcc tgcccaaaca ttcactcctg tacgagtatt tcaccgtcta taacgagctg 2580
actaaagtga agtacgtgac cgagggcatg aggaagcctg ccttcctgtc cggagagcag 2640
aagaaggcta tcgttgatct gctcttcaag actaatagaa aggtgacagt gaagcagctc 2700
aaggaggatt actttaagaa gatcgaatgc tttgactcag tggaaatctc tggcgtggag 2760
gaccgcttta atgccagcct gggcacttac catgatctgc tgaagataat caaagacaaa 2820
gatttcctcg ataatgagga gaacgaggac atcctggaag atatcgtgct gaccctgact 2880
ctgttcgagg atagagagat gatcgaagag cgcctgaaga cctatgccca tctgtttgac 2940
gataaagtca tgaaacagct caagcggcgg cgctacactg ggtggggtag actctccagg 3000
aaactcataa acggcatccg cgacaaacag agcggaaaga ccatcctgga tttcctgaaa 3060
tccgacggat tcgctaacag gaacttcatg caactgattc acgatgactc tctgacattt 3120
aaagaggaca tccagaaggc acaggtgagc ggtcaaggcg acagcctgca cgagcacatc 3180
gccaacctcg ctggatcacc cgccataaag aagggaatac tgcagacagt caaggtcgtg 3240
gacgaactcg tcaaagtgat gggtcggcac aagccagaga atatcgttat cgaaatggca 3300
agggagaacc aaaccaccca gaagggccag aagaactctc gggaacggat gaaaagaatc 3360
gaagagggaa ttaaggagct gggatctcag atactgaagg agcaccctgt ggagaataca 3420
cagctccaga acgagaaact ctacctgtac tacctccaga acgggcggga catgtacgtt 3480
gaccaggaac tcgacatcaa ccggctgtcc gattatgacg tggacgctat tgttccacag 3540
tccttcctca aagatgactc cattgacaac aaggtgctga ccagatccga taaggcccgc 3600
ggtaagtctg acaatgttcc atcagaagag gtggtcaaga agatgaagaa ttactggcgg 3660
cagctcctca acgccaaact gatcacccag cggaagtttg acaatctgac taaggcagaa 3720
agaggaggtc tgagcgaact cgacaaggcc ggctttatta agaggcaact ggtcgaaaca 3780
cgccagatta ccaaacacgt ggcacaaatc ctcgactcta ggatgaacac taagtacgat 3840
gagaacgata agctgatcag ggaagtgaaa gtgataactc tgaagagcaa gctggtgtct 3900
gacttccgga aggactttca attctacaaa gttcgcgaaa taaacaatta ccatcatgct 3960
cacgatgcct atctcaatgc tgtcgttggc accgccctga tcaagaaata ccctaaactg 4020
gagtctgagt tcgtgtacgg tgactataaa gtctacgatg tgaggaagat gatagcaaag 4080
tctgagcaag agattggcaa agccaccgcc aagtacttct tctactctaa tatcatgaat 4140
ttctttaaga ctgagataac cctggctaac ggcgaaatcc ggaagcgccc actgatcgaa 4200
acaaacggag aaacaggaga aatcgtgtgg gataaaggca gggacttcgc aactgtgcgg 4260
aaggtgctgt ccatgccaca agtcaatatc gtgaagaaga ccgaagtgca gaccggcgga 4320
ttctcaaagg agagcatcct gccaaagcgg aactctgaca agctgatcgc caggaagaaa 4380
gattgggacc caaagaagta tggcggtttc gattccccta cagtggctta ttccgttctg 4440
gtcgtggcaa aagtggagaa aggcaagtcc aagaaactca agtctgttaa ggagctgctc 4500
ggaattacta ttatggagag atccagcttc gagaagaatc caatcgattt cctggaagct 4560
aagggctata aagaagtgaa gaaagatctc atcatcaaac tgcccaagta ctctctcttt 4620
gagctggaga atggtaggaa gcggatgctg gcctccgccg gagagctgca gaaaggaaac 4680
gagctggctc tgccctccaa atacgtgaac ttcctgtatc tggcctccca ctacgagaaa 4740
ctcaaaggta gccctgaaga caatgagcag aagcaactct ttgttgagca acataaacac 4800
tacctggacg aaatcattga acagattagc gagttcagca agcgggttat tctggccgat 4860
gcaaacctcg ataaagtgct gagcgcatat aataagcaca gggacaagcc aattcgcgaa 4920
caagcagaga atattatcca cctctttact ctgactaatc tgggcgctcc tgctgccttc 4980
aagtatttcg atacaactat tgacaggaag cggtacacct ctaccaaaga agttctcgat 5040
gccaccctga tacaccagtc aattaccgga ctgtacgaga ctcgcatcga cctgtctcag 5100
ctcggcggcg acggttctcc caagaagaag aggaaagtcg ggcgcgctgg aggaggatcc 5160
ggaggaggat ccggaggagg atccatgctg gatagggatg tgggtccaac tcccatgtat 5220
ccgcctacat acctggagcc agggattggg aggcacacac catatggcaa ccaaactgac 5280
tacagaatat ttgagcttaa caaacggctt cagaactgga cagaggagtg tgacaatctc 5340
tggtgggatg cattcacgac tgagttcttt gaggatgatg ccatgttgac catcactttc 5400
tgcctggagg atggaccaaa gagatatacc attggccgga ccctgatccc acgctacttc 5460
cgcagcatct ttgagggggg tgctacggag ctgtactatg ttcttaagca ccccaaggag 5520
gcattccaca gcaactttgt gtccctcgac tgtgaccagg gcagcatggt gacccagcat 5580
ggcaagccca tgttcaccca ggtgtgtgtg gagggccggt tgtacctgga gttcatgttt 5640
gacgacatga tgcggataaa gacgtggcac ttcagcatcc ggcagcaccg agagctcatc 5700
ccccgcagca tccttgccat gcatgcccaa gacccccaga tgttggatca gctctccaaa 5760
aacatcactc ggtgtgggct gtccaattcc actctcaact acctccgact ctgtgtgata 5820
ctcgagccca tgcaagagct catgtcacgc cacaagacct acagcctcag cccccgcgac 5880
tgcctcaaga cctgcctttt ccagaagtgg cagcgcatgg tagcaccccc tgcggagccc 5940
acacgtcagc agcccagcaa acggcggaaa cggaagatgt cagggggcag caccatgagc 6000
tctggtggtg gcaacaccaa caacagcaac agcaagaaga agagcccagc tagcaccttc 6060
gccctctcca gccaggtacc tgatgtgatg gtggtggggg agcccaccct gatgggcggg 6120
gagttcgggg acgaggacga gaggctcatc acccggctgg agaacaccca gtttgacgca 6180
gccaacggca ttgacgacga ggacagcttt aacaactccc ctgcactggg cgccaacagc 6240
ccctggaaca gcaagcctcc gtccagccaa gaaagcaaat cggagaaccc cacgtcacag 6300
gcctcccagg ggcccttcga aggtaagcct atccctaacc ctctcctcgg tctcgattct 6360
acgcgtaccg gtcatcatca ccatcaccat tgagtttaaa cccgctgatc agcctcgact 6420
gtgccttcta gttgccagcc atctgttgtt tgcccctccc ccgtgccttc cttgaccctg 6480
gaaggtgcca ctcccactgt cctttcctaa taaaatgagg aaattgcatc gcattgtctg 6540
agtaggtgtc attctattct ggggggtggg gtggggcagg acagcaaggg ggaggattgg 6600
gaagacaata gcaggcatgc tggggatgcg gtgggctcta tggcttctga ggcggaaaga 6660
accagctggg gctctagggg gtatccccac gcgccctgta gcggcgcatt aagcgcggcg 6720
ggtgtggtgg ttacgcgcag cgtgaccgct acacttgcca gcgccctagc gcccgctcct 6780
ttcgctttct tcccttcctt tctcgccacg ttcgccggct ttccccgtca agctctaaat 6840
cggggcatcc ctttagggtt ccgatttagt gctttacggc acctcgaccc caaaaaactt 6900
gattagggtg atggttcacg tagtgggcca tcgccctgat agacggtttt tcgccctttg 6960
acgttggagt ccacgttctt taatagtgga ctcttgttcc aaactggaac aacactcaac 7020
cctatctcgg tctattcttt tgatttataa gggattttgg ggatttcggc ctattggtta 7080
aaaaatgagc tgatttaaca aaaatttaac gcgaattaat tctgtggaat gtgtgtcagt 7140
tagggtgtgg aaagtcccca ggctccccag gcaggcagaa gtatgcaaag catgcatctc 7200
aattagtcag caaccaggtg tggaaagtcc ccaggctccc cagcaggcag aagtatgcaa 7260
agcatgcatc tcaattagtc agcaaccata gtcccgcccc taactccgcc catcccgccc 7320
ctaactccgc ccagttccgc ccattctccg ccccatggct gactaatttt ttttatttat 7380
gcagaggccg aggccgcctc tgcctctgag ctattccaga agtagtgagg aggctttttt 7440
ggaggcctag gcttttgcaa aaagctcccg ggagcttgta tatccatttt cggatctgat 7500
cagcacgtgt tgacaattaa tcatcggcat agtatatcgg catagtataa tacgacaagg 7560
tgaggaacta aaccatggcc aagcctttgt ctcaagaaga atccaccctc attgaaagag 7620
caacggctac aatcaacagc atccccatct ctgaagacta cagcgtcgcc agcgcagctc 7680
tctctagcga cggccgcatc ttcactggtg tcaatgtata tcattttact gggggacctt 7740
gtgcagaact cgtggtgctg ggcactgctg ctgctgcggc agctggcaac ctgacttgta 7800
tcgtcgcgat cggaaatgag aacaggggca tcttgagccc ctgcggacgg tgtcgacagg 7860
tgcttctcga tctgcatcct gggatcaaag cgatagtgaa ggacagtgat ggacagccga 7920
cggcagttgg gattcgtgaa ttgctgccct ctggttatgt gtgggagggc taagcacttc 7980
gtggccgagg agcaggactg acacgtgcta cgagatttcg attccaccgc cgccttctat 8040
gaaaggttgg gcttcggaat cgttttccgg gacgccggct ggatgatcct ccagcgcggg 8100
gatctcatgc tggagttctt cgcccacccc aacttgttta ttgcagctta taatggttac 8160
aaataaagca atagcatcac aaatttcaca aataaagcat ttttttcact gcattctagt 8220
tgtggtttgt ccaaactcat caatgtatct tatcatgtct gtataccgtc gacctctagc 8280
tagagcttgg cgtaatcatg gtcatagctg tttcctgtgt gaaattgtta tccgctcaca 8340
attccacaca acatacgagc cggaagcata aagtgtaaag cctggggtgc ctaatgagtg 8400
agctaactca cattaattgc gttgcgctca ctgcccgctt tccagtcggg aaacctgtcg 8460
tgccagctgc attaatgaat cggccaacgc gcggggagag gcggtttgcg tattgggcgc 8520
tcttccgctt cctcgctcac tgactcgctg cgctcggtcg ttcggctgcg gcgagcggta 8580
tcagctcact caaaggcggt aatacggtta tccacagaat caggggataa cgcaggaaag 8640
aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta aaaaggccgc gttgctggcg 8700
tttttccata ggctccgccc ccctgacgag catcacaaaa atcgacgctc aagtcagagg 8760
tggcgaaacc cgacaggact ataaagatac caggcgtttc cccctggaag ctccctcgtg 8820
cgctctcctg ttccgaccct gccgcttacc ggatacctgt ccgcctttct cccttcggga 8880
agcgtggcgc tttctcaatg ctcacgctgt aggtatctca gttcggtgta ggtcgttcgc 8940
tccaagctgg gctgtgtgca cgaacccccc gttcagcccg accgctgcgc cttatccggt 9000
aactatcgtc ttgagtccaa cccggtaaga cacgacttat cgccactggc agcagccact 9060
ggtaacagga ttagcagagc gaggtatgta ggcggtgcta cagagttctt gaagtggtgg 9120
cctaactacg gctacactag aaggacagta tttggtatct gcgctctgct gaagccagtt 9180
accttcggaa aaagagttgg tagctcttga tccggcaaac aaaccaccgc tggtagcggt 9240
ggtttttttg tttgcaagca gcagattacg cgcagaaaaa aaggatctca agaagatcct 9300
ttgatctttt ctacggggtc tgacgctcag tggaacgaaa actcacgtta agggattttg 9360
gtcatgagat tatcaaaaag gatcttcacc tagatccttt taaattaaaa atgaagtttt 9420
aaatcaatct aaagtatata tgagtaaact tggtctgaca gttaccaatg cttaatcagt 9480
gaggcaccta tctcagcgat ctgtctattt cgttcatcca tagttgcctg actccccgtc 9540
gtgtagataa ctacgatacg ggagggctta ccatctggcc ccagtgctgc aatgataccg 9600
cgagacccac gctcaccggc tccagattta tcagcaataa accagccagc cggaagggcc 9660
gagcgcagaa gtggtcctgc aactttatcc gcctccatcc agtctattaa ttgttgccgg 9720
gaagctagag taagtagttc gccagttaat agtttgcgca acgttgttgc cattgctaca 9780
ggcatcgtgg tgtcacgctc gtcgtttggt atggcttcat tcagctccgg ttcccaacga 9840
tcaaggcgag ttacatgatc ccccatgttg tgcaaaaaag cggttagctc cttcggtcct 9900
ccgatcgttg tcagaagtaa gttggccgca gtgttatcac tcatggttat ggcagcactg 9960
cataattctc ttactgtcat gccatccgta agatgctttt ctgtgactgg tgagtactca 10020
accaagtcat tctgagaata gtgtatgcgg cgaccgagtt gctcttgccc ggcgtcaata 10080accaagtcat tctgagaata gtgtatgcgg cgaccgagtt gctcttgccc ggcgtcaata 10080
cgggataata ccgcgccaca tagcagaact ttaaaagtgc tcatcattgg aaaacgttct 10140cgggataata ccgcgccaca tagcagaact ttaaaagtgc tcatcattgg aaaacgttct 10140
tcggggcgaa aactctcaag gatcttaccg ctgttgagat ccagttcgat gtaacccact 10200tcggggcgaa aactctcaag gatcttaccg ctgttgagat ccagttcgat gtaacccact 10200
cgtgcaccca actgatcttc agcatctttt actttcacca gcgtttctgg gtgagcaaaa 10260cgtgcaccca actgatcttc agcatctttt actttcacca gcgtttctgg gtgagcaaaa 10260
acaggaaggc aaaatgccgc aaaaaaggga ataagggcga cacggaaatg ttgaatactc 10320acaggaaggc aaaatgccgc aaaaaaggga ataagggcga cacggaaatg ttgaatactc 10320
atactcttcc tttttcaata ttattgaagc atttatcagg gttattgtct catgagcgga 10380atactcttcc tttttcaata ttattgaagc atttatcagg gttattgtct catgagcgga 10380
tacatatttg aatgtattta gaaaaataaa caaatagggg ttccgcgcac atttccccga 10440tacatatttg aatgtattta gaaaaataaa caaatagggg ttccgcgcac atttccccga 10440
aaagtgccac ctgacgtc 10458aaagtgccac ctgacgtc 10458
<210> 13
<211> 9903
<212> DNA
<213> Artificial Sequence
<400> 13
gacggatcgg gagatctccc gatcccctat ggtcgactct cagtacaatc tgctctgatg 60
ccgcatagtt aagccagtat ctgctccctg cttgtgtgtt ggaggtcgct gagtagtgcg 120
cgagcaaaat ttaagctaca acaaggcaag gcttgaccga caattgcatg aagaatctgc 180
ttagggttag gcgttttgcg ctgcttcgcg atgtacgggc cagatatacg cgttgacatt 240
gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 300
tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 360
cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 420
attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt 480
atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 540
atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 600
tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 660
actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc 720
aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 780
gtaggcgtgt acggtgggag gtctatataa gcagagctct ctggctaact agagaaccca 840
ctgcttactg gcttatcgaa attaatacga ctcactatag ggagacccaa gctggctagc 900
accatgggac ctaagaaaaa gaggaaggtg gcggccgctg gcggcagcat gctggatagg 960
gatgtgggtc caactcccat gtatccgcct acatacctgg agccagggat tgggaggcac 1020
acaccatatg gcaaccaaac tgactacaga atatttgagc ttaacaaacg gcttcagaac 1080
tggacagagg agtgtgacaa tctctggtgg gatgcattca cgactgagtt ctttgaggat 1140
gatgccatgt tgaccatcac tttctgcctg gaggatggac caaagagata taccattggc 1200
cggaccctga tcccacgcta cttccgcagc atctttgagg ggggtgctac ggagctgtac 1260
tatgttctta agcaccccaa ggaggcattc cacagcaact ttgtgtccct cgactgtgac 1320
cagggcagca tggtgaccca gcatggcaag cccatgttca cccaggtgtg tgtggagggc 1380
cggttgtacc tggagttcat gtttgacgac atgatgcgga taaagacgtg gcacttcagc 1440
atccggcagc accgagagct catcccccgc agcatccttg ccatgcatgc ccaagacccc 1500
cagatgttgg atcagctctc caaaaacatc actcggtgtg ggctgtccag cggcagcgag 1560
acccccggta ccagcgagag cgccaccccc gagagcgaca agaaatactc tattggactg 1620
gctatcggga caaactccgt tggctgggcc gtcataaccg acgagtataa ggtgccaagc 1680
aagaaattca aggtgctggg taatactgac cgccattcaa tcaagaagaa cctgatcgga 1740
gcactcctct tcgactccgg tgaaaccgct gaagctactc ggctgaagcg gaccgcaagg 1800
cggagataca cccgccgcaa gaatcggata tgttatctgc aagagatctt tagcaacgaa 1860
atggctaagg tggacgactc cttctttcac cgcctggaag agagctttct ggtggaggag 1920
gataagaaac acgagaggca ccctatattc ggaaatatcg tggatgaggt ggcttaccat 1980
gaaaagtatc ctacaatcta ccatctgagg aagaagctgg tggacagcac cgataaagca 2040
gacctgaggc tcatctatct ggccctggct catatgataa agtttagagg acactttctg 2100
atcgagggcg acctgaatcc cgataattcc gatgtggata aactcttcat tcaactggtg 2160
cagacatata accaactgtt cgaggagaat cccataaacg cttctggtgt ggatgccaag 2220
gctattctgt ccgctcggct gtccaagtca cgcagactgg agaatctgat tgcccaactg 2280
ccaggagaaa agaagaacgg cctgtttggg aacctcatcg ccctgagcct gggcctgaca 2340
cctaacttca agtccaattt tgatctggcc gaagatgcta aactccagct ctccaaggac 2400
acctatgacg atgatctgga caacctgctc gcacagatag gcgaccagta cgccgatctc 2460
tttctggctg ctaagaatct ctccgacgcc attctgctga gcgacatact ccgggtcaac 2520
actgagatca ccaaagcacc tctgagcgcc tccatgataa aacgctatga tgaacaccat 2580
caagacctga ctctgctcaa agccctcgtg aggcaacagc tgccagagaa gtacaaagag 2640
atattcttcg accagagcaa gaatggatat gccggataca tcgatggcgg agcatcacag 2700
gaagaatttt acaagttcat caaaccaatc ctcgagaaga tggacggtac tgaagagctg 2760
ctggtgaagc tgaacaggga ggacctgctg aggaagcaga ggacctttga taatggctcc 2820
attccacatc agatacacct gggagagctg catgcaatcc tccgcaggca ggaggatttc 2880
tatcctttcc tgaaggataa ccgggagaag atagagaaga tcctgacctt caggatccct 2940
tattacgtcg gccctctggc tagaggcaac tcccgcttcg cttggatgac caggaaatct 3000
gaggagacaa ttactccttg gaacttcgaa gaggtcgtgg ataagggcgc aagcgcccag 3060
tcattcatcg aacggatgac caatttcgat aagaacctgc caaacgagaa ggtcctgccc 3120
aaacattcac tcctgtacga gtatttcacc gtctataacg agctgactaa agtgaagtac 3180
gtgaccgagg gcatgaggaa gcctgccttc ctgtccggag agcagaagaa ggctatcgtt 3240
gatctgctct tcaagactaa tagaaaggtg acagtgaagc agctcaagga ggattacttt 3300
aagaagatcg aatgctttga ctcagtggaa atctctggcg tggaggaccg ctttaatgcc 3360
agcctgggca cttaccatga tctgctgaag ataatcaaag acaaagattt cctcgataat 3420
gaggagaacg aggacatcct ggaagatatc gtgctgaccc tgactctgtt cgaggataga 3480
gagatgatcg aagagcgcct gaagacctat gcccatctgt ttgacgataa agtcatgaaa 3540
cagctcaagc ggcggcgcta cactgggtgg ggtagactct ccaggaaact cataaacggc 3600
atccgcgaca aacagagcgg aaagaccatc ctggatttcc tgaaatccga cggattcgct 3660
aacaggaact tcatgcaact gattcacgat gactctctga catttaaaga ggacatccag 3720
aaggcacagg tgagcggtca aggcgacagc ctgcacgagc acatcgccaa cctcgctgga 3780
tcacccgcca taaagaaggg aatactgcag acagtcaagg tcgtggacga actcgtcaaa 3840
gtgatgggtc ggcacaagcc agagaatatc gttatcgaaa tggcaaggga gaaccaaacc 3900
acccagaagg gccagaagaa ctctcgggaa cggatgaaaa gaatcgaaga gggaattaag 3960
gagctgggat ctcagatact gaaggagcac cctgtggaga atacacagct ccagaacgag 4020
aaactctacc tgtactacct ccagaacggg cgggacatgt acgttgacca ggaactcgac 4080
atcaaccggc tgtccgatta tgacgtggac gctattgttc cacagtcctt cctcaaagat 4140
gactccattg acaacaaggt gctgaccaga tccgataagg cccgcggtaa gtctgacaat 4200
gttccatcag aagaggtggt caagaagatg aagaattact ggcggcagct cctcaacgcc 4260
aaactgatca cccagcggaa gtttgacaat ctgactaagg cagaaagagg aggtctgagc 4320
gaactcgaca aggccggctt tattaagagg caactggtcg aaacacgcca gattaccaaa 4380
cacgtggcac aaatcctcga ctctaggatg aacactaagt acgatgagaa cgataagctg 4440
atcagggaag tgaaagtgat aactctgaag agcaagctgg tgtctgactt ccggaaggac 4500
tttcaattct acaaagttcg cgaaataaac aattaccatc atgctcacga tgcctatctc 4560
aatgctgtcg ttggcaccgc cctgatcaag aaatacccta aactggagtc tgagttcgtg 4620
tacggtgact ataaagtcta cgatgtgagg aagatgatag caaagtctga gcaagagatt 4680
ggcaaagcca ccgccaagta cttcttctac tctaatatca tgaatttctt taagactgag 4740
ataaccctgg ctaacggcga aatccggaag cgcccactga tcgaaacaaa cggagaaaca 4800
ggagaaatcg tgtgggataa aggcagggac ttcgcaactg tgcggaaggt gctgtccatg 4860
ccacaagtca atatcgtgaa gaagaccgaa gtgcagaccg gcggattctc aaaggagagc 4920
atcctgccaa agcggaactc tgacaagctg atcgccagga agaaagattg ggacccaaag 4980
aagtatggcg gtttcgattc ccctacagtg gcttattccg ttctggtcgt ggcaaaagtg 5040
gagaaaggca agtccaagaa actcaagtct gttaaggagc tgctcggaat tactattatg 5100
gagagatcca gcttcgagaa gaatccaatc gatttcctgg aagctaaggg ctataaagaa 5160
gtgaagaaag atctcatcat caaactgccc aagtactctc tctttgagct ggagaatggt 5220
aggaagcgga tgctggcctc cgccggagag ctgcagaaag gaaacgagct ggctctgccc 5280
tccaaatacg tgaacttcct gtatctggcc tcccactacg agaaactcaa aggtagccct 5340
gaagacaatg agcagaagca actctttgtt gagcaacata aacactacct ggacgaaatc 5400
attgaacaga ttagcgagtt cagcaagcgg gttattctgg ccgatgcaaa cctcgataaa 5460
gtgctgagcg catataataa gcacagggac aagccaattc gcgaacaagc agagaatatt 5520
atccacctct ttactctgac taatctgggc gctcctgctg ccttcaagta tttcgataca 5580
actattgaca ggaagcggta cacctctacc aaagaagttc tcgatgccac cctgatacac 5640
cagtcaatta ccggactgta cgagactcgc atcgacctgt ctcagctcgg cggcgacggt 5700
tctcccaaga agaagaggaa agtctcgagc ggtggagctg caggatagga attcgggccc 5760
ttcgaaggta agcctatccc taaccctctc ctcggtctcg attctacgcg taccggtcat 5820
catcaccatc accattgagt ttaaacccgc tgatcagcct cgactgtgcc ttctagttgc 5880
cagccatctg ttgtttgccc ctcccccgtg ccttccttga ccctggaagg tgccactccc 5940
actgtccttt cctaataaaa tgaggaaatt gcatcgcatt gtctgagtag gtgtcattct 6000
attctggggg gtggggtggg gcaggacagc aagggggagg attgggaaga caatagcagg 6060
catgctgggg atgcggtggg ctctatggct tctgaggcgg aaagaaccag ctggggctct 6120
agggggtatc cccacgcgcc ctgtagcggc gcattaagcg cggcgggtgt ggtggttacg 6180
cgcagcgtga ccgctacact tgccagcgcc ctagcgcccg ctcctttcgc tttcttccct 6240
tcctttctcg ccacgttcgc cggctttccc cgtcaagctc taaatcgggg catcccttta 6300
gggttccgat ttagtgcttt acggcacctc gaccccaaaa aacttgatta gggtgatggt 6360
tcacgtagtg ggccatcgcc ctgatagacg gtttttcgcc ctttgacgtt ggagtccacg 6420
ttctttaata gtggactctt gttccaaact ggaacaacac tcaaccctat ctcggtctat 6480
tcttttgatt tataagggat tttggggatt tcggcctatt ggttaaaaaa tgagctgatt 6540
taacaaaaat ttaacgcgaa ttaattctgt ggaatgtgtg tcagttaggg tgtggaaagt 6600
ccccaggctc cccaggcagg cagaagtatg caaagcatgc atctcaatta gtcagcaacc 6660
aggtgtggaa agtccccagg ctccccagca ggcagaagta tgcaaagcat gcatctcaat 6720
tagtcagcaa ccatagtccc gcccctaact ccgcccatcc cgcccctaac tccgcccagt 6780
tccgcccatt ctccgcccca tggctgacta atttttttta tttatgcaga ggccgaggcc 6840
gcctctgcct ctgagctatt ccagaagtag tgaggaggct tttttggagg cctaggcttt 6900
tgcaaaaagc tcccgggagc ttgtatatcc attttcggat ctgatcagca cgtgttgaca 6960
attaatcatc ggcatagtat atcggcatag tataatacga caaggtgagg aactaaacca 7020
tggccaagcc tttgtctcaa gaagaatcca ccctcattga aagagcaacg gctacaatca 7080
acagcatccc catctctgaa gactacagcg tcgccagcgc agctctctct agcgacggcc 7140
gcatcttcac tggtgtcaat gtatatcatt ttactggggg accttgtgca gaactcgtgg 7200
tgctgggcac tgctgctgct gcggcagctg gcaacctgac ttgtatcgtc gcgatcggaa 7260
atgagaacag gggcatcttg agcccctgcg gacggtgtcg acaggtgctt ctcgatctgc 7320
atcctgggat caaagcgata gtgaaggaca gtgatggaca gccgacggca gttgggattc 7380
gtgaattgct gccctctggt tatgtgtggg agggctaagc acttcgtggc cgaggagcag 7440
gactgacacg tgctacgaga tttcgattcc accgccgcct tctatgaaag gttgggcttc 7500
ggaatcgttt tccgggacgc cggctggatg atcctccagc gcggggatct catgctggag 7560
ttcttcgccc accccaactt gtttattgca gcttataatg gttacaaata aagcaatagc 7620
atcacaaatt tcacaaataa agcatttttt tcactgcatt ctagttgtgg tttgtccaaa 7680
ctcatcaatg tatcttatca tgtctgtata ccgtcgacct ctagctagag cttggcgtaa 7740
tcatggtcat agctgtttcc tgtgtgaaat tgttatccgc tcacaattcc acacaacata 7800
cgagccggaa gcataaagtg taaagcctgg ggtgcctaat gagtgagcta actcacatta 7860
attgcgttgc gctcactgcc cgctttccag tcgggaaacc tgtcgtgcca gctgcattaa 7920
tgaatcggcc aacgcgcggg gagaggcggt ttgcgtattg ggcgctcttc cgcttcctcg 7980
ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag cggtatcagc tcactcaaag 8040
gcggtaatac ggttatccac agaatcaggg gataacgcag gaaagaacat gtgagcaaaa 8100
ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc tggcgttttt ccataggctc 8160
cgcccccctg acgagcatca caaaaatcga cgctcaagtc agaggtggcg aaacccgaca 8220
ggactataaa gataccaggc gtttccccct ggaagctccc tcgtgcgctc tcctgttccg 8280
accctgccgc ttaccggata cctgtccgcc tttctccctt cgggaagcgt ggcgctttct 8340
caatgctcac gctgtaggta tctcagttcg gtgtaggtcg ttcgctccaa gctgggctgt 8400
gtgcacgaac cccccgttca gcccgaccgc tgcgccttat ccggtaacta tcgtcttgag 8460
tccaacccgg taagacacga cttatcgcca ctggcagcag ccactggtaa caggattagc 8520
agagcgaggt atgtaggcgg tgctacagag ttcttgaagt ggtggcctaa ctacggctac 8580
actagaagga cagtatttgg tatctgcgct ctgctgaagc cagttacctt cggaaaaaga 8640
gttggtagct cttgatccgg caaacaaacc accgctggta gcggtggttt ttttgtttgc 8700
aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag atcctttgat cttttctacg 8760
gggtctgacg ctcagtggaa cgaaaactca cgttaaggga ttttggtcat gagattatca 8820
aaaaggatct tcacctagat ccttttaaat taaaaatgaa gttttaaatc aatctaaagt 8880
atatatgagt aaacttggtc tgacagttac caatgcttaa tcagtgaggc acctatctca 8940
gcgatctgtc tatttcgttc atccatagtt gcctgactcc ccgtcgtgta gataactacg 9000
atacgggagg gcttaccatc tggccccagt gctgcaatga taccgcgaga cccacgctca 9060
ccggctccag atttatcagc aataaaccag ccagccggaa gggccgagcg cagaagtggt 9120
cctgcaactt tatccgcctc catccagtct attaattgtt gccgggaagc tagagtaagt 9180
agttcgccag ttaatagttt gcgcaacgtt gttgccattg ctacaggcat cgtggtgtca 9240
cgctcgtcgt ttggtatggc ttcattcagc tccggttccc aacgatcaag gcgagttaca 9300
tgatccccca tgttgtgcaa aaaagcggtt agctccttcg gtcctccgat cgttgtcaga 9360
agtaagttgg ccgcagtgtt atcactcatg gttatggcag cactgcataa ttctcttact 9420
gtcatgccat ccgtaagatg cttttctgtg actggtgagt actcaaccaa gtcattctga 9480
gaatagtgta tgcggcgacc gagttgctct tgcccggcgt caatacggga taataccgcg 9540
ccacatagca gaactttaaa agtgctcatc attggaaaac gttcttcggg gcgaaaactc 9600
tcaaggatct taccgctgtt gagatccagt tcgatgtaac ccactcgtgc acccaactga 9660
tcttcagcat cttttacttt caccagcgtt tctgggtgag caaaaacagg aaggcaaaat 9720
gccgcaaaaa agggaataag ggcgacacgg aaatgttgaa tactcatact cttccttttt 9780
caatattatt gaagcattta tcagggttat tgtctcatga gcggatacat atttgaatgt 9840
atttagaaaa ataaacaaat aggggttccg cgcacatttc cccgaaaagt gccacctgac 9900
gtc 9903
<210> 14
<211> 9933
<212> DNA
<213> Artificial Sequence
<400> 14
gacggatcgg gagatctccc gatcccctat ggtcgactct cagtacaatc tgctctgatg 60gacggatcgg gagatctccc gatcccctat ggtcgactct cagtacaatc tgctctgatg 60
ccgcatagtt aagccagtat ctgctccctg cttgtgtgtt ggaggtcgct gagtagtgcg 120ccgcatagtt aagccagtat ctgctccctg cttgtgtgtt ggaggtcgct gagtagtgcg 120
cgagcaaaat ttaagctaca acaaggcaag gcttgaccga caattgcatg aagaatctgc 180cgagcaaaat ttaagctaca acaaggcaag gcttgaccga caattgcatg aagaatctgc 180
ttagggttag gcgttttgcg ctgcttcgcg atgtacgggc cagatatacg cgttgacatt 240ttagggttag gcgttttgcg ctgcttcgcg atgtacgggc cagatatacg cgttgacatt 240
gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 300gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 300
tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 360tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 360
cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 420cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 420
attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt 480attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt 480
atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 540atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 540
atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 600atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 600
tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 660tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 660
actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc 720actcacgggg atttccaagt ctccacccca ttgacgtcaa tggggagtttg ttttggcacc 720
aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 780aaaatcaacg ggactttcca aaatgtcgta acaactccgcccattgacg caaatgggcg 780
gtaggcgtgt acggtgggag gtctatataa gcagagctct ctggctaact agagaaccca 840gtaggcgtgt acggtggggag gtctatataa gcagagctct ctggctaact agagaaccca 840
ctgcttactg gcttatcgaa attaatacga ctcactatag ggagacccaa gctggctagc 900ctgcttactg gctttatcgaa attaatacga ctcactatag ggagacccaa gctggctagc 900
accatgggac ctaagaaaaa gaggaaggtg gcggccgctg actacaaaga ccatgacggt 960accatgggac ctaagaaaaa gaggaaggtg gcggccgctg actacaaaga ccatgacggt 960
gattataaag atcatgacat cgactacaag gatgacgatg acaagtctag agacaagaaa 1020gattataaag atcatgacat cgactacaag gatgacgatg acaagtctag agacaagaaa 1020
tactctattg gactggctat cgggacaaac tccgttggct gggccgtcat aaccgacgag 1080tactctattg gactggctat cgggacaaac tccgttggct gggccgtcat aaccgacgag 1080
tataaggtgc caagcaagaa attcaaggtg ctgggtaata ctgaccgcca ttcaatcaag 1140tataaggtgc caagcaagaa attcaaggtg ctgggtaata ctgaccgcca ttcaatcaag 1140
aagaacctga tcggagcact cctcttcgac tccggtgaaa ccgctgaagc tactcggctg 1200aagaacctga tcggagcact cctcttcgac tccggtgaaa ccgctgaagc tactcggctg 1200
aagcggaccg caaggcggag atacacccgc cgcaagaatc ggatatgtta tctgcaagag 1260aagcggaccg caaggcggag atacacccgc cgcaagaatc ggatatgtta tctgcaagag 1260
atctttagca acgaaatggc taaggtggac gactccttct ttcaccgcct ggaagagagc 1320atctttagca acgaaatggc taaggtggac gactccttct ttcaccgcct ggaagagagc 1320
tttctggtgg aggaggataa gaaacacgag aggcacccta tattcggaaa tatcgtggat 1380tttctggtgg aggaggataa gaaacacgag aggcacccta tattcggaaa tatcgtggat 1380
gaggtggctt accatgaaaa gtatcctaca atctaccatc tgaggaagaa gctggtggac 1440
agcaccgata aagcagacct gaggctcatc tatctggccc tggctcatat gataaagttt 1500
agaggacact ttctgatcga gggcgacctg aatcccgata attccgatgt ggataaactc 1560
ttcattcaac tggtgcagac atataaccaa ctgttcgagg agaatcccat aaacgcttct 1620
ggtgtggatg ccaaggctat tctgtccgct cggctgtcca agtcacgcag actggagaat 1680
ctgattgccc aactgccagg agaaaagaag aacggcctgt ttgggaacct catcgccctg 1740
agcctgggcc tgacacctaa cttcaagtcc aattttgatc tggccgaaga tgctaaactc 1800
cagctctcca aggacaccta tgacgatgat ctggacaacc tgctcgcaca gataggcgac 1860
cagtacgccg atctctttct ggctgctaag aatctctccg acgccattct gctgagcgac 1920
atactccggg tcaacactga gatcaccaaa gcacctctga gcgcctccat gataaaacgc 1980
tatgatgaac accatcaaga cctgactctg ctcaaagccc tcgtgaggca acagctgcca 2040
gagaagtaca aagagatatt cttcgaccag agcaagaatg gatatgccgg atacatcgat 2100
ggcggagcat cacaggaaga attttacaag ttcatcaaac caatcctcga gaagatggac 2160
ggtactgaag agctgctggt gaagctgaac agggaggacc tgctgaggaa gcagaggacc 2220
tttgataatg gctccattcc acatcagata cacctgggag agctgcatgc aatcctccgc 2280
aggcaggagg atttctatcc tttcctgaag gataaccggg agaagataga gaagatcctg 2340
accttcagga tcccttatta cgtcggccct ctggctagag gcaactcccg cttcgcttgg 2400
atgaccagga aatctgagga gacaattact ccttggaact tcgaagaggt cgtggataag 2460
ggcgcaagcg cccagtcatt catcgaacgg atgaccaatt tcgataagaa cctgccaaac 2520
gagaaggtcc tgcccaaaca ttcactcctg tacgagtatt tcaccgtcta taacgagctg 2580
actaaagtga agtacgtgac cgagggcatg aggaagcctg ccttcctgtc cggagagcag 2640
aagaaggcta tcgttgatct gctcttcaag actaatagaa aggtgacagt gaagcagctc 2700
aaggaggatt actttaagaa gatcgaatgc tttgactcag tggaaatctc tggcgtggag 2760
gaccgcttta atgccagcct gggcacttac catgatctgc tgaagataat caaagacaaa 2820
gatttcctcg ataatgagga gaacgaggac atcctggaag atatcgtgct gaccctgact 2880
ctgttcgagg atagagagat gatcgaagag cgcctgaaga cctatgccca tctgtttgac 2940
gataaagtca tgaaacagct caagcggcgg cgctacactg ggtggggtag actctccagg 3000
aaactcataa acggcatccg cgacaaacag agcggaaaga ccatcctgga tttcctgaaa 3060
tccgacggat tcgctaacag gaacttcatg caactgattc acgatgactc tctgacattt 3120
aaagaggaca tccagaaggc acaggtgagc ggtcaaggcg acagcctgca cgagcacatc 3180
gccaacctcg ctggatcacc cgccataaag aagggaatac tgcagacagt caaggtcgtg 3240
gacgaactcg tcaaagtgat gggtcggcac aagccagaga atatcgttat cgaaatggca 3300
agggagaacc aaaccaccca gaagggccag aagaactctc gggaacggat gaaaagaatc 3360
gaagagggaa ttaaggagct gggatctcag atactgaagg agcaccctgt ggagaataca 3420
cagctccaga acgagaaact ctacctgtac tacctccaga acgggcggga catgtacgtt 3480
gaccaggaac tcgacatcaa ccggctgtcc gattatgacg tggacgctat tgttccacag 3540
tccttcctca aagatgactc cattgacaac aaggtgctga ccagatccga taaggcccgc 3600
ggtaagtctg acaatgttcc atcagaagag gtggtcaaga agatgaagaa ttactggcgg 3660
cagctcctca acgccaaact gatcacccag cggaagtttg acaatctgac taaggcagaa 3720
agaggaggtc tgagcgaact cgacaaggcc ggctttatta agaggcaact ggtcgaaaca 3780
cgccagatta ccaaacacgt ggcacaaatc ctcgactcta ggatgaacac taagtacgat 3840
gagaacgata agctgatcag ggaagtgaaa gtgataactc tgaagagcaa gctggtgtct 3900
gacttccgga aggactttca attctacaaa gttcgcgaaa taaacaatta ccatcatgct 3960
cacgatgcct atctcaatgc tgtcgttggc accgccctga tcaagaaata ccctaaactg 4020
gagtctgagt tcgtgtacgg tgactataaa gtctacgatg tgaggaagat gatagcaaag 4080
tctgagcaag agattggcaa agccaccgcc aagtacttct tctactctaa tatcatgaat 4140
ttctttaaga ctgagataac cctggctaac ggcgaaatcc ggaagcgccc actgatcgaa 4200
acaaacggag aaacaggaga aatcgtgtgg gataaaggca gggacttcgc aactgtgcgg 4260
aaggtgctgt ccatgccaca agtcaatatc gtgaagaaga ccgaagtgca gaccggcgga 4320
ttctcaaagg agagcatcct gccaaagcgg aactctgaca agctgatcgc caggaagaaa 4380
gattgggacc caaagaagta tggcggtttc gattccccta cagtggctta ttccgttctg 4440
gtcgtggcaa aagtggagaa aggcaagtcc aagaaactca agtctgttaa ggagctgctc 4500
ggaattacta ttatggagag atccagcttc gagaagaatc caatcgattt cctggaagct 4560
aagggctata aagaagtgaa gaaagatctc atcatcaaac tgcccaagta ctctctcttt 4620
gagctggaga atggtaggaa gcggatgctg gcctccgccg gagagctgca gaaaggaaac 4680
gagctggctc tgccctccaa atacgtgaac ttcctgtatc tggcctccca ctacgagaaa 4740
ctcaaaggta gccctgaaga caatgagcag aagcaactct ttgttgagca acataaacac 4800
tacctggacg aaatcattga acagattagc gagttcagca agcgggttat tctggccgat 4860
gcaaacctcg ataaagtgct gagcgcatat aataagcaca gggacaagcc aattcgcgaa 4920
caagcagaga atattatcca cctctttact ctgactaatc tgggcgctcc tgctgccttc 4980
aagtatttcg atacaactat tgacaggaag cggtacacct ctaccaaaga agttctcgat 5040
gccaccctga tacaccagtc aattaccgga ctgtacgaga ctcgcatcga cctgtctcag 5100
ctcggcggcg acggttctcc caagaagaag aggaaagtcg ggcgcgctgg aggaggatcc 5160
ggaggaggat ccggaggagg atccatgctg gatagggatg tgggtccaac tcccatgtat 5220
ccgcctacat acctggagcc agggattggg aggcacacac catatggcaa ccaaactgac 5280
tacagaatat ttgagcttaa caaacggctt cagaactgga cagaggagtg tgacaatctc 5340
tggtgggatg cattcacgac tgagttcttt gaggatgatg ccatgttgac catcactttc 5400
tgcctggagg atggaccaaa gagatatacc attggccgga ccctgatccc acgctacttc 5460
cgcagcatct ttgagggggg tgctacggag ctgtactatg ttcttaagca ccccaaggag 5520
gcattccaca gcaactttgt gtccctcgac tgtgaccagg gcagcatggt gacccagcat 5580
ggcaagccca tgttcaccca ggtgtgtgtg gagggccggt tgtacctgga gttcatgttt 5640
gacgacatga tgcggataaa gacgtggcac ttcagcatcc ggcagcaccg agagctcatc 5700
ccccgcagca tccttgccat gcatgcccaa gacccccaga tgttggatca gctctccaaa 5760
aacatcactc ggtgtgggct gtccgggccc ttcgaaggta agcctatccc taaccctctc 5820
ctcggtctcg attctacgcg taccggtcat catcaccatc accattgagt ttaaacccgc 5880
tgatcagcct cgactgtgcc ttctagttgc cagccatctg ttgtttgccc ctcccccgtg 5940
ccttccttga ccctggaagg tgccactccc actgtccttt cctaataaaa tgaggaaatt 6000
gcatcgcatt gtctgagtag gtgtcattct attctggggg gtggggtggg gcaggacagc 6060
aagggggagg attgggaaga caatagcagg catgctgggg atgcggtggg ctctatggct 6120
tctgaggcgg aaagaaccag ctggggctct agggggtatc cccacgcgcc ctgtagcggc 6180
gcattaagcg cggcgggtgt ggtggttacg cgcagcgtga ccgctacact tgccagcgcc 6240
ctagcgcccg ctcctttcgc tttcttccct tcctttctcg ccacgttcgc cggctttccc 6300
cgtcaagctc taaatcgggg catcccttta gggttccgat ttagtgcttt acggcacctc 6360
gaccccaaaa aacttgatta gggtgatggt tcacgtagtg ggccatcgcc ctgatagacg 6420
gtttttcgcc ctttgacgtt ggagtccacg ttctttaata gtggactctt gttccaaact 6480
ggaacaacac tcaaccctat ctcggtctat tcttttgatt tataagggat tttggggatt 6540
tcggcctatt ggttaaaaaa tgagctgatt taacaaaaat ttaacgcgaa ttaattctgt 6600
ggaatgtgtg tcagttaggg tgtggaaagt ccccaggctc cccaggcagg cagaagtatg 6660
caaagcatgc atctcaatta gtcagcaacc aggtgtggaa agtccccagg ctccccagca 6720
ggcagaagta tgcaaagcat gcatctcaat tagtcagcaa ccatagtccc gcccctaact 6780
ccgcccatcc cgcccctaac tccgcccagt tccgcccatt ctccgcccca tggctgacta 6840
atttttttta tttatgcaga ggccgaggcc gcctctgcct ctgagctatt ccagaagtag 6900
tgaggaggct tttttggagg cctaggcttt tgcaaaaagc tcccgggagc ttgtatatcc 6960
attttcggat ctgatcagca cgtgttgaca attaatcatc ggcatagtat atcggcatag 7020
tataatacga caaggtgagg aactaaacca tggccaagcc tttgtctcaa gaagaatcca 7080
ccctcattga aagagcaacg gctacaatca acagcatccc catctctgaa gactacagcg 7140
tcgccagcgc agctctctct agcgacggcc gcatcttcac tggtgtcaat gtatatcatt 7200
ttactggggg accttgtgca gaactcgtgg tgctgggcac tgctgctgct gcggcagctg 7260
gcaacctgac ttgtatcgtc gcgatcggaa atgagaacag gggcatcttg agcccctgcg 7320
gacggtgtcg acaggtgctt ctcgatctgc atcctgggat caaagcgata gtgaaggaca 7380
gtgatggaca gccgacggca gttgggattc gtgaattgct gccctctggt tatgtgtggg 7440
agggctaagc acttcgtggc cgaggagcag gactgacacg tgctacgaga tttcgattcc 7500
accgccgcct tctatgaaag gttgggcttc ggaatcgttt tccgggacgc cggctggatg 7560
atcctccagc gcggggatct catgctggag ttcttcgccc accccaactt gtttattgca 7620
gcttataatg gttacaaata aagcaatagc atcacaaatt tcacaaataa agcatttttt 7680
tcactgcatt ctagttgtgg tttgtccaaa ctcatcaatg tatcttatca tgtctgtata 7740
ccgtcgacct ctagctagag cttggcgtaa tcatggtcat agctgtttcc tgtgtgaaat 7800
tgttatccgc tcacaattcc acacaacata cgagccggaa gcataaagtg taaagcctgg 7860
ggtgcctaat gagtgagcta actcacatta attgcgttgc gctcactgcc cgctttccag 7920
tcgggaaacc tgtcgtgcca gctgcattaa tgaatcggcc aacgcgcggg gagaggcggt 7980
ttgcgtattg ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc ggtcgttcgg 8040
ctgcggcgag cggtatcagc tcactcaaag gcggtaatac ggttatccac agaatcaggg 8100
gataacgcag gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa ccgtaaaaag 8160
gccgcgttgc tggcgttttt ccataggctc cgcccccctg acgagcatca caaaaatcga 8220
cgctcaagtc agaggtggcg aaacccgaca ggactataaa gataccaggc gtttccccct 8280
ggaagctccc tcgtgcgctc tcctgttccg accctgccgc ttaccggata cctgtccgcc 8340
tttctccctt cgggaagcgt ggcgctttct caatgctcac gctgtaggta tctcagttcg 8400
gtgtaggtcg ttcgctccaa gctgggctgt gtgcacgaac cccccgttca gcccgaccgc 8460
tgcgccttat ccggtaacta tcgtcttgag tccaacccgg taagacacga cttatcgcca 8520
ctggcagcag ccactggtaa caggattagc agagcgaggt atgtaggcgg tgctacagag 8580
ttcttgaagt ggtggcctaa ctacggctac actagaagga cagtatttgg tatctgcgct 8640
ctgctgaagc cagttacctt cggaaaaaga gttggtagct cttgatccgg caaacaaacc 8700
accgctggta gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag aaaaaaagga 8760
tctcaagaag atcctttgat cttttctacg gggtctgacg ctcagtggaa cgaaaactca 8820
cgttaaggga ttttggtcat gagattatca aaaaggatct tcacctagat ccttttaaat 8880
taaaaatgaa gttttaaatc aatctaaagt atatatgagt aaacttggtc tgacagttac 8940
caatgcttaa tcagtgaggc acctatctca gcgatctgtc tatttcgttc atccatagtt 9000
gcctgactcc ccgtcgtgta gataactacg atacgggagg gcttaccatc tggccccagt 9060
gctgcaatga taccgcgaga cccacgctca ccggctccag atttatcagc aataaaccag 9120
ccagccggaa gggccgagcg cagaagtggt cctgcaactt tatccgcctc catccagtct 9180
attaattgtt gccgggaagc tagagtaagt agttcgccag ttaatagttt gcgcaacgtt 9240
gttgccattg ctacaggcat cgtggtgtca cgctcgtcgt ttggtatggc ttcattcagc 9300
tccggttccc aacgatcaag gcgagttaca tgatccccca tgttgtgcaa aaaagcggtt 9360
agctccttcg gtcctccgat cgttgtcaga agtaagttgg ccgcagtgtt atcactcatg 9420
gttatggcag cactgcataa ttctcttact gtcatgccat ccgtaagatg cttttctgtg 9480
actggtgagt actcaaccaa gtcattctga gaatagtgta tgcggcgacc gagttgctct 9540
tgcccggcgt caatacggga taataccgcg ccacatagca gaactttaaa agtgctcatc 9600
attggaaaac gttcttcggg gcgaaaactc tcaaggatct taccgctgtt gagatccagt 9660
tcgatgtaac ccactcgtgc acccaactga tcttcagcat cttttacttt caccagcgtt 9720
tctgggtgag caaaaacagg aaggcaaaat gccgcaaaaa agggaataag ggcgacacgg 9780
aaatgttgaa tactcatact cttccttttt caatattatt gaagcattta tcagggttat 9840
tgtctcatga gcggatacat atttgaatgt atttagaaaa ataaacaaat aggggttccg 9900
cgcacatttc cccgaaaagt gccacctgac gtc 9933
<210> 15
<211> 24
<212> DNA
<213> Artificial Sequence
<400> 15
accgaatatg tcacattctg tctc 24
<210> 16
<211> 24
<212> DNA
<213> Artificial Sequence
<400> 16
aaacgagaca gaatgtgaca tatt 24
<210> 17
<211> 24
<212> DNA
<213> Artificial Sequence
<400> 17
accgggacta tgggaggtca ctaa 24
<210> 18
<211> 24
<212> DNA
<213> Artificial Sequence
<400> 18
aaacttagtg acctcccata gtcc 24
<210> 19
<211> 24
<212> DNA
<213> Artificial Sequence
<400> 19
accggaaggt tacacagaac caga 24
<210> 20
<211> 24
<212> DNA
<213> Artificial Sequence
<400> 20
aaactctggt tctgtgtaac cttc 24
<210> 21
<211> 24
<212> DNA
<213> Artificial Sequence
<400> 21
accgggccaa gagatatatc ttag 24
<210> 22
<211> 24
<212> DNA
<213> Artificial Sequence
<400> 22
aaacctaaga tatatctctt ggcc 24
<210> 23
<211> 24
<212> DNA
<213> Artificial Sequence
<400> 23
accggtgcca gaagagccaa ggac 24
<210> 24
<211> 24
<212> DNA
<213> Artificial Sequence
<400> 24
aaacgtcctt ggctcttctg gcac 24
<210> 25
<211> 24
<212> DNA
<213> Artificial Sequence
<400> 25
accggtggag ccacacccta gggt 24
<210> 26
<211> 24
<212> DNA
<213> Artificial Sequence
<400> 26
aaacacccta gggtgtggct ccac 24
<210> 27
<211> 24
<212> DNA
<213> Artificial Sequence
<400> 27
aaacggagcg caccatcttc ttca 24
<210> 28
<211> 24
<212> DNA
<213> Artificial Sequence
<400> 28
aaactgaaga agatggtgcg ctcc 24
<210> 29
<211> 82
<212> DNA
<213> Artificial Sequence
<400> 29
gggacctaag aaaaagagga aggtggcggc cgctggcggc agcatgctgg atagggatgt 60
gggtccaact cccatgtatc cg 82
<210> 30
<211> 65
<212> DNA
<213> Artificial Sequence
<400> 30
ctctcggggg tggcgctctc gctggtaccg ggggtctcgc tgccgctctg ggaggcctgt 60
gacgt 65
<210> 31
<211> 84
<212> DNA
<213> Artificial Sequence
<400> 31
gggcgcgctg gaggaggatc cggaggagga tccggaggag gatccatgct ggatagggat 60
gtgggtccaa ctcccatgta tccg 84
<210> 32
<211> 27
<212> DNA
<213> Artificial Sequence
<400> 32
gaagggcccc tgggaggcct gtgacgt 27
<210> 33
<211> 60
<212> DNA
<213> Artificial Sequence
<400> 33
gtggcggccg ctggcggcag catgctggat agggatgtgg gtccaactcc catgtatccg 60
<210> 34
<211> 57
<212> DNA
<213> Artificial Sequence
<400> 34
cgctggtacc gggggtctcg ctgccgctgg acagcccaca ccgagtgatg tttttgg 57
<210> 35
<211> 84
<212> DNA
<213> Artificial Sequence
<400> 35
gggcgcgctg gaggaggatc cggaggagga tccggaggag gatccatgct ggatagggat 60
gtgggtccaa ctcccatgta tccg 84
<210> 36
<211> 35
<212> DNA
<213> Artificial Sequence
<400> 36
tcgaagggcc cggacagccc acaccgagtg atgtt 35
<210> 37
<211> 1767
<212> PRT
<213> Artificial Sequence
<400> 37
Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly
1 5 10 15
Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys
20 25 30
Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly
35 40 45
Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys
50 55 60
Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr
65 70 75 80
Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe
85 90 95
Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His
100 105 110
Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His
115 120 125
Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser
130 135 140
Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met
145 150 155 160
Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp
165 170 175
Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn
180 185 190
Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys
195 200 205
Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu
210 215 220
Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu
225 230 235 240
Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp
245 250 255
Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp
260 265 270
Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu
275 280 285
Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile
290 295 300
Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met
305 310 315 320
Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala
325 330 335
Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp
340 345 350
Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln
355 360 365
Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly
370 375 380
Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys
385 390 395 400
Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu Gly
405 410 415
Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu
420 425 430
Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro
435 440 445
Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met
450 455 460
Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val
465 470 475 480
Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn
485 490 495
Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu
500 505 510
Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr
515 520 525
Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys
530 535 540
Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val
545 550 555 560
Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser
565 570 575
Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr
580 585 590
Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn
595 600 605
Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu
610 615 620
Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His
625 630 635 640
Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr
645 650 655
Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys
660 665 670
Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala
675 680 685
Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys
690 695 700
Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His
705 710 715 720
Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile
725 730 735
Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg
740 745 750
His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr
755 760 765
Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu
770 775 780
Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro Val
785 790 795 800
Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln
805 810 815
Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu
820 825 830
Ser Asp Tyr Asp Val Asp Ala Ile Val Pro Gln Ser Phe Leu Lys Asp
835 840 845
Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Ala Arg Gly
850 855 860
Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn
865 870 875 880
Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe
885 890 895
Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys
900 905 910
Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys
915 920 925
His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp GluHis Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu
930 935 940930 935 940
Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser LysAsn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys
945 950 955 960945 950 955 960
Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg GluLeu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu
965 970 975965 970 975
Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val ValIle Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val
980 985 990980 985 990
Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe ValGly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val
995 1000 1005995 1000 1005
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys SerTyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys Ser
1010 1015 10201010 1015 1020
Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr Ser AsnGlu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr Ser Asn
1025 1030 1035 10401025 1030 1035 1040
Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn Gly Glu IleIle Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn Gly Glu Ile
1045 1050 10551045 1050 1055
Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr Gly Glu Ile ValArg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr Gly Glu Ile Val
1060 1065 10701060 1065 1070
Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg Lys Val Leu Ser MetTrp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg Lys Val Leu Ser Met
1075 1080 10851075 1080 1085
Pro Gln Val Asn Ile Val Lys Lys Thr Glu Val Gln Thr Gly Gly Phe
1090 1095 1100
Ser Lys Glu Ser Ile Leu Pro Lys Arg Asn Ser Asp Lys Leu Ile Ala
1105 1110 1115 1120
Arg Lys Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro
1125 1130 1135
Thr Val Ala Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys Gly Lys
1140 1145 1150
Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met
1155 1160 1165
Glu Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys
1170 1175 1180
Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr
1185 1190 1195 1200
Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala
1205 1210 1215
Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro
1235 1240 1245
Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His Tyr
1250 1255 1260
Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg Val Ile
1265 1270 1275 1280
Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr Asn Lys His
1285 1290 1295
Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile Ile His Leu Phe
1300 1305 1310
Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe Lys Tyr Phe Asp Thr
1315 1320 1325
Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr Lys Glu Val Leu Asp Ala
1330 1335 1340
Thr Leu Ile His Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp
1345 1350 1355 1360
Leu Ser Gln Leu Gly Gly Asp Gly Ser Pro Lys Lys Lys Arg Lys Val
1365 1370 1375
Gly Arg Ala Gly Gly Gly Ser Gly Gly Gly Ser Gly Gly Gly Ser Met
1380 1385 1390
Leu Asp Arg Asp Val Gly Pro Thr Pro Met Tyr Pro Pro Thr Tyr Leu
1395 1400 1405
Glu Pro Gly Ile Gly Arg His Thr Pro Tyr Gly Asn Gln Thr Asp Tyr
1410 1415 1420
Arg Ile Phe Glu Leu Asn Lys Arg Leu Gln Asn Trp Thr Glu Glu Cys
1425 1430 1435 1440
Asp Asn Leu Trp Trp Asp Ala Phe Thr Thr Glu Phe Phe Glu Asp Asp
1445 1450 1455
Ala Met Leu Thr Ile Thr Phe Cys Leu Glu Asp Gly Pro Lys Arg Tyr
1460 1465 1470
Thr Ile Gly Arg Thr Leu Ile Pro Arg Tyr Phe Arg Ser Ile Phe Glu
1475 1480 1485
Gly Gly Ala Thr Glu Leu Tyr Tyr Val Leu Lys His Pro Lys Glu Ala
1490 1495 1500
Phe His Ser Asn Phe Val Ser Leu Asp Cys Asp Gln Gly Ser Met Val
1505 1510 1515 1520
Thr Gln His Gly Lys Pro Met Phe Thr Gln Val Cys Val Glu Gly Arg
1525 1530 1535
Leu Tyr Leu Glu Phe Met Phe Asp Asp Met Met Arg Ile Lys Thr Trp
1540 1545 1550
His Phe Ser Ile Arg Gln His Arg Glu Leu Ile Pro Arg Ser Ile Leu
1555 1560 1565
Ala Met His Ala Gln Asp Pro Gln Met Leu Asp Gln Leu Ser Lys Asn
1570 1575 1580
Ile Thr Arg Cys Gly Leu Ser Asn Ser Thr Leu Asn Tyr Leu Arg Leu
1585 1590 1595 1600
Cys Val Ile Leu Glu Pro Met Gln Glu Leu Met Ser Arg His Lys Thr
1605 1610 1615
Tyr Ser Leu Ser Pro Arg Asp Cys Leu Lys Thr Cys Leu Phe Gln Lys
1620 1625 1630
Trp Gln Arg Met Val Ala Pro Pro Ala Glu Pro Thr Arg Gln Gln Pro
1635 1640 1645
Ser Lys Arg Arg Lys Arg Lys Met Ser Gly Gly Ser Thr Met Ser Ser
1650 1655 1660
Gly Gly Gly Asn Thr Asn Asn Ser Asn Ser Lys Lys Lys Ser Pro Ala
1665 1670 1675 1680
Ser Thr Phe Ala Leu Ser Ser Gln Val Pro Asp Val Met Val Val Gly
1685 1690 1695
Glu Pro Thr Leu Met Gly Gly Glu Phe Gly Asp Glu Asp Glu Arg Leu
1700 1705 1710
Ile Thr Arg Leu Glu Asn Thr Gln Phe Asp Ala Ala Asn Gly Ile Asp
1715 1720 1725
Asp Glu Asp Ser Phe Asn Asn Ser Pro Ala Leu Gly Ala Asn Ser Pro
1730 1735 1740
Trp Asn Ser Lys Pro Pro Ser Ser Gln Glu Ser Lys Ser Glu Asn Pro
1745 1750 1755 1760
Thr Ser Gln Ala Ser Gln Gly
1765
<210> 38
<211> 1767
<212> PRT
<213> Artificial Sequence
<400> 38
Met Leu Asp Arg Asp Val Gly Pro Thr Pro Met Tyr Pro Pro Thr Tyr
1 5 10 15
Leu Glu Pro Gly Ile Gly Arg His Thr Pro Tyr Gly Asn Gln Thr Asp
20 25 30
Tyr Arg Ile Phe Glu Leu Asn Lys Arg Leu Gln Asn Trp Thr Glu Glu
35 40 45
Cys Asp Asn Leu Trp Trp Asp Ala Phe Thr Thr Glu Phe Phe Glu Asp
50 55 60
Asp Ala Met Leu Thr Ile Thr Phe Cys Leu Glu Asp Gly Pro Lys Arg
65 70 75 80
Tyr Thr Ile Gly Arg Thr Leu Ile Pro Arg Tyr Phe Arg Ser Ile Phe
85 90 95
Glu Gly Gly Ala Thr Glu Leu Tyr Tyr Val Leu Lys His Pro Lys Glu
100 105 110
Ala Phe His Ser Asn Phe Val Ser Leu Asp Cys Asp Gln Gly Ser Met
115 120 125
Val Thr Gln His Gly Lys Pro Met Phe Thr Gln Val Cys Val Glu Gly
130 135 140
Arg Leu Tyr Leu Glu Phe Met Phe Asp Asp Met Met Arg Ile Lys Thr
145 150 155 160
Trp His Phe Ser Ile Arg Gln His Arg Glu Leu Ile Pro Arg Ser Ile
165 170 175
Leu Ala Met His Ala Gln Asp Pro Gln Met Leu Asp Gln Leu Ser Lys
180 185 190
Asn Ile Thr Arg Cys Gly Leu Ser Asn Ser Thr Leu Asn Tyr Leu Arg
195 200 205
Leu Cys Val Ile Leu Glu Pro Met Gln Glu Leu Met Ser Arg His Lys
210 215 220
Thr Tyr Ser Leu Ser Pro Arg Asp Cys Leu Lys Thr Cys Leu Phe Gln
225 230 235 240
Lys Trp Gln Arg Met Val Ala Pro Pro Ala Glu Pro Thr Arg Gln Gln
245 250 255
Pro Ser Lys Arg Arg Lys Arg Lys Met Ser Gly Gly Ser Thr Met Ser
260 265 270
Ser Gly Gly Gly Asn Thr Asn Asn Ser Asn Ser Lys Lys Lys Ser Pro
275 280 285
Ala Ser Thr Phe Ala Leu Ser Ser Gln Val Pro Asp Val Met Val Val
290 295 300
Gly Glu Pro Thr Leu Met Gly Gly Glu Phe Gly Asp Glu Asp Glu Arg
305 310 315 320
Leu Ile Thr Arg Leu Glu Asn Thr Gln Phe Asp Ala Ala Asn Gly Ile
325 330 335
Asp Asp Glu Asp Ser Phe Asn Asn Ser Pro Ala Leu Gly Ala Asn Ser
340 345 350
Pro Trp Asn Ser Lys Pro Pro Ser Ser Gln Glu Ser Lys Ser Glu Asn
355 360 365
Pro Thr Ser Gln Ala Ser Gln Ser Gly Ser Glu Thr Pro Gly Thr Ser
370 375 380
Glu Ser Ala Thr Pro Glu Ser Asp Lys Lys Tyr Ser Ile Gly Leu Ala
385 390 395 400
Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys
405 410 415
Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn Thr Asp Arg His Ser
420 425 430
Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr
435 440 445
Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg
450 455 460
Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met
465 470 475 480
Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu Glu Glu Ser Phe Leu
485 490 495
Val Glu Glu Asp Lys Lys His Glu Arg His Pro Ile Phe Gly Asn Ile
500 505 510
Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr His Leu
515 520 525
Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile
530 535 540
Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg Gly His Phe Leu Ile
545 550 555 560
Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val Asp Lys Leu Phe Ile
565 570 575
Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn
580 585 590
Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys
595 600 605
Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys
610 615 620
Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro
625 630 635 640
Asn Phe Lys Ser Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu
645 650 655
Ser Lys Asp Thr Tyr Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile
660 665 670
Gly Asp Gln Tyr Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp
675 680 685
Ala Ile Leu Leu Ser Asp Ile Leu Arg Val Asn Thr Glu Ile Thr Lys
690 695 700
Ala Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp Glu His His Gln
705 710 715 720
Asp Leu Thr Leu Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys
725 730 735
Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr
740 745 750
Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro
755 760 765
Ile Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn
770 775 780
Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile
785 790 795 800
Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg Arg Gln
805 810 815
Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys
820 825 830
Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly
835 840 845
Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser Glu Glu Thr Ile Thr
850 855 860
Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala Ser Ala Gln Ser
865 870 875 880
Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys
885 890 895
Val Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn
900 905 910
Glu Leu Thr Lys Val Lys Tyr Val Thr Glu Gly Met Arg Lys Pro Ala
915 920 925
Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile Val Asp Leu Leu Phe Lys
930 935 940
Thr Asn Arg Lys Val Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys
945 950 955 960
Lys Ile Glu Cys Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg
965 970 975
Phe Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys
980 985 990
Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp
995 1000 1005
Ile Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu
1010 1015 1020
Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys Gln
1025 1030 1035 1040
Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu
1045 1050 1055
Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe
1060 1065 1070
Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met Gln Leu Ile His
1075 1080 1085
Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser
1090 1095 1100
Gly Gln Gly Asp Ser Leu His Glu His Ile Ala Asn Leu Ala Gly Ser
1105 1110 1115 1120
Pro Ala Ile Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu
1125 1130 1135
Leu Val Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu
1140 1145 1150
Met Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg
1155 1160 1165
Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln
1170 1175 1180
Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys
1185 1190 1195 1200
Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp Gln
1205 1210 1215
Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val Asp Ala Ile Val
1220 1225 1230
Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr
1235 1240 1245
Arg Ser Asp Lys Ala Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu
1250 1255 1260
Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys
1265 1270 1275 1280
Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly
1285 1290 1295
Gly Leu Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu Val
1300 1305 1310
Glu Thr Arg Gln Ile Thr Lys His Val Ala Gln Ile Leu Asp Ser Arg
1315 1320 1325
Met Asn Thr Lys Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu Val Lys
1330 1335 1340
Val Ile Thr Leu Lys Ser Lys Leu Val Ser Asp Phe Arg Lys Asp Phe
1345 1350 1355 1360
Gln Phe Tyr Lys Val Arg Glu Ile Asn Asn Tyr His His Ala His Asp
1365 1370 1375
Ala Tyr Leu Asn Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro
1380 1385 1390
Lys Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val
1395 1400 1405
Arg Lys Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala
1410 1415 1420
Lys Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile
1425 1430 1435 1440
Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn
1445 1450 1455
Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr
1460 1465 1470
Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1475 1480 1485
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg
1490 1495 1500
Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys Lys
1505 1510 1515 1520
Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu Val Val
1525 1530 1535
Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser Val Lys Glu
1540 1545 1550
Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe Glu Lys Asn Pro
1555 1560 1565
Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp Leu
1570 1575 1580
Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg
1585 1590 1595 1600
Lys Arg Met Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu
1605 1610 1615
Ala Leu Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His Tyr
1620 1625 1630
Glu Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe
1635 1640 1645
Val Glu Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser
1650 1655 1660
Glu Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val
1665 1670 1675 1680
Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala
1685 1690 1695
Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala
1700 1705 1710
Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser
1715 1720 1725
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly
1730 1735 1740
Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Gly Ser
1745 1750 1755 1760
Pro Lys Lys Lys Arg Lys Val
1765
<210> 39
<211> 1592
<212> PRT
<213> Artificial Sequence
<400> 39
Met Leu Asp Arg Asp Val Gly Pro Thr Pro Met Tyr Pro Pro Thr Tyr
1 5 10 15
Leu Glu Pro Gly Ile Gly Arg His Thr Pro Tyr Gly Asn Gln Thr Asp
20 25 30
Tyr Arg Ile Phe Glu Leu Asn Lys Arg Leu Gln Asn Trp Thr Glu Glu
35 40 45
Cys Asp Asn Leu Trp Trp Asp Ala Phe Thr Thr Glu Phe Phe Glu Asp
50 55 60
Asp Ala Met Leu Thr Ile Thr Phe Cys Leu Glu Asp Gly Pro Lys Arg
65 70 75 80
Tyr Thr Ile Gly Arg Thr Leu Ile Pro Arg Tyr Phe Arg Ser Ile Phe
85 90 95
Glu Gly Gly Ala Thr Glu Leu Tyr Tyr Val Leu Lys His Pro Lys Glu
100 105 110
Ala Phe His Ser Asn Phe Val Ser Leu Asp Cys Asp Gln Gly Ser Met
115 120 125
Val Thr Gln His Gly Lys Pro Met Phe Thr Gln Val Cys Val Glu Gly
130 135 140
Arg Leu Tyr Leu Glu Phe Met Phe Asp Asp Met Met Arg Ile Lys Thr
145 150 155 160
Trp His Phe Ser Ile Arg Gln His Arg Glu Leu Ile Pro Arg Ser Ile
165 170 175
Leu Ala Met His Ala Gln Asp Pro Gln Met Leu Asp Gln Leu Ser Lys
180 185 190
Asn Ile Thr Arg Cys Gly Leu Ser Ser Gly Ser Glu Thr Pro Gly Thr
195 200 205
Ser Glu Ser Ala Thr Pro Glu Ser Asp Lys Lys Tyr Ser Ile Gly Leu
210 215 220
Ala Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile Thr Asp Glu Tyr
225 230 235 240
Lys Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn Thr Asp Arg His
245 250 255
Ser Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu
260 265 270
Thr Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr
275 280 285
Arg Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu
290 295 300
Met Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu Glu Glu Ser Phe
305 310 315 320
Leu Val Glu Glu Asp Lys Lys His Glu Arg His Pro Ile Phe Gly Asn
325 330 335
Ile Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr His
340 345 350
Leu Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu
355 360 365
Ile Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg Gly His Phe Leu
370 375 380
Ile Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val Asp Lys Leu Phe
385 390 395 400
Ile Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu Glu Asn Pro Ile
405 410 415
Asn Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser
420 425 430
Lys Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys
435 440 445
Lys Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr
450 455 460
Pro Asn Phe Lys Ser Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln
465 470 475 480
Leu Ser Lys Asp Thr Tyr Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln
485 490 495
Ile Gly Asp Gln Tyr Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser
500 505 510
Asp Ala Ile Leu Leu Ser Asp Ile Leu Arg Val Asn Thr Glu Ile Thr
515 520 525
Lys Ala Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp Glu His His
530 535 540
Gln Asp Leu Thr Leu Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu
545 550 555 560
Lys Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly
565 570 575
Tyr Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys
580 585 590
Pro Ile Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu
595 600 605
Asn Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser
610 615 620
Ile Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg Arg
625 630 635 640
Gln Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu
645 650 655
Lys Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg
660 665 670
Gly Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser Glu Glu Thr Ile
675 680 685
Thr Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala Ser Ala Gln
690 695 700
Ser Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu
705 710 715 720
Lys Val Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr
725 730 735
Asn Glu Leu Thr Lys Val Lys Tyr Val Thr Glu Gly Met Arg Lys Pro
740 745 750
Ala Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile Val Asp Leu Leu Phe
755 760 765
Lys Thr Asn Arg Lys Val Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe
770 775 780
Lys Lys Ile Glu Cys Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp
785 790 795 800
Arg Phe Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile IleArg Phe Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile
805 810 815805 810 815
Lys Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu GluLys Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu
820 825 830820 825 830
Asp Ile Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile GluAsp Ile Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu
835 840 845835 840 845
Glu Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met LysGlu Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys
850 855 860850 855 860
Gln Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg LysGln Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys
865 870 875 880865 870 875 880
Leu Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu AspLeu Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu Asp
885 890 895885 890 895
Phe Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met Gln Leu IlePhe Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met Gln Leu Ile
900 905 910900 905 910
His Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln ValHis Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln Val
915 920 925915 920 925
Ser Gly Gln Gly Asp Ser Leu His Glu His Ile Ala Asn Leu Ala GlySer Gly Gln Gly Asp Ser Leu His Glu His Ile Ala Asn Leu Ala Gly
930 935 940930 935 940
Ser Pro Ala Ile Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val AspSer Pro Ala Ile Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp
945 950 955 960945 950 955 960
Glu Leu Val Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val IleGlu Leu Val Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val Ile
965 970 975965 970 975
Glu Met Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn SerGlu Met Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser
980 985 990980 985 990
Arg Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly SerArg Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser
995 1000 1005995 1000 1005
Gln Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn GluGln Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn Glu
1010 1015 10201010 1015 1020
Lys Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val AspLys Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp
1025 1030 1035 10401025 1030 1035 1040
Gln Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val Asp Ala IleGln Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val Asp Ala Ile
1045 1050 10551045 1050 1055
Val Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu
1060 1065 1070
Thr Arg Ser Asp Lys Ala Arg Gly Lys Ser Asp Asn Val Pro Ser Glu
1075 1080 1085
Glu Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala
1090 1095 1100
Lys Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg
1105 1110 1115 1120
Gly Gly Leu Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu
1125 1130 1135
Val Glu Thr Arg Gln Ile Thr Lys His Val Ala Gln Ile Leu Asp Ser
1140 1145 1150
Arg Met Asn Thr Lys Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu Val
1155 1160 1165
Lys Val Ile Thr Leu Lys Ser Lys Leu Val Ser Asp Phe Arg Lys Asp
1170 1175 1180
Phe Gln Phe Tyr Lys Val Arg Glu Ile Asn Asn Tyr His His Ala His
1185 1190 1195 1200
Asp Ala Tyr Leu Asn Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr
1205 1210 1215
Pro Lys Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp
1220 1225 1230
Val Arg Lys Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr
1235 1240 1245
Ala Lys Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu
1250 1255 1260
Ile Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr
1265 1270 1275 1280
Asn Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala
1285 1290 1295
Thr Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys
1300 1305 1310
Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1315 1320 1325
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys
1330 1335 1340
Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu Val
1345 1350 1355 1360
Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser Val Lys
1365 1370 1375
Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe Glu Lys Asn
1380 1385 1390
Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp
1395 1400 1405
Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly
1410 1415 1420
Arg Lys Arg Met Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu
1425 1430 1435 1440
Leu Ala Leu Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His
1445 1450 1455
Tyr Glu Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu
1460 1465 1470
Phe Val Glu Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile
1475 1480 1485
Ser Glu Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys
1490 1495 1500
Val Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln
1505 1510 1515 1520
Ala Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro
1525 1530 1535
Ala Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr
1540 1545 1550
Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1555 1560 1565
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Gly
1570 1575 1580
Ser Pro Lys Lys Lys Arg Lys Val
1585 1590
<210> 40
<211> 1592
<212> PRT
<213> Artificial Sequence
<400> 40
Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly
1 5 10 15
Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys
20 25 30
Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly
35 40 45
Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys
50 55 60
Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr
65 70 75 80
Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe
85 90 95
Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His
100 105 110
Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His
115 120 125
Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser
130 135 140
Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met
145 150 155 160
Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp
165 170 175
Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn
180 185 190
Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys
195 200 205
Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu
210 215 220
Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu
225 230 235 240
Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp
245 250 255
Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp
260 265 270
Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu
275 280 285
Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile
290 295 300
Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met
305 310 315 320
Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala
325 330 335
Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp
340 345 350
Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln
355 360 365
Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly
370 375 380
Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys
385 390 395 400
Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu Gly
405 410 415
Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu
420 425 430
Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro
435 440 445
Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met
450 455 460
Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val
465 470 475 480
Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn
485 490 495
Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu
500 505 510
Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr
515 520 525
Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys
530 535 540
Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val
545 550 555 560
Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser
565 570 575
Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr
580 585 590
Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn
595 600 605
Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu
610 615 620
Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His
625 630 635 640
Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr
645 650 655
Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys
660 665 670
Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala
675 680 685
Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys
690 695 700
Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His
705 710 715 720
Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile
725 730 735
Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg
740 745 750
His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr
755 760 765
Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu
770 775 780
Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro Val
785 790 795 800
Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln
805 810 815
Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu
820 825 830
Ser Asp Tyr Asp Val Asp Ala Ile Val Pro Gln Ser Phe Leu Lys Asp
835 840 845
Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Ala Arg Gly
850 855 860
Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn
865 870 875 880
Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe
885 890 895
Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys
900 905 910
Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys
915 920 925
His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu
930 935 940
Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys
945 950 955 960
Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu
965 970 975
Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val
980 985 990
Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val
995 1000 1005
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys Ser
1010 1015 1020
Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr Ser Asn
1025 1030 1035 1040
Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn Gly Glu Ile
1045 1050 1055
Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr Gly Glu Ile Val
1060 1065 1070
Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg Lys Val Leu Ser Met
1075 1080 1085
Pro Gln Val Asn Ile Val Lys Lys Thr Glu Val Gln Thr Gly Gly Phe
1090 1095 1100
Ser Lys Glu Ser Ile Leu Pro Lys Arg Asn Ser Asp Lys Leu Ile Ala
1105 1110 1115 1120
Arg Lys Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro
1125 1130 1135
Thr Val Ala Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys Gly Lys
1140 1145 1150
Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met
1155 1160 1165
Glu Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys
1170 1175 1180
Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr
1185 1190 1195 1200
Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala
1205 1210 1215
Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro
1235 1240 1245
Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His Tyr
1250 1255 1260
Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg Val Ile
1265 1270 1275 1280
Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr Asn Lys His
1285 1290 1295
Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile Ile His Leu Phe
1300 1305 1310
Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe Lys Tyr Phe Asp Thr
1315 1320 1325
Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr Lys Glu Val Leu Asp Ala
1330 1335 1340
Thr Leu Ile His Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp
1345 1350 1355 1360
Leu Ser Gln Leu Gly Gly Asp Gly Ser Pro Lys Lys Lys Arg Lys Val
1365 1370 1375
Gly Arg Ala Gly Gly Gly Ser Gly Gly Gly Ser Gly Gly Gly Ser Met
1380 1385 1390
Leu Asp Arg Asp Val Gly Pro Thr Pro Met Tyr Pro Pro Thr Tyr Leu
1395 1400 1405
Glu Pro Gly Ile Gly Arg His Thr Pro Tyr Gly Asn Gln Thr Asp Tyr
1410 1415 1420
Arg Ile Phe Glu Leu Asn Lys Arg Leu Gln Asn Trp Thr Glu Glu Cys
1425 1430 1435 1440
Asp Asn Leu Trp Trp Asp Ala Phe Thr Thr Glu Phe Phe Glu Asp Asp
1445 1450 1455
Ala Met Leu Thr Ile Thr Phe Cys Leu Glu Asp Gly Pro Lys Arg Tyr
1460 1465 1470
Thr Ile Gly Arg Thr Leu Ile Pro Arg Tyr Phe Arg Ser Ile Phe Glu
1475 1480 1485
Gly Gly Ala Thr Glu Leu Tyr Tyr Val Leu Lys His Pro Lys Glu Ala
1490 1495 1500
Phe His Ser Asn Phe Val Ser Leu Asp Cys Asp Gln Gly Ser Met Val
1505 1510 1515 1520
Thr Gln His Gly Lys Pro Met Phe Thr Gln Val Cys Val Glu Gly Arg
1525 1530 1535
Leu Tyr Leu Glu Phe Met Phe Asp Asp Met Met Arg Ile Lys Thr Trp
1540 1545 1550
His Phe Ser Ile Arg Gln His Arg Glu Leu Ile Pro Arg Ser Ile Leu
1555 1560 1565
Ala Met His Ala Gln Asp Pro Gln Met Leu Asp Gln Leu Ser Lys Asn
1570 1575 1580
Ile Thr Arg Cys Gly Leu Ser Gly
1585 1590
<210> 41
<211> 16
<212> PRT
<213> Artificial Sequence
<400> 41
Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser
1 5 10 15
<210> 42
<211> 15
<212> PRT
<213> Artificial Sequence
<400> 42
Gly Arg Ala Gly Gly Gly Ser Gly Gly Gly Ser Gly Gly Gly Ser
1 5 10 15
<210> 43
<211> 23
<212> DNA
<213> Artificial Sequence
<400> 43
gnnnnnnnnn nnnnnnnnnn ngg 23
Claims (16)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910063623.6A CN111471665B (en) | 2019-01-23 | 2019-01-23 | DNA cyclization molecule and application thereof |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN111471665A (en) | 2020-07-31 |
| CN111471665B (en) | 2023-07-04 |
Family
ID=71743398
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910063623.6A Active CN111471665B (en) | 2019-01-23 | 2019-01-23 | DNA cyclization molecule and application thereof |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN111471665B (en) |
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102858985A (en) * | 2009-07-24 | 2013-01-02 | Sigma-Aldrich Co. LLC | Method for genome editing |
| KR20130083964A (en) * | 2012-01-16 | 2013-07-24 | Kim Seung-chan (김승찬) | The alteration of signal transduction by applying cultured cell line with antagomeric dna antisense oligmer targeting hsa-mir-129-5p mirna |
| AU2013272283A1 (en) * | 2012-06-07 | 2015-01-15 | The Children's Hospital Of Philadelphia | Controlled gene expression methods |
| CN108064283A (en) * | 2015-02-24 | 2018-05-22 | The Regents of the University of California | Binding triggered transcriptional switches and methods of use thereof |
| CN107636017A (en) * | 2015-04-10 | 2018-01-26 | Feldan Bio Inc. | Polypeptide-based shuttle agents for improving the efficiency of cytoplasmic transduction of polypeptide payloads to target eukaryotic cells, uses thereof, and methods and kits related thereto |
Non-Patent Citations (3)
| Title |
|---|
| LDB1-mediated enhancer looping can be established independent of mediator and cohesin; Ivan Krivega et al.; Nucleic Acids Research; 2017-05-18; Vol. 45, No. 14; p. 8255 (abstract), p. 8260 para. 1 to p. 8261 left column para. 1, Fig. 4 * |
| Long-range enhancer–promoter contacts in gene expression control; Stefan Schoenfelder et al.; Nature Reviews Genetics; 2019-05-13; Vol. 20; pp. 437-455 * |
| Role of LDB1 in the transition from chromatin looping to transcription activation; Ivan Krivega et al.; Genes & Development; 2014-05-29; Vol. 28; p. 1278 para. 1, p. 1280 para. 3, p. 1281 right column para. 3 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN111471665A (en) | 2020-07-31 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| CN108495685B (en) | Yeast-based immunotherapy for Clostridium difficile infection | |
| CN110914439B (en) | Self-inactivating viral vector | |
| CN110835633B (en) | Preparation and application of PTC stable cell line using optimized gene codon expansion system | |
| US8283518B2 (en) | Administration of transposon-based vectors to reproductive organs | |
| CA2388535A1 (en) | Ligand activated transcriptional regulator proteins | |
| JP2003534775A (en) | Methods for destabilizing proteins and uses thereof | |
| KR20230056630A (en) | Novel OMNI-59, 61, 67, 76, 79, 80, 81 and 82 CRISPR nucleases | |
| KR20220149588A (en) | Compositions and methods for the treatment of metabolic liver disorders | |
| KR20230131229A (en) | Site-specific genetic modification | |
| KR102663134B1 (en) | Production cell line enhancers | |
| WO2022241455A1 (en) | A synthetic circuit for buffering gene dosage variation between individual mammalian cells | |
| CN111471665B (en) | DNA cyclization molecule and application thereof | |
| CN107541526B (en) | CIK capable of knocking down endogenous CTLA4 expression and preparation method and application thereof | |
| KR101077689B1 (en) | Hypoxia inducible vegf plasmid for ischemic disease | |
| KR102065917B1 (en) | Transformed stem cell expressing glb1 gene and pharmaceutical composition for treating gm1 gangliosidosis comprising same | |
| CN113005140B (en) | GS expression vector with double expression cassettes and application thereof | |
| KR102061251B1 (en) | Recombinant cell and method for production of endogenous polypeptide | |
| CN109055425B (en) | A Xenopus laevis oocyte expression vector with yellow or red fluorescent protein label and its application | |
| WO1999047151A1 (en) | Peptide ligands for the erythropoietin receptor | |
| CN110777147A (en) | IKZF3 gene-silenced T cell and application thereof | |
| CA2677475C (en) | Expression of polypeptides from the nuclear genome of ostreococcus sp | |
| US11225666B2 (en) | Plasmid vector for expressing a PVT1 exon and method for constructing standard curve therefor | |
| CN106566844B (en) | Method for down-regulating eukaryotic gene expression and kit thereof | |
| US20240415978A1 (en) | Anellovectors for delivery of effectors to the central nervous system | |
| CN109402155A (en) | A kind of dual control delay cracking performance plasmid and its construction method and application | |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||