US20120129715A1 - Gb1 peptidic libraries and methods of screening the same - Google Patents
- Publication number
- US20120129715A1 US13/294,072
- Authority
- US
- United States
- Prior art keywords
- amino acid
- seq
- library
- sequence
- mutations
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
- 230000003321 amplification Effects 0.000 description 1
- 108010072788 angiogenin Proteins 0.000 description 1
- 239000000868 anti-mullerian hormone Substances 0.000 description 1
- 230000000890 antigenic effect Effects 0.000 description 1
- 239000002246 antineoplastic agent Substances 0.000 description 1
- 230000006907 apoptotic process Effects 0.000 description 1
- 206010003246 arthritis Diseases 0.000 description 1
- FZCSTZYAHCUGEM-UHFFFAOYSA-N aspergillomarasmine B Natural products OC(=O)CNC(C(O)=O)CNC(C(O)=O)CC(O)=O FZCSTZYAHCUGEM-UHFFFAOYSA-N 0.000 description 1
- 238000003556 assay Methods 0.000 description 1
- 238000002819 bacterial display Methods 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 102000005936 beta-Galactosidase Human genes 0.000 description 1
- 238000004166 bioassay Methods 0.000 description 1
- 230000004071 biological effect Effects 0.000 description 1
- 239000000090 biomarker Substances 0.000 description 1
- 229940112869 bone morphogenetic protein Drugs 0.000 description 1
- 108010006025 bovine growth hormone Proteins 0.000 description 1
- 102100029175 cGMP-specific 3',5'-cyclic phosphodiesterase Human genes 0.000 description 1
- 150000001720 carbohydrates Chemical class 0.000 description 1
- 235000014633 carbohydrates Nutrition 0.000 description 1
- 239000000969 carrier Substances 0.000 description 1
- 238000000423 cell based assay Methods 0.000 description 1
- 230000010261 cell growth Effects 0.000 description 1
- 229920002678 cellulose Polymers 0.000 description 1
- 239000001913 cellulose Substances 0.000 description 1
- 210000003169 central nervous system Anatomy 0.000 description 1
- 239000003153 chemical reaction reagent Substances 0.000 description 1
- 238000010367 cloning Methods 0.000 description 1
- 239000011248 coating agent Substances 0.000 description 1
- 238000000576 coating method Methods 0.000 description 1
- 238000004590 computer program Methods 0.000 description 1
- 229920001577 copolymer Polymers 0.000 description 1
- 239000004121 copper complexes of chlorophylls and chlorophyllins Substances 0.000 description 1
- 230000006378 damage Effects 0.000 description 1
- 230000002950 deficient Effects 0.000 description 1
- 206010012601 diabetes mellitus Diseases 0.000 description 1
- 238000010790 dilution Methods 0.000 description 1
- 239000012895 dilution Substances 0.000 description 1
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 1
- 208000035475 disorder Diseases 0.000 description 1
- 238000010494 dissociation reaction Methods 0.000 description 1
- 230000005593 dissociations Effects 0.000 description 1
- 238000013104 docking experiment Methods 0.000 description 1
- 229940079593 drug Drugs 0.000 description 1
- 238000012377 drug delivery Methods 0.000 description 1
- 238000001952 enzyme assay Methods 0.000 description 1
- 229940105423 erythropoietin Drugs 0.000 description 1
- 229940126864 fibroblast growth factor Drugs 0.000 description 1
- 108700014844 flt3 ligand Proteins 0.000 description 1
- 230000005714 functional activity Effects 0.000 description 1
- 239000000499 gel Substances 0.000 description 1
- 230000002068 genetic effect Effects 0.000 description 1
- 239000011521 glass Substances 0.000 description 1
- 239000002622 gonadotropin Substances 0.000 description 1
- 239000004120 green S Substances 0.000 description 1
- 230000012010 growth Effects 0.000 description 1
- 108010067006 heat stable toxin (E coli) Proteins 0.000 description 1
- 238000013537 high throughput screening Methods 0.000 description 1
- 238000012188 high-throughput screening assay Methods 0.000 description 1
- 208000033519 human immunodeficiency virus infectious disease Diseases 0.000 description 1
- 239000012216 imaging agent Substances 0.000 description 1
- 210000000987 immune system Anatomy 0.000 description 1
- 229940072221 immunoglobulins Drugs 0.000 description 1
- 238000003364 immunohistochemistry Methods 0.000 description 1
- 230000001506 immunosuppresive effect Effects 0.000 description 1
- 201000001881 impotence Diseases 0.000 description 1
- 208000027866 inflammatory disease Diseases 0.000 description 1
- 230000004054 inflammatory process Effects 0.000 description 1
- 239000000893 inhibin Substances 0.000 description 1
- 230000005764 inhibitory process Effects 0.000 description 1
- ZPNFWUPYTFPOJU-LPYSRVMUSA-N iniprol Chemical compound C([C@H]1C(=O)NCC(=O)NCC(=O)N[C@H]2CSSC[C@H]3C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@H](C(N[C@H](C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC=4C=CC(O)=CC=4)C(=O)N[C@@H](CC=4C=CC=CC=4)C(=O)N[C@@H](CC=4C=CC(O)=CC=4)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CSSC[C@H](NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC=4C=CC=CC=4)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C)NC(=O)[C@H](CCCNC(N)=N)NC2=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CSSC[C@H](NC(=O)[C@H](CC=2C=CC=CC=2)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H]2N(CCC2)C(=O)[C@@H](N)CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N2[C@@H](CCC2)C(=O)N2[C@@H](CCC2)C(=O)N[C@@H](CC=2C=CC(O)=CC=2)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N2[C@@H](CCC2)C(=O)N3)C(=O)NCC(=O)NCC(=O)N[C@@H](C)C(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(=O)N[C@@H](CC=2C=CC=CC=2)C(=O)N[C@H](C(=O)N1)C(C)C)[C@@H](C)O)[C@@H](C)CC)=O)[C@@H](C)CC)C1=CC=C(O)C=C1 ZPNFWUPYTFPOJU-LPYSRVMUSA-N 0.000 description 1
- 208000014674 injury Diseases 0.000 description 1
- 229940125396 insulin Drugs 0.000 description 1
- 102000006495 integrins Human genes 0.000 description 1
- 108010044426 integrins Proteins 0.000 description 1
- 229940047124 interferons Drugs 0.000 description 1
- 229940047122 interleukins Drugs 0.000 description 1
- 208000037906 ischaemic injury Diseases 0.000 description 1
- 230000000670 limiting effect Effects 0.000 description 1
- 230000033001 locomotion Effects 0.000 description 1
- 238000002824 mRNA display Methods 0.000 description 1
- 108010026228 mRNA guanylyltransferase Proteins 0.000 description 1
- 210000004962 mammalian cell Anatomy 0.000 description 1
- 239000003550 marker Substances 0.000 description 1
- 230000001404 mediated effect Effects 0.000 description 1
- 201000001441 melanoma Diseases 0.000 description 1
- 108020004999 messenger RNA Proteins 0.000 description 1
- 230000004060 metabolic process Effects 0.000 description 1
- 230000000813 microbial effect Effects 0.000 description 1
- 239000000203 mixture Substances 0.000 description 1
- 238000007479 molecular analysis Methods 0.000 description 1
- 230000009149 molecular binding Effects 0.000 description 1
- 230000004001 molecular interaction Effects 0.000 description 1
- 210000004898 n-terminal fragment Anatomy 0.000 description 1
- 239000013642 negative control Substances 0.000 description 1
- 230000004770 neurodegeneration Effects 0.000 description 1
- 208000015122 neurodegenerative disease Diseases 0.000 description 1
- 230000016273 neuron death Effects 0.000 description 1
- 108010087904 neutravidin Proteins 0.000 description 1
- 238000003199 nucleic acid amplification method Methods 0.000 description 1
- 239000002773 nucleotide Substances 0.000 description 1
- 125000003729 nucleotide group Chemical group 0.000 description 1
- 229920001778 nylon Polymers 0.000 description 1
- 102000006271 p21-Activated Kinases Human genes 0.000 description 1
- 108010058266 p21-Activated Kinases Proteins 0.000 description 1
- 102000002574 p38 Mitogen-Activated Protein Kinases Human genes 0.000 description 1
- 108010068338 p38 Mitogen-Activated Protein Kinases Proteins 0.000 description 1
- 238000012856 packing Methods 0.000 description 1
- 229960001319 parathyroid hormone Drugs 0.000 description 1
- 239000000199 parathyroid hormone Substances 0.000 description 1
- 230000008506 pathogenesis Effects 0.000 description 1
- 210000001322 periplasm Anatomy 0.000 description 1
- 239000013600 plasmid vector Substances 0.000 description 1
- 230000001323 posttranslational effect Effects 0.000 description 1
- OXCMYAYHXIHQOA-UHFFFAOYSA-N potassium;[2-butyl-5-chloro-3-[[4-[2-(1,2,4-triaza-3-azanidacyclopenta-1,4-dien-5-yl)phenyl]phenyl]methyl]imidazol-4-yl]methanol Chemical compound [K+].CCCCC1=NC(Cl)=C(CO)N1CC1=CC=C(C=2C(=CC=CC=2)C2=N[N-]N=N2)C=C1 OXCMYAYHXIHQOA-UHFFFAOYSA-N 0.000 description 1
- 238000001556 precipitation Methods 0.000 description 1
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 1
- 239000000047 product Substances 0.000 description 1
- 230000000644 propagated effect Effects 0.000 description 1
- 108010087851 prorelaxin Proteins 0.000 description 1
- 238000002818 protein evolution Methods 0.000 description 1
- 230000004850 protein–protein interaction Effects 0.000 description 1
- 102000009929 raf Kinases Human genes 0.000 description 1
- 108010077182 raf Kinases Proteins 0.000 description 1
- 238000003259 recombinant expression Methods 0.000 description 1
- 230000007115 recruitment Effects 0.000 description 1
- 230000002829 reductive effect Effects 0.000 description 1
- 230000027404 regulation of phosphorylation Effects 0.000 description 1
- 230000037425 regulation of transcription Effects 0.000 description 1
- 230000001105 regulatory effect Effects 0.000 description 1
- 230000003362 replicative effect Effects 0.000 description 1
- 108091008146 restriction endonucleases Proteins 0.000 description 1
- 230000000717 retained effect Effects 0.000 description 1
- 102000000568 rho-Associated Kinases Human genes 0.000 description 1
- 108010041788 rho-Associated Kinases Proteins 0.000 description 1
- 238000007363 ring formation reaction Methods 0.000 description 1
- 238000005096 rolling process Methods 0.000 description 1
- 230000003248 secreting effect Effects 0.000 description 1
- 230000028327 secretion Effects 0.000 description 1
- 238000013207 serial dilution Methods 0.000 description 1
- 230000019491 signal transduction Effects 0.000 description 1
- 150000003384 small molecules Chemical class 0.000 description 1
- 239000011780 sodium chloride Substances 0.000 description 1
- 102000009076 src-Family Kinases Human genes 0.000 description 1
- 108010087686 src-Family Kinases Proteins 0.000 description 1
- 238000010561 standard procedure Methods 0.000 description 1
- 238000010189 synthetic method Methods 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 229960002180 tetracycline Drugs 0.000 description 1
- 229930101283 tetracycline Natural products 0.000 description 1
- 235000019364 tetracycline Nutrition 0.000 description 1
- 150000003522 tetracyclines Chemical class 0.000 description 1
- ZRKFYGHZFMAOKI-QMGMOQQFSA-N tgfbeta Chemical compound C([C@H](NC(=O)[C@H](C(C)C)NC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](CC(C)C)NC(=O)CNC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](NC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CCSC)C(C)C)[C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](C)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N1[C@@H](CCC1)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O)C1=CC=C(O)C=C1 ZRKFYGHZFMAOKI-QMGMOQQFSA-N 0.000 description 1
- 229940124597 therapeutic agent Drugs 0.000 description 1
- 229960004072 thrombin Drugs 0.000 description 1
- 230000009424 thromboembolic effect Effects 0.000 description 1
- 210000001685 thyroid gland Anatomy 0.000 description 1
- 229940034208 thyroxine Drugs 0.000 description 1
- XUIIKFGFIJCVMT-UHFFFAOYSA-N thyroxine-binding globulin Natural products IC1=CC(CC([NH3+])C([O-])=O)=CC(I)=C1OC1=CC(I)=C(O)C(I)=C1 XUIIKFGFIJCVMT-UHFFFAOYSA-N 0.000 description 1
- 230000036964 tight binding Effects 0.000 description 1
- 239000003053 toxin Substances 0.000 description 1
- 231100000765 toxin Toxicity 0.000 description 1
- 108700012359 toxins Proteins 0.000 description 1
- 102000003390 tumor necrosis factor Human genes 0.000 description 1
- 208000001072 type 2 diabetes mellitus Diseases 0.000 description 1
- 241000701447 unidentified baculovirus Species 0.000 description 1
- 238000010200 validation analysis Methods 0.000 description 1
- 238000005406 washing Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/48—Biological material, e.g. blood, urine; Haemocytometers
- G01N33/50—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
- G01N33/68—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
- G01N33/6803—General methods of protein analysis not limited to specific proteins or families of proteins
- G01N33/6845—Methods of identifying protein-protein interactions in protein mixtures
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K1/00—General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/195—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
- C07K14/315—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Streptococcus (G), e.g. Enterococci
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B30/00—Methods of screening libraries
- C40B30/04—Methods of screening libraries by measuring the ability to specifically bind a target molecule, e.g. antibody-antigen binding, receptor-ligand binding
Definitions
- Libraries of polypeptides can be prepared, e.g., by manipulating the immune system or via chemical synthesis, from which specificity of binding to target molecules can be selected. Molecular diversity from which specificity can be selected is large for polypeptides having numerous possible sequence combinations of amino acids.
- proteins can form large binding surfaces with multiple contacts to a target molecule, which lead to highly specific and high affinity binding events.
- antibodies are a class of protein that has yielded highly specific and tight binding ligands for various target antigens.
- GB1 peptidic libraries and methods of screening the same for specific binding to a target protein are provided.
- Libraries of polynucleotides that encode GB1 peptidic compounds are provided. These libraries find use in a variety of applications in which specific binding to target molecules, e.g., target proteins is desired. Also provided are methods of screening the libraries for binding to a target.
- FIG. 1 depicts a ribbon structure of a GB1 protein that illustrates a 4β-1α motif (Mayo et al., Nature Structural Biology, 5(6), 1998, p. 470-475).
- FIGS. 2A and 2B depict six different libraries that include a GB1 scaffold, both in a ribbon representation (top) and a space filling representation (bottom). Amino acids at several positions of the GB1 scaffold that are selected for mutation are highlighted in darker shade (top).
- the space filling representations of Library 1 to Library 6 (bottom) illustrate six different potential binding surfaces (shown in darker shade) on the GB1 scaffold.
- FIG. 3 illustrates the underlying sequence of the GB1 scaffold domain (SEQ ID NO: 1) of FIGS. 2A-2B and the positions of the variant amino acids (shown in the grey blocks) in Libraries 1 to 6.
- the asterisks indicate positions at which mutations may include insertion of amino acids.
- FIG. 4A depicts the phage display of a GB1 peptidic compound fusion of coat protein p3 that includes a hinge and dimerization format.
- FIG. 4B illustrates display levels of various formats of the GB1 peptidic compound fusion on the phage particles.
- FIG. 5 illustrates the design of phage display Library 1 (SEQ ID NOs: 225, 226 and 197-199).
- FIG. 6 illustrates the design of phage display Library 2 (SEQ ID NOs: 225, 226 and 200-202).
- FIG. 7 illustrates the design of phage display Library 3 (SEQ ID NOs: 225, 226 and 203-209).
- FIG. 8 illustrates the design of phage display Library 4 (SEQ ID NOs: 225, 226 and 210-216).
- FIG. 9 illustrates the design of phage display Library 5 (SEQ ID NOs: 225, 226 and 217-220).
- FIG. 10 illustrates the design of phage display Library 6 (SEQ ID NOs: 225, 226 and 221-224).
- FIGS. 5 to 10 illustrate the design of phage display libraries based on Libraries 1 to 6 illustrated in FIGS. 2A-2B .
- Ribbon (left) and space filling (right) structural representations depict the variant amino acid positions in dark.
- Oligonucleotide and amino acid sequences (SEQ ID NOs: 225 and 226) show the GB1 scaffold in the context of the fusion protein with GGS linkers at the N- and C-termini of the scaffold. Also shown are the oligonucleotide sequences synthesized for use in preparation of the libraries by Kunkel mutagenesis that include KHT codons at variant amino acid positions to encode variable regions of GB1 peptidic compounds.
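The effect of placing a KHT codon at a variant position can be sketched as follows. This is a minimal illustration, assuming the standard IUPAC degenerate base codes (K = G or T, H = A, C or T) and the standard genetic code; the function names are hypothetical and are not taken from the application. The diversity estimate counts substitutions only and ignores any insertions.

```python
from itertools import product

# IUPAC degenerate base codes needed for the KHT codon (standard definitions).
IUPAC = {"K": "GT", "H": "ACT", "T": "T"}

# Standard genetic code, restricted to the six codons reachable from KHT.
CODON_TO_AA = {
    "GAT": "D", "GCT": "A", "GTT": "V",
    "TAT": "Y", "TCT": "S", "TTT": "F",
}


def expand_degenerate_codon(codon):
    """Return every concrete codon encoded by a degenerate codon (hypothetical helper)."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in codon))]


def encoded_amino_acids(codon):
    """Return the sorted one-letter codes of the amino acids a degenerate codon encodes."""
    return sorted({CODON_TO_AA[c] for c in expand_degenerate_codon(codon)})


def theoretical_diversity(n_positions, codon="KHT"):
    """Number of distinct sequences when n_positions are each randomized with the codon."""
    return len(encoded_amino_acids(codon)) ** n_positions


print(expand_degenerate_codon("KHT"))  # six concrete codons
print(encoded_amino_acids("KHT"))      # Ala, Asp, Phe, Ser, Val, Tyr as one-letter codes
```

Under these assumptions each KHT position samples six amino acids, so the theoretical diversity grows as 6 raised to the number of randomized positions.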
- FIG. 11 illustrates binding results from four rounds of phage display screening of Libraries 1 to 6 against L-VEGF and D-VEGF.
- FIG. 12 illustrates binding assay results of individual clones identified from phage display screening of subject libraries against VEGF proteins. 10 nM or 100 nM VEGF protein was added to binding solutions in a competition binding assay.
- FIG. 13 illustrates exemplary bifunctional libraries having two potential binding surfaces.
- peptidic refers to a moiety that is composed of amino acid residues.
- peptidic includes compounds or libraries in which the conventional backbone has been replaced with non-naturally occurring or synthetic backbones, and peptides in which one or more naturally occurring amino acids have been replaced with one or more non-naturally occurring or synthetic amino acids, or a D-amino acid. Any of the depictions of sequences found herein (e.g., using one-letter or three-letter codes) may represent an L-amino acid or a D-amino acid version of the sequence. Unless noted otherwise, capital and small letter codes are not used to distinguish L- and D-amino acid residues.
- polypeptide and “protein” are used interchangeably.
- polypeptide also includes post-translationally modified polypeptides or proteins.
- polypeptide includes polypeptides in which the conventional backbone has been replaced with non-naturally occurring or synthetic backbones, and peptides in which one or more of the conventional amino acids have been replaced with one or more non-naturally occurring or synthetic amino acids.
- polypeptides may be of any length, e.g., 2 or more amino acids, 4 or more amino acids, 10 or more amino acids, 20 or more amino acids, 30 or more amino acids, 40 or more amino acids, 50 or more amino acids, 60 or more amino acids, 100 or more amino acids, 300 or more amino acids, 500 or more or 1000 or more amino acids.
- the term “scaffold” or “scaffold domain” refers to a peptidic framework from which a library of compounds arose, and against which the compounds are able to be compared.
- the amino acids at those positions are referred to as “variant amino acids.”
- variant amino acids may confer on the resulting peptidic compounds different functions, such as specific binding to a target protein.
- mutation is a deletion, insertion, or substitution of an amino acid(s) residue or nucleotide(s) residue relative to a reference sequence or motif, such as a scaffold sequence or motif.
- GB1 scaffold domain and “GB1 scaffold” refer to a scaffold that has a structural motif similar to the B1 domain of Protein G (GB1), where the structural motif is characterized by a motif including a four stranded β-sheet packed against a helix (also referred to as a 4β-1α motif). The arrangement of four β-strands and one α-helix may form a hairpin-helix-hairpin motif.
- An exemplary GB1 scaffold domain is depicted in FIG. 1 .
- GB1 scaffold domains include members of the family of IgG binding B domains, e.g., Protein L B1 domain.
- Exemplary GB1 scaffold domain sequences include those described by SEQ ID NOs: 227-261.
- a GB1 scaffold domain may be a native sequence of a member of the B domain protein family, a B domain sequence with pre-existing amino acid sequence modifications (such as additions, deletions and/or substitutions), or a fragment or analogue thereof.
- a GB1 scaffold domain may be L-peptidic, D-peptidic or a combination thereof. In some cases, a “GB1 scaffold domain” may also be referred to as a “parent amino acid sequence.”
- GB1 peptidic compound refers to a compound composed of peptidic residues that has a parent GB1 scaffold domain.
- parent amino acid sequence and “parent polypeptide” refer to a polypeptide comprising an amino acid sequence from which a variant GB1 peptidic compound arose and against which the variant GB1 peptidic compound is being compared. In some cases, the parent polypeptide lacks one or more of the modifications disclosed herein and differs in function compared to a variant GB1 peptidic compound as disclosed herein.
- the parent polypeptide may comprise a native GB1 sequence or GB1 scaffold sequence with pre-existing amino acid sequence modifications (such as additions, deletions and/or substitutions).
- variable region refers to a continuous sequence of residues that includes one or more variant amino acids.
- a variable region may also include one or more conserved amino acids at fixed positions.
- fixed region refers to a continuous sequence of residues that does not include any mutations or variant amino acids, and is conserved across a library of compounds.
- variable domain refers to a domain that includes all of the variant amino acids of a GB1 scaffold.
- the variable domain may include one or more variable regions, and may encompass a continuous or a discontinuous sequence of residues.
- the variable domain may be part of the scaffold domain.
- discontinuous sequence of residues refers to a sequence of residues that is not continuous with respect to the primary sequence of a peptidic compound.
- a peptidic compound may fold to form a secondary or tertiary structure, e.g., a 4β-1α motif, where the amino acids of a discontinuous sequence of residues are adjacent to each other in space, i.e., contiguous.
- continuous sequence of residues refers to a sequence of residues that is continuous in terms of the primary sequence of a peptidic compound.
- non-core mutation refers to an amino acid mutation of a GB1 peptidic compound that is located at a position in the 4β-1α structure that is not part of the hydrophobic core of the structure. Amino acid residues in the hydrophobic core of a GB1 peptidic compound are not significantly solvent exposed but rather tend to form intramolecular hydrophobic contacts. Unless explicitly defined otherwise, a hydrophobic core residue or core position, as described herein, of a GB1 scaffold domain that is described by SEQ ID NO: 1 is defined by one of positions 2, 4, 6, 19, 25, 29, 33, 38, 42, 51 and 53 of the GB1 scaffold.
- surface mutation refers to an amino acid mutation in a GB1 scaffold that is located at a position in the 4β-1α structure that is solvent exposed. Such variant amino acid residues at surface positions of a GB1 peptidic compound are capable of interacting directly with a target molecule, whether or not such an interaction occurs.
- boundary mutation refers to an amino acid mutation of a GB1 scaffold that is located at a position in the 4β-1α structure that is at the boundary between the hydrophobic core and the solvent exposed surface.
- variant amino acid residues at boundary positions of a GB1 peptidic compound may be in part contacting hydrophobic core residues and/or in part solvent exposed and capable of some interaction with a target molecule, whether or not such an interaction occurs.
- One set of criteria for describing core, surface and boundary residues of a GB1 peptidic structure is described by Mayo et al. Nature Structural Biology, 5(6), 1998, 470-475. Such methods and criteria can be modified for use with the GB1 scaffold domain.
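The core position definition given for SEQ ID NO: 1 can be applied programmatically to check whether a proposed set of variant positions touches the hydrophobic core. The sketch below is illustrative only: the core positions are those listed in the definition, while the function names and the example position list are hypothetical. It does not distinguish surface from boundary positions, because no position list for those classes is given here.

```python
# Core positions of the GB1 scaffold of SEQ ID NO: 1, as listed in the definition.
CORE_POSITIONS = frozenset({2, 4, 6, 19, 25, 29, 33, 38, 42, 51, 53})


def is_core(position):
    """Return True if a 1-based scaffold position is a hydrophobic core position."""
    return position in CORE_POSITIONS


def split_core_and_non_core(positions):
    """Split candidate variant positions into (core, non_core) sorted lists."""
    unique = set(positions)
    core = sorted(p for p in unique if is_core(p))
    non_core = sorted(p for p in unique if not is_core(p))
    return core, non_core


# Hypothetical candidate positions, chosen only to exercise the helper.
core, non_core = split_core_and_non_core([21, 24, 25, 28, 29])
print(core)      # positions that would be core mutations
print(non_core)  # positions that would be non-core mutations
```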
- linking sequence refers to a continuous sequence of amino acid residues, or analogs thereof, that connect two peptidic motifs.
- a linking sequence is the loop connecting two β-strands in a β-hairpin motif.
- phage display refers to a technique by which variant peptidic compounds are displayed as fusion proteins to a coat protein on the surface of phage, e.g. filamentous phage particles.
- phagemid refers to a plasmid vector having a bacterial origin of replication, e.g., ColE1, and a copy of an intergenic region of a bacteriophage.
- the phagemid may be based on any known bacteriophage, including filamentous bacteriophage.
- the plasmid will also contain a selectable marker for antibiotic resistance. Segments of DNA cloned into these vectors can be propagated as plasmids.
- the mode of replication of the plasmid changes to rolling circle replication to generate copies of one strand of the plasmid DNA and package phage particles.
- the phagemid may form infectious or non-infectious phage particles. This term includes phagemids which contain a phage coat protein gene or fragment thereof linked to a heterologous polypeptide gene as a gene fusion such that the heterologous polypeptide is displayed on the surface of the phage particle.
- phage vector refers to a double stranded replicative form of a bacteriophage that contains a heterologous gene and is capable of replication.
- the phage vector has a phage origin of replication allowing phage replication and phage particle formation.
- the phage is a filamentous bacteriophage, such as an M13, f1, fd, Pf3 phage or a derivative thereof, a lambdoid phage, such as lambda, 21, phi80, phi81, 82, 424, 434, etc., or a derivative thereof, a Baculovirus or a derivative thereof, a T4 phage or a derivative thereof, a T7 phage virus or a derivative thereof.
- the term “stable” refers to a compound that is able to maintain a folded state under physiological conditions at a certain temperature, such that it retains at least one of its normal functional activities, for example binding to a target protein.
- the stability of the compound can be determined using standard methods.
- the “thermostability” of a compound can be determined by measuring the thermal melt (“Tm”) temperature.
- Tm is the temperature in degrees Celsius at which half of the compounds become unfolded. In some instances, the higher the Tm, the more stable the compound.
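Given the definition of Tm as the temperature at which half of the compounds are unfolded, a Tm can be estimated from a thermal melt curve by finding where the fraction unfolded crosses 0.5. The following is a minimal sketch, assuming the curve has already been converted to fraction unfolded and that linear interpolation between neighboring points is adequate; the function name and the data are hypothetical.

```python
def estimate_tm(temperatures, fraction_unfolded):
    """Interpolate the temperature (degrees Celsius) where fraction unfolded reaches 0.5.

    Returns None if the curve never crosses 0.5 on the way up.
    """
    points = list(zip(temperatures, fraction_unfolded))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 <= 0.5 <= f1:
            if f1 == f0:
                # Flat segment sitting exactly at 0.5: report its start.
                return t0
            # Linear interpolation between the two bracketing points.
            return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)
    return None


# Hypothetical melt curve, for illustration only.
temps = [60.0, 70.0, 80.0, 90.0]
unfolded = [0.0, 0.25, 0.75, 1.0]
print(estimate_tm(temps, unfolded))  # 75.0
```

Comparing the estimated Tm of a variant with that of its parent scaffold is one simple way to rank relative thermostability under this definition.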
- the compounds of the subject libraries may contain one or more asymmetric centers and may thus give rise to enantiomers, diastereomers, and other stereoisomeric forms that may be defined, in terms of absolute stereochemistry, as (R)- or (S)- or, as (D)- or (L)- for amino acids and polypeptides.
- the present invention is meant to include all such possible isomers, as well as their racemic and optically pure forms.
- the compounds described herein contain olefinic double bonds or other centers of geometric asymmetry, and unless specified otherwise, it is intended that the compounds include both E and Z geometric isomers. Likewise, all tautomeric forms are also intended to be included.
- a target protein refers to all members of the target family, and fragments and enantiomers thereof, and protein mimics thereof.
- the target proteins of interest that are described herein are intended to include all members of the target family, and fragments and enantiomers thereof, and protein mimics thereof, unless explicitly described otherwise.
- the target protein may be any protein of interest, such as a therapeutic or diagnostic target, including but not limited to: hormones, growth factors, receptors, enzymes, cytokines, osteoinductive factors, colony stimulating factors and immunoglobulins.
- target protein is intended to include recombinant and synthetic molecules, which can be prepared using any convenient recombinant expression methods or using any convenient synthetic methods, or purchased commercially, as well as fusion proteins containing a target molecule, as well as synthetic L- or D-proteins.
- the term “protein mimic” refers to a peptidic compound that mimics a binding property of a protein of interest, e.g., a target protein.
- the target protein mimic includes an essential part of the original target protein (e.g., an epitope or essential residues thereof) that is necessary for forming a potential binding surface, such that the target protein mimic and the original target protein are each capable of binding specifically to a binding moiety of interest, e.g., an antibody or a D-peptidic compound.
- the part(s) of the original target protein that is essential for binding is displayed on a scaffold such that the potential binding surface of the original target protein is mimicked.
- a target protein mimic includes residues or fragments of the original target protein that are incorporated into a protein scaffold, where the scaffold mimics a structural motif of the target protein.
- the protein mimic may present a potential binding surface that mimics that of the original target protein.
- the native structure of the fragments of the original target protein is retained using methods of conformational constraint. Any convenient methods of conformationally constraining a peptidic compound may be used, such as but not limited to, bioconjugation, dimerization (e.g., via a linker), multimerization, or cyclization.
- the subject libraries include a plurality of GB1 peptidic compounds, where each GB1 peptidic compound has a scaffold domain of the same structural motif as the B1 domain of Protein G (GB1), where the structural motif of GB1 is characterized by a motif that includes an arrangement of four β-strands and one α-helix around a hydrophobic core (also referred to as a 4β-1α motif).
- the GB1 peptidic compounds of the subject libraries include mutations at non-core positions, e.g., variant amino acids at positions within a GB1 scaffold domain that are not part of the hydrophobic core of the structure.
- a 4β-1α motif is depicted in FIG. 1.
- a variety of libraries of GB1 peptidic compounds are provided.
- both the positions of the mutations and the nature of the mutation at each variable position of the scaffold may be varied.
- the mutations are included at non-core positions, although mutations at core positions may also be included.
- the mutations may confer different functions on the resulting GB1 peptidic compounds, such as specific binding to a target molecule.
- the mutations may be selected at positions of a GB1 scaffold domain that are solvent exposed such that the variant amino acids at these positions can form part of a potential target molecule binding surface, although mutations at selected core and/or boundary positions may also be included.
- the mutations may be concentrated in a variable domain that defines one of several distinct potential binding surfaces of the GB1 scaffold domain.
- Libraries of GB1 peptidic compounds are provided that include distinct arrangements of mutations concentrated at various surfaces of the 4β-1α motif, for example, as depicted in FIGS. 2A-2B.
- the subject libraries may include compounds that specifically bind to a target molecule via one of the several potential binding sites of the GB1 scaffold domain. Mutations may be included at the potential binding surface to provide for specific binding to a target molecule without significantly disrupting the GB1 peptidic structure.
- a GB1 peptidic library is contacted with a target molecule to screen for a compound of the library that specifically binds to the target with high affinity.
- the subject methods and libraries find use in a variety of applications, including screening applications.
- aspects of the invention include libraries of GB1 peptidic compounds where each GB1 peptidic compound has a scaffold domain of the same structural motif as the B1 domain of Protein G (GB1), where the structural motif of GB1 is characterized by a motif that includes an arrangement of four β-strands and one α-helix (also referred to as a 4β-1α motif) around a hydrophobic core.
- the GB1 peptidic compounds of the subject libraries include mutations at various non-core positions of the 4β-1α motif, e.g., variant amino acids at non-core positions within a GB1 scaffold domain.
- the four β-strands and one α-helix motifs of the structure are arranged in a hairpin-helix-hairpin motif, e.g., β1-β2-α1-β3-β4 where β1-β4 are β-strand motifs and α1 is a helix motif.
- a GB1 peptidic hairpin-helix-hairpin motif is depicted in FIG. 1 .
- a GB1 scaffold domain may be any polypeptide, or fragment thereof that includes the 4β-1α motif, whether naturally occurring or synthetic.
- the GB1 scaffold domain may be a native sequence of a member of the IgG binding B domain protein family, an IgG binding B domain sequence with pre-existing amino acid sequence modifications (such as additions, deletions and/or substitutions), or a fragment or analogue thereof.
- GB1 scaffold domains include those described in the following references Gronenborn et al., FEBS Letters 398 (1996), 312-316; Kotz et al., Eur. J. Biochem. 271, 1623-1629 (2004); Malakaukas et al., Nature Structural Biology, 5(6), 1998, p.
- a GB1 scaffold domain has an amino acid sequence as set forth in one of SEQ ID NOs: 1 and 227-261.
- a GB1 scaffold domain includes a sequence having 60% or more amino acid sequence identity, such as 70% or more, 80% or more, 90% or more, 95% or more or 98% or more amino acid sequence identity to an amino acid sequence set forth in one of SEQ ID NO: 1 and 227-261.
- a GB1 scaffold domain sequence may include 1 or more, such as 2 or more, 3 or more, 4 or more, 5 or more, 10 or more, 15 or more, or even 20 or more additional peptidic residues compared to a native IgG binding B domain sequence.
- a GB1 scaffold domain sequence may include fewer peptidic residues compared to a native IgG binding B domain sequence, such as 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10, or even fewer residues.
- B4U242_STREM/244-298 (SEQ ID NO: 227) ...S.YKLVIKGATFSGETATKAVDAAVAEQ.TFRDYANKNGVDGVWAYDAATKTFTVTE...
- B4U242_STREM/316-370 (SEQ ID NO: 228) ....TYRLVIKGVTFSGETATKAVDAATAEQ.TFRQYANDNGITGEWAYDTATKTFTVTE...
- C0MA37_STRE4/228-282 (SEQ ID NO: 229) ...S.YKLVIKGATFSGETATKAVDAAVAEQ.TFRDYANKNGVDGVWAYDAATKTFTVTE...
- C0MA37_STRE4/300-354 (SEQ ID NO: 230) ....TYRLVIKGVTFSGETATKAVDAATAEQ.TFRQYANDNGVTGEWAYDAATKTFTVTE...
- C0MCK9_STRS7/228-282 (SEQ ID NO: 231) ...S.YKLVIKGATFSGETATKAVDAAVAEQ.TFRDYANKNGVDGVWAYDAATKTFTVTE...
- C0MCK9_STRS7/300-354 (SEQ ID NO: 232) ....TYRLVIKGVTFSGETSTKAVDAATAEQ.TFRQYANDNGVTGEWAYDAATKTFTVTE...
- SPG1_STRSG/228-282 (SEQ ID NO: 257) ....TYKLILNGKTLKGETTTEAVDAATAEK.VFKQYANDNGVDGEWTYDDATKTFTVTE...
- SPG1_STRSG/298-352 (SEQ ID NO: 258) ....TYKLVINGKTLKGETTTKAVDAETAEK.AFKQYANDNGVDGVWTYDDATKTFTVTE...
- SPG2_STRSG/303-357 (SEQ ID NO: 259) ....TYKLILNGKTLKGETTTEAVDAATAEK.VFKQYANDNGVDGEWTYDDATKTFTVTE...
- SPG2_STRSG/373-427 (SEQ ID NO: 260) ....TYKLVINGKTLKGETTTEAVDAATAEK.VFKQYANDNGVDGEWTYDDATKTFTVTE...
- SPG2_STRSG/443-497 (SEQ ID NO: 261) ....TYKLVINGKTLKGETTTKAVDAETAEK.AFKQYANDNGVDGVWTYDDATKTFTVTE...
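The sequence identity thresholds stated above (60% or more, up to 98% or more) can be illustrated on two of the aligned family members listed above. The following is a minimal sketch, not the method used in the patent: it assumes the sequences are already aligned to equal length with `.` as the gap character, that the flanking gap columns have been trimmed, and that identity is counted only over columns where both sequences carry a residue. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aln_a, aln_b, gap="."):
    """Return (matches, compared_columns, percent) for two pre-aligned sequences."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aln_a, aln_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no residue columns to compare")
    return matches, compared, 100.0 * matches / compared


# Aligned rows copied from the listing, flanking gap columns removed.
SPG1_228_282 = "TYKLILNGKTLKGETTTEAVDAATAEK.VFKQYANDNGVDGEWTYDDATKTFTVTE"  # SEQ ID NO: 257
SPG1_298_352 = "TYKLVINGKTLKGETTTKAVDAETAEK.AFKQYANDNGVDGVWTYDDATKTFTVTE"  # SEQ ID NO: 258

matches, compared, pct = percent_identity(SPG1_228_282, SPG1_298_352)
print(f"{matches}/{compared} identical residues ({pct:.1f}%)")
```

Other identity definitions (for example, dividing by the length of the shorter sequence or by the full alignment length) give slightly different percentages, so the denominator should be fixed before a threshold is applied.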
- the GB1 scaffold domain is described by the following sequence: (T/S)Y(K/R)L(Z1)(Z1)(N/K)G(K/N/V/A)T(L/F)(K/S)GET(T/A/S)T(K/E)(A/T)(V/I)D(A/T/V) (A/E)(T/V)AE(K/Q)(A/E/T/V)F(K/R)(Q/D)YA(N/T)(A/D/E/K)N(G/N)(Z3)(D/T)G(E/V)W(A/T/S)YD(D/A/Y/T)ATKT(Z1)T(Z1)TE (SEQ ID NO:262) where each Z1 is independently a hydrophobic residue.
- the GB1 scaffold domain is described by the following sequence: (T/S)Y(K/R)L(I/V)(L/I/V)(N/K)G(K/N/V/A)T(L/F)(K/S)GET(T/A/S)T(K/E)(A/T)(V/I)D(A/T/V)(A/E)(T/V)AE(K/Q)(A/E/T/V)F(K/R)(Q/D)YA(N/T)(A/D/E/K)N(G/N)(V/I)(D/T)G(E/V) W(A/T/S)YD(D/A/Y/T)ATKTFTVTE (SEQ ID NO:263).
- GB1 scaffold domain is described by the following sequence: TYKL(I/V)(L/I/V)(N/K)G(K/N)T(L/F)(K/S)GET(T/A)T(K/E)AVD(A/T/V)(A/E)TAE(K/Q)(A/E/T/V)F(K/R)QYA(N/T)(A/D/E/K)N(G/N)VDG(E/V)W(A/T/S)YD(D/A)ATKTFTVTE (SEQ ID NO:264).
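The degenerate sequence notation used above, in which `(A/B/C)` lists the residues allowed at one position, maps directly onto a regular-expression character class. The sketch below is an illustration under that assumption only: it converts SEQ ID NO: 263 into a pattern and tests whether SEQ ID NO: 1 conforms to it. The helper names are hypothetical, and the conversion does not handle the `Z1`, `Z2` and `Z3` placeholders used in some of the other scaffold formulas.

```python
import re

SEQ_ID_NO_263 = (
    "(T/S)Y(K/R)L(I/V)(L/I/V)(N/K)G(K/N/V/A)T(L/F)(K/S)GET(T/A/S)T(K/E)(A/T)(V/I)D"
    "(A/T/V)(A/E)(T/V)AE(K/Q)(A/E/T/V)F(K/R)(Q/D)YA(N/T)(A/D/E/K)N(G/N)(V/I)(D/T)G(E/V)"
    "W(A/T/S)YD(D/A/Y/T)ATKTFTVTE"
)
SEQ_ID_NO_1 = "TYKLILNGKTLKGETTTEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTE"


def consensus_to_regex(consensus):
    """Turn '(A/B)C(D/E/F)' into the compiled pattern '[AB]C[DEF]'."""
    compact = "".join(consensus.split())  # drop stray whitespace from the listing
    body = re.sub(
        r"\(([A-Z](?:/[A-Z])*)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        compact,
    )
    return re.compile(body)


def matches_consensus(sequence, consensus):
    """True if the whole sequence conforms to the degenerate formula."""
    return consensus_to_regex(consensus).fullmatch(sequence) is not None


print(matches_consensus(SEQ_ID_NO_1, SEQ_ID_NO_263))
```

A sequence that differs at an invariant position of the formula (for example, the tryptophan at position 42) fails the match, which makes the check a quick filter for whether a candidate scaffold falls inside the described family.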
- a mutation in a scaffold domain may include a deletion, insertion, or substitution of an amino acid residue at any convenient position to produce a sequence that is distinct from the reference scaffold domain sequence.
- the GB1 scaffold domain is described by the following sequence: T(Z2)K(Z1)(Z1)(Z1)(N/V)(G/L/I)(K/G)(Q/T/D)(L/A/R)(K/V)(G/E/V)(E/V)(A/T/R/I/P/V)(T/I) (R/W/L/K/V/T/I)E(A/L/I)VDA(A/G)(T/E)(A/V/F)EK(V/I/Y)(F/L/W/I/A)K(L/Q)(Z1)(Z3)N(A/D)(K/N)(T/G)(V/I)(E/D)G(V/E)(W/F)TY(D/K)D(E/A)(T/I)KT(Z1)T(Z1)TE (SEQ ID NO:265), where each Z1 is independently a hydrophobic
- the GB1 scaffold domain is described by the following sequence:
- the subject libraries are designed to maximize diversity while minimizing structural perturbations of the GB1 scaffold domain.
- the positions to be mutated are selected to ensure that the GB1 peptidic compounds of the subject libraries can maintain a folded state under physiological conditions.
- Another aspect of generating diversity in the subject libraries is the selection of amino acid positions to be mutated such that the amino acids can form a potential binding surface in the GB1 scaffold domain, whether or not the residues actually contact a target protein.
- One way of determining whether an amino acid position is part of a potential binding surface involves examining the three dimensional structure of the GB1 scaffold domain, using a computer program such as the UCSF Chimera program. Other ways include crystallographic and genetic mutational analysis. Any convenient method may be used to determine whether an amino acid position is part of a potential binding surface.
- solvent exposed positions can be determined using software suitable for protein modeling and three-dimensional structural information obtained from a crystal structure.
- solvent exposed residues may be determined using the Protein Data Bank (PDB) structure 3GB1 and estimating the solvent accessible surface area (SASA) for each residue using the GETarea tool (Fraczkiewicz & Braun, “Exact and efficient analytical calculation of the accessible surface areas and their gradients for macromolecules,” J. Comput. Chem. 1998, 19, 319-333).
- This tool calculates the ratio of the SASA of a residue in the structure to its SASA in a random coil.
- a ratio of 0.4 was used in selecting the following solvent accessible residues (shown in bold): TYKLILNGKTLKGETTTEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTE (SEQ ID NO:1).
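The selection step described above reduces to thresholding a per-residue ratio. The sketch below shows one way to express that, assuming the ratios are supplied as fractions between 0 and 1 and that a residue counts as solvent exposed when its ratio is strictly greater than the cutoff; the patent states only that a ratio of 0.4 was used. The function name is hypothetical, and the numbers in `toy_ratios` are placeholders for illustration, not GETarea output for 3GB1.

```python
def solvent_exposed_positions(relative_sasa, threshold=0.4):
    """Return the 1-based positions whose SASA ratio exceeds the threshold."""
    return [pos for pos, ratio in sorted(relative_sasa.items()) if ratio > threshold]


# Placeholder values only; replace with ratios computed from a real structure.
toy_ratios = {1: 0.55, 2: 0.31, 3: 0.12, 4: 0.62, 5: 0.05, 6: 0.40}

exposed = solvent_exposed_positions(toy_ratios)
print(exposed)
```

Whether a residue sitting exactly at the cutoff is included is a design choice; switching `>` to `>=` changes the treatment of such boundary residues.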
- the mutations of the GB1 scaffold domain may be concentrated at one of several different potential binding surfaces of the scaffold domain.
- Several distinct arrangements of mutations of the GB1 scaffold domain at non-core positions of the hairpin-helix-hairpin scaffold domain are provided.
- the majority of the mutations are at non-core positions of the GB1 scaffold domain (e.g., solvent exposed or boundary positions) however in some cases one or more mutations may be located at hydrophobic core positions.
- mutations at hydrophobic core positions may be tolerated without significantly disrupting the GB1 scaffold structure, such as when those core mutations are selected in a loop region. In such cases the loop region may form a structure or conformation that is different to that of the parent scaffold.
- the GB1 scaffold may have loop regions that are independently selected from any one of the loop sequences set forth in Table 1: (SEQ ID NOs: 67-196 and 267-272). Any of the loop sequences 1-4 of Table 1 may be incorporated at the positions indicated in Table 1 into any convenient GB1 scaffold domain (e.g., SEQ ID NO: 1) to produce another GB1 scaffold domain.
- mutations at boundary positions may also be tolerated without significantly disrupting the GB1 scaffold structure. Mutations at such positions may confer desirable properties upon the resulting GB1 compound variants, such as stability, a certain structural property, or specific binding to a target molecule.
- FIG. 3 illustrates the alignment of the position numbering scheme for a GB1 scaffold domain relative to its β1, β2, α1, β3 and β4 motifs, and relative to the mutations of certain libraries of the invention.
- Positions marked with an asterisk indicate exemplary positions at which mutations that include the insertion of one or more amino acids may be included.
- Any GB1 scaffold domain sequence may be substituted for the scaffold sequence depicted in FIG. 3 , and the positions of the mutations that define a subject library may be transferred from one scaffold to another by any convenient method.
- a sequence alignment method may be used to place any GB1 scaffold domain sequence within the framework of the position numbering scheme illustrated in FIG. 3 .
- Alignment methods based on structural motifs such as β-strands and α-helices may also be used to place a GB1 scaffold domain sequence within the framework of the position numbering scheme illustrated in FIG. 3.
- a first GB1 scaffold domain sequence may be aligned with a second GB1 scaffold domain sequence that is one or more amino acids longer or shorter.
- the second GB1 scaffold domain may have one or more additional amino acids at the N-terminal or C-terminal relative to the first GB1 scaffold, or may have one or more additional amino acids in one of the loop regions of the structure.
- a numbering scheme such as is described below for insertion mutations may be used to relate two scaffold domain sequences.
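Transferring mutation positions from one scaffold to another, as described above, amounts to reading position numbers across the columns of an alignment. The sketch below is one simple way to do this, assuming the two scaffold sequences have already been aligned to equal length with `.` as the gap character; the alignment method itself is left open, as in the text. The function names and the toy sequences are hypothetical.

```python
def map_positions(ref_aln, tgt_aln, gap="."):
    """Map 1-based reference positions to target positions via a gapped alignment.

    A reference residue aligned to a gap in the target maps to None.
    """
    if len(ref_aln) != len(tgt_aln):
        raise ValueError("aligned sequences must have equal length")
    mapping = {}
    ref_pos = 0
    tgt_pos = 0
    for r, t in zip(ref_aln, tgt_aln):
        if r != gap:
            ref_pos += 1
        if t != gap:
            tgt_pos += 1
        if r != gap:
            mapping[ref_pos] = tgt_pos if t != gap else None
    return mapping


def transfer_positions(positions, mapping):
    """Carry library positions over to the target, dropping unmapped ones."""
    return [mapping[p] for p in positions if mapping.get(p) is not None]


# Toy alignment: the target lacks reference residue 2 and has one extra residue.
toy_map = map_positions("AC.DE", "A.GDE")
print(toy_map)
print(transfer_positions([1, 2, 4], toy_map))
```

Target residues that fall in a reference gap have no reference number in this scheme; they would need a lettered label of the kind described for insertion mutations.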
- a subject library includes 50 or more distinct compounds, such as 100 or more, 300 or more, 1×10^3 or more, 1×10^4 or more, 1×10^5 or more, 1×10^6 or more, 1×10^7 or more, 1×10^8 or more, 1×10^9 or more, 1×10^10 or more, 1×10^11 or more, or 1×10^12 or more, distinct compounds.
- a subject library may include GB1 peptidic compounds each having a hairpin-helix-hairpin scaffold domain described by formula (I):
- P1 and P2 are independently beta-hairpin domains and α1 is a helix domain and P1, α1 and P2 are connected independently by linking sequences of between 1 and 10 residues in length.
- P1 is β1-β2 and P2 is β3-β4 such that the compounds are described by formula (II):
- each linking sequence is independently of 3, 4, 5, 6, 7 or 8 residues in length, such as 4 or 5 residues in length.
- the linking sequences may form a loop or a turn structure.
- the two antiparallel β-strands of a hairpin motif may be connected via a loop.
- Mutations in a linking sequence that include insertion or deletion of one or more amino acid residues may be tolerated without significantly disrupting the GB1 scaffold structure.
- each compound of the subject library includes mutations in one or more linking sequences. In certain embodiments, 80% or more, 90% or more, 95% or more, or even 100% of the mutations are at positions within the regions of the linking sequences.
- At least one of the linking sequences is one or more (e.g., such as 2 or more) residues longer in length than the corresponding linking sequence of the GB1 scaffold. In certain embodiments, in formulas (I) and (II), at least one of the linking sequences is one or more residues shorter in length than the corresponding linking sequence of the GB1 scaffold.
- one or more positions in the scaffold may be selected as positions at which to include insertion mutations, e.g., mutations that include the insertion of 1 or 2 additional amino acid residues in addition to the amino acid residue being substituted.
- the insertion mutations are selected for inclusion in one or more loop regions, or at the N-terminal or C-terminal of the scaffold.
- the positions of the variant amino acids that are inserted may be referred to using a letter designation with respect to the numbered position of the mutation, e.g., an insertion mutation of 2 amino acids at position 38 may be referred to as positions 38a and 38b.
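The lettered numbering for inserted residues described above can be generated mechanically. The sketch below assumes that inserted residues are numbered immediately after the position at which the insertion is made (38, 38a, 38b, 39, and so on); the text gives the labels but does not state where the inserted residues sit relative to the substituted residue, so that ordering is an assumption. The function name `position_labels` is hypothetical.

```python
from string import ascii_lowercase


def position_labels(scaffold_length, insertions):
    """Return position labels for a scaffold with insertions.

    insertions maps a 1-based scaffold position to the number of residues
    inserted at that position.
    """
    labels = []
    for pos in range(1, scaffold_length + 1):
        labels.append(str(pos))
        for k in range(insertions.get(pos, 0)):
            labels.append(f"{pos}{ascii_lowercase[k]}")
    return labels


# A 55-residue scaffold with a 2-residue insertion at position 38.
labels = position_labels(55, {38: 2})
print(labels[36:41])
```

Keeping the parent numbering and adding letters means that positions downstream of the insertion retain the same numbers as in the parent scaffold.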
- the subject library includes a mutation at position 38 that includes insertion of 0, 1 or 2 variant amino acids. In certain embodiments, the subject library includes a mutation at position 19 that includes insertion of 0, 1 or 2 variant amino acids. In certain embodiments, the subject library includes a mutation at position 1 that includes insertion of 2 variant amino acids, and at positions 19 and 47 that each include insertion of 0, 1 or 2 variant amino acids. In certain embodiments, the subject library includes mutations at positions 9 and 38 that each includes insertion of 0, 1 or 2 variant amino acids, and at position 55 that includes insertion of 1 variant amino acid. In certain embodiments, the subject library includes a mutation at position 9 that includes insertion of 0, 1 or 2 variant amino acids, and at position 55 that includes insertion of 1 variant amino acid. In certain embodiments, the subject library includes a mutation at position 1 that includes insertion of 1 variant amino acid and at position 47 that includes insertion of 0, 1 or 2 variant amino acids.
- the resulting GB1 compound variants may be aligned with the parent GB1 scaffold in different ways.
- an insertion mutation including 2 additional variant amino acids at position 38 of the GB1 scaffold may lead to GB1 compound variants where the loop regions between the α1 and β3 regions can be aligned with the GB1 scaffold domain in two or more distinct ways.
- the resulting GB1 compounds may encompass various distinct loop sequences and/or structures that align differently with the parent GB1 scaffold domain.
- the various distinct loop sequences are produced when the insertion mutation is in a variable loop region (e.g. where most of the loop region is being mutated).
- each compound of a subject library includes 4 or more, such as, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 11 or more, 12 or more, 13 or more, 14 or more, or 15 or more mutations at different positions of a hairpin-helix-hairpin scaffold domain.
- the mutations may involve the deletion, insertion, or substitution of the amino acid residue at the position of the scaffold being mutated.
- the mutations may include substitution with any naturally or non-naturally occurring amino acid, or an analog thereof.
- each compound of a subject library includes 3 or more different non-core mutations, such as, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 11 or more, or 12 or more different non-core mutations in a region outside of the β1-β2 region.
- each compound of a subject library includes 3 or more different non-core mutations, such as, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more or 11 or more different non-core mutations in the α1 region.
- each compound of a subject library includes 3 or more different non-core mutations, such as 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more or 10 or more different non-core mutations in the β3-β4 region.
- each compound of a subject library includes 5 or more different non-core mutations, such as 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 11 or more, or 12 or more different non-core mutations in the α1-β3 region.
- each compound of a subject library includes ten or more different mutations, where the ten or more different mutations are located at positions selected from the group consisting of positions 21-24, 26, 27, 30, 31, 34, 35, 37-41.
- each compound of a subject library includes ten or more different mutations, where the ten or more different mutations are located at positions selected from the group consisting of positions 18-24, 26-28, 30-32, 34 and 35.
- each compound of a subject library includes ten or more different mutations, where the ten or more different mutations are located at positions selected from the group consisting of positions 1, 18-24 and 45-49. In certain embodiments, each compound of a subject library includes ten or more different mutations, where the ten or more different mutations are located at positions selected from the group consisting of positions 7-12, 36-41, 54 and 55.
- each compound of a subject library includes ten or more different mutations, where the ten or more different mutations are located at positions selected from the group consisting of positions 3, 5, 7-14, 16, 52, 54 and 55.
- each compound of a subject library includes ten or more different mutations, where the ten or more different mutations are located at positions selected from the group consisting of positions 1, 3, 5, 7, 41, 43, 45-50, 52 and 54.
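The position lists above can be expanded and projected onto a scaffold sequence to see which residues a given library varies. The sketch below is an illustration only: it parses a list such as "21-24, 26, 27, 30, 31, 34, 35, 37-41", marks those positions in SEQ ID NO: 1 with `X`, and prints the result. It assumes the position numbers refer to the residues of SEQ ID NO: 1 counted from 1, and the helper names are hypothetical.

```python
import re

SEQ_ID_NO_1 = "TYKLILNGKTLKGETTTEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTE"


def parse_positions(spec):
    """Expand '21-24, 26 and 30' into [21, 22, 23, 24, 26, 30]."""
    positions = []
    for token in re.findall(r"\d+(?:-\d+)?", spec):
        if "-" in token:
            start, end = (int(x) for x in token.split("-"))
            positions.extend(range(start, end + 1))
        else:
            positions.append(int(token))
    return positions


def mask_sequence(sequence, positions, symbol="X"):
    """Replace the residues at the given 1-based positions with a symbol."""
    chosen = set(positions)
    return "".join(
        symbol if i in chosen else aa for i, aa in enumerate(sequence, start=1)
    )


variable = parse_positions("21-24, 26, 27, 30, 31, 34, 35, 37-41")
masked = mask_sequence(SEQ_ID_NO_1, variable)
print(len(variable), masked)
```

Positions 20 to 42 of the masked sequence read VXXXXAXXVFXXYAXXNXXXXXW, so under the stated numbering assumption this position list marks fifteen variable sites within that stretch of the scaffold.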
- each compound of a subject library includes five or more different mutations in the α1 region. In certain embodiments, five or more different mutations are located at positions selected from the group consisting of positions 22-24, 26, 27, 30, 31, 34 and 35.
- each compound of a subject library includes ten or more different mutations in the α1 region.
- the ten or more different mutations are located at positions selected from the group consisting of positions 22-24, 26, 27, 28, 30, 31, 32, 34 and 35.
- each compound of a subject library includes three or more different mutations in the β3-β4 region. In certain embodiments, the three or more different mutations are located at positions selected from the group consisting of positions 41, 54 and 55. In certain embodiments, the three or more different mutations are located at positions selected from the group consisting of positions 52, 54 and 55.
- each compound of a subject library includes five or more different mutations in the β3-β4 region. In certain embodiments, the five or more different mutations are located at positions selected from the group consisting of positions 45-49. In certain embodiments, each compound of a subject library includes nine or more different mutations in the β3-β4 region. In certain embodiments, the nine or more different mutations are located at positions selected from the group consisting of positions 41, 43, 45-50, 52 and 54.
- each compound of a subject library includes two or more different mutations in the region between the α1 and β3 regions, e.g., mutations in the linking sequence between α1 and β3.
- the two or more different mutations are located at positions selected from the group consisting of positions 37-40.
- each compound of a subject library includes three or more, four or more, five or more, six or more, or ten or more different mutations in the β1-β2 region.
- the ten or more different mutations in the β1-β2 region are located at positions selected from the group consisting of positions 3, 5, 7-14 and 16.
- each compound of a subject library is described by a formula independently selected from the group consisting of:
- V9-F11-V10 (VII).
- F1, F2, F3, F4, F5, F6, F7, F8, F9, F10, F11 and F12 are fixed regions and V1, V2, V3, V4, V5, V6, V7, V8, V9, V10, V11 and V12 are variable regions;
- each fixed region is common to all compounds of the same formula and each compound of the library has a distinct variable region.
- each compound of a subject library is described by formula (III), where:
- F1 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to the amino acid sequence TYKLILNGKTLKGETTTEA (SEQ ID NO: 2);
- F2 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence TYDDATKTFTVTE (SEQ ID NO: 3); and
- V1 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to the amino acid sequence VDAATAEKVFKQYANDNGVDGEW (SEQ ID NO: 4), where each compound of the library comprises 10 or more mutations (e.g., 11, 12, 13, 14 or 15 or more mutations) in the V1 variable region.
- V1 comprises a sequence of the following formula: VXXXXAXXVFXXYAXXNXXXXXW (SEQ ID NO: 5), where each X is a variant amino acid.
- F1 comprises the sequence TYKLILNGKTLKGETTTEA (SEQ ID NO: 2)
- F2 comprises the sequence TYDDATKTFTVTE (SEQ ID NO: 3)
- V1 comprises a sequence of the following formula: VXXXXAXXVFXXYAXXNXXXXXW (SEQ ID NO: 6) where each X is independently selected from the group consisting of A, D, F, S, V and Y.
- the mutation at position 19 of V1 includes insertion of 0, 1 or 2 variant amino acids.
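The size of the sequence space defined by a variable-region formula follows from the number of `X` positions and the number of residues allowed at each. The sketch below works this out for the V1 formula with the six-residue set A, D, F, S, V and Y, and also draws one random member. It is an illustration under stated assumptions: each `X` is treated as independent and uniformly sampled, and residues inserted at position 19 of V1 are assumed to be drawn from the same six residues. The function names are hypothetical.

```python
import random

V1_PARENT = "VDAATAEKVFKQYANDNGVDGEW"    # SEQ ID NO: 4
V1_TEMPLATE = "VXXXXAXXVFXXYAXXNXXXXXW"  # SEQ ID NO: 6
ALPHABET = "ADFSVY"


def theoretical_diversity(template, alphabet=ALPHABET, max_insert=0):
    """Count distinct sequences, allowing 0..max_insert extra variable residues."""
    n_variable = template.count("X")
    return sum(len(alphabet) ** (n_variable + extra) for extra in range(max_insert + 1))


def sample_member(template, rng, alphabet=ALPHABET):
    """Fill every X with a residue drawn uniformly from the alphabet."""
    return "".join(rng.choice(alphabet) if c == "X" else c for c in template)


def count_mutations(variant, parent):
    """Number of positions at which the variant differs from the parent."""
    return sum(1 for v, p in zip(variant, parent) if v != p)


rng = random.Random(0)  # seeded only to make the illustration repeatable
member = sample_member(V1_TEMPLATE, rng)
print(V1_TEMPLATE.count("X"), theoretical_diversity(V1_TEMPLATE))
print(theoretical_diversity(V1_TEMPLATE, max_insert=2))
print(member, count_mutations(member, V1_PARENT))
```

Because the residue set includes A, D and V, a sampled residue can coincide with the parent residue at that position, so an individual member may carry fewer mutations relative to SEQ ID NO: 4 than there are `X` positions in the formula.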
- each compound of a subject library is described by formula (IV), where:
- F3 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to the amino acid sequence TYKLILNGKTLKGETT (SEQ ID NO: 7);
- F4 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence GVDGEWTYDDATKTFTVTE (SEQ ID NO: 8); and
- V2 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to the amino acid sequence TEAVDAATAEKVFKQYANDN (SEQ ID NO: 9), where each compound of the library comprises 10 or more mutations (e.g., 11, 12, 13, 14 or 15 or more mutations) in the V2 variable region.
- V2 comprises a sequence of the formula: TXXXXXXXAXXXFXXXAXXN (SEQ ID NO: 10), where each X is a variant amino acid.
- F3 comprises the sequence TYKLILNGKTLKGETT (SEQ ID NO: 7)
- F4 comprises the sequence GVDGEWTYDDATKTFTVTE (SEQ ID NO: 8)
- V2 comprises a sequence of the formula: TXXXXXXAXXXFXXXAXXN (SEQ ID NO: 11) where each X is independently selected from the group consisting of A, D, F, S, V and Y.
- the mutation at position 3 of V2 includes insertion of 0, 1 or 2 variant amino acids.
- each compound of a subject library is described by formula (V), where:
- F5 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to the amino acid sequence KLILNGKTLKGETT (SEQ ID NO: 12);
- F6 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence EKVFKQYANDNGVDGEWT (SEQ ID NO: 13);
- F7 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence FTVTE (SEQ ID NO: 14);
- V3 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence TY;
- V4 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence TEAVDAATA (SEQ ID NO: 15); and
- V5 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence YDDATKT (SEQ ID NO: 16);
- each compound of the library comprises one or more mutation in the V3 variable region, 3 or more mutations (e.g., 4, 5, 6 or 7 or more mutations) in the V4 variable region, and 3 or more mutations (e.g., 4 or 5 or more mutations) in the V5 variable region.
- V3 comprises a sequence of the formula XY
- V4 comprises a sequence of the formula TXXXXXXA (SEQ ID NO: 17)
- V5 comprises a sequence of the formula YXXXXT (SEQ ID NO: 18) where each X is a variant amino acid.
- F5 comprises the sequence KLILNGKTLKGETT (SEQ ID NO: 12)
- F6 comprises the sequence EKVFKQYANDNGVDGEWT (SEQ ID NO: 13)
- F7 comprises the sequence FTVTE (SEQ ID NO: 14)
- V3 comprises a sequence of the formula XY
- V4 comprises a sequence of the formula TXXXXXXA (SEQ ID NO: 19)
- V5 comprises a sequence of the formula YXXXXT (SEQ ID NO: 20) where each X is independently selected from the group consisting of A, D, F, S, V and Y.
- the mutation at position 1 of V3 includes insertion of 2 variant amino acids
- the mutations at positions 3 and 4 of V4 and V5, respectively, each include insertion of 0, 1 or 2 variant amino acids.
- each compound of a subject library is described by formula (VI), where:
- F8 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to the amino acid sequence TYKLI (SEQ ID NO: 21);
- F9 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to the amino acid sequence ETTTEAVDAATAEKVFKQYAN (SEQ ID NO: 22);
- F10 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to the amino acid sequence TYDDATKTFT (SEQ ID NO: 23);
- V6 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence LNGKTLKG (SEQ ID NO: 24);
- V7 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence DNGVDGEW (SEQ ID NO: 25);
- V8 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence VTE;
- each compound of the library comprises 3 or more mutations (e.g., 4, 5 or 6 or more mutations) in the V6 variable region, 3 or more mutations (e.g., 4, 5 or 6 or more mutations) in the V7 variable region; and one or more mutations (e.g., 2 or more mutations) in the V8 variable region.
- V6 comprises a sequence of the formula LXXXXXG (SEQ ID NO: 26)
- V7 comprises a sequence of the formula DXXXXXW (SEQ ID NO: 27)
- V8 comprises a sequence of the formula VXX where each X is a variant amino acid.
- F8 comprises the sequence TYKLI (SEQ ID NO: 21)
- F9 comprises the sequence ETTTEAVDAATAEKVFKQYAN (SEQ ID NO: 22)
- F10 comprises the sequence TYDDATKTFT (SEQ ID NO: 23)
- V6 comprises a sequence of the formula LXXXXXG (SEQ ID NO: 28)
- V7 comprises a sequence of the formula DXXXXXW (SEQ ID NO: 29)
- V8 comprises a sequence of the formula VXX where each X is independently selected from the group consisting of A, D, F, S, V and Y.
- the mutations at position 4 of V6 and V7 each include insertion of 0, 1 or 2 variant amino acids, and the mutation at position 3 of V8 includes insertion of 1 variant amino acid.
- each compound of a subject library is described by formula (VII), where:
- F11 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence EAVDAATAEKVFKQYANDNGVDGEWTYDDATKT (SEQ ID NO: 30);
- V9 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence TYKLILNGKTLKGETTT (SEQ ID NO: 31); and
- V10 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence FTVTE (SEQ ID NO: 32);
- each compound of the library comprises 6 or more mutations (e.g., 7, 8, 9, 10 or 11 or more mutations) in the V9 variable region, and 2 or more mutations (e.g., 3 or more mutations) in the V10 variable region.
- V9 comprises a sequence of the formula TYXLXLXXXXXXXTXT (SEQ ID NO: 33), and V10 comprises a sequence of the formula FXVXX (SEQ ID NO: 34), where each X is a variant amino acid.
- F11 comprises the sequence EAVDAATAEKVFKQYANDNGVDGEWTYDDATKT (SEQ ID NO: 30);
- V9 comprises a sequence of the formula TYXLXLXXXXXXXTXT (SEQ ID NO: 35), and
- V10 comprises a sequence of the formula FXVXX (SEQ ID NO: 36), where each X is independently selected from the group consisting of A, D, F, S, V and Y.
- the mutation at position 9 of V9 includes insertion of 0, 1 or 2 variant amino acids
- the mutation at position 5 of V10 includes insertion of 1 variant amino acid.
- each compound of a subject library is described by formula (VIII), where:
- F12 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to the amino acid sequence KTLKGETTTEAVDAATAEKVFKQYANDNGVD (SEQ ID NO: 37);
- V11 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to the amino acid sequence TYKLILNG (SEQ ID NO: 38);
- V12 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to the amino acid sequence GEWTYDDATKTFTVTE (SEQ ID NO: 39);
- each compound of the library comprises 3 or more mutations (e.g., 4 or more mutations) in the V11 variable region, and 5 or more mutations (e.g., 6, 7, 8, 9 or 10 or more mutations) in the V12 variable region.
- V11 comprises a sequence of the formula XYXLXLXG (SEQ ID NO: 40)
- V12 comprises a sequence of the formula GXWXYXXXXXXFXVXE (SEQ ID NO: 41), where each X is a variant amino acid.
- F12 comprises the sequence KTLKGETTTEAVDAATAEKVFKQYANDNGVD (SEQ ID NO: 37), V11 comprises a sequence of the formula XYXLXLXG (SEQ ID NO: 42), and V12 comprises a sequence of the formula GXWXYXXXXXXFXVXE (SEQ ID NO: 43), where each X is independently selected from the group consisting of A, D, F, S, V and Y.
- the mutation at position 8 of V12 includes insertion of 0, 1 or 2 variant amino acids
- the mutation at position 1 of V11 includes insertion of 2 variant amino acids.
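- As a rough illustration of the sequence space implied by the formulas above, the following sketch counts the X positions in the V11 and V12 formulas and computes the theoretical number of distinct sequences when each X is drawn independently from the six residues A, D, F, S, V and Y. It ignores the optional insertions, and the function names are hypothetical.

```python
ALLOWED = "ADFSVY"  # the six residues named in the text for each X


def count_variant_positions(*formulas: str) -> int:
    """Total number of X (variant) positions across the given formulas."""
    return sum(formula.count("X") for formula in formulas)


def theoretical_diversity(*formulas: str, alphabet_size: int = len(ALLOWED)) -> int:
    """Number of distinct sequences if every X is chosen independently."""
    return alphabet_size ** count_variant_positions(*formulas)


V11 = "XYXLXLXG"          # formula from the text
V12 = "GXWXYXXXXXXFXVXE"  # formula from the text

print(count_variant_positions(V11, V12))  # variant positions in V11 + V12
print(theoretical_diversity(V11, V12))    # 6 ** (number of variant positions)
```

- Allowing insertions at a variant position would add further terms, one for each permitted insertion length, to this count.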
- each compound of the subject library includes a peptidic sequence of between 30 and 80 residues, such as between 40 and 70, between 45 and 60 residues, or between 52 and 58 residues. In certain embodiments, each compound of the subject library includes a peptidic sequence of 52, 53, 54, 55, 56, 57 or 58 residues. In certain embodiments, the peptidic sequence is of 55, 56, or 57 residues, such as 56 residues.
- each compound of the subject library includes a GB1 scaffold domain and a variable domain.
- the variable domain may be a part of the GB1 scaffold domain and may be either a continuous or a discontinuous sequence of residues.
- a variable domain that is defined by a discontinuous sequence of residues may include contiguous variant amino acids at positions that are arranged close in space relative to each other in the structure of the compound.
- the variable domain may form a potential binding interface of the compounds.
- the variable domain may define a binding surface area of a suitable size for forming protein-protein interactions.
- the variable domain may include a surface area of between 600 and 1800 Å², such as between 800 and 1600 Å², between 1000 and 1400 Å², between 1100 and 1300 Å², or about 1200 Å².
- Any GB1 scaffold as defined herein may be selected as a scaffold for a subject library.
- the positions of the mutations in the GB1 scaffold domain may be selected as described herein, e.g., as depicted in FIG. 3 for Libraries 1 to 6, where the GB1 scaffold domain may be aligned with the framework of FIG. 3 as described above.
- the nature of the mutation at each variant amino acid position may be selected, e.g., substitution with any naturally occurring amino acid, or substitution with a limited number of representative amino acids that provide a reasonable diversity of physiochemical properties (e.g., hydrophobicity, hydrophilicity, size, solubility).
- Certain variant amino acid positions may be selected as positions where mutations can include the insertion or deletion of amino acids, e.g., the insertion of 1 or 2 amino acids where the variant amino acid position occurs in a loop or turn region of the scaffold.
- the mutations can include the insertion of amino acids at one or more positions selected from positions 1, 9, 19, 38, 47 and 55. After selection of the GB1 scaffold, selection of the positions of variant amino acids, and selection of the nature of the mutations at each position, the individual sequences of the members of the library can be determined.
- each compound of the library is described by one of formulas (III) to (VIII), as defined above, and the library includes at least one member described by formula (III), at least one member described by formula (IV), at least one member described by formula (V), at least one member described by formula (VI), at least one member described by formula (VII), and at least one member described by formula (VIII).
- each compound of the library is described by one of formulas (III) to (VIII), where only two of the formulas (III) to (VIII) are represented by the members of the library. In certain embodiments, each compound of the library is described by one of formulas (III) to (VIII), where 5 or less, such as 4 or less or 3 or less of the formulas (III) to (VIII) are represented by the members of the library.
- the subject library includes a combination of libraries 1 to 6 depicted in FIG. 3 , e.g., a combination of 2 or more, such as 3 or more, 4 or more, or 5 or more of libraries 1 to 6.
- the subject library includes a combination of any 2 of the libraries 1 to 6 depicted in FIG. 3 , e.g., a combination of libraries 1 and 2, a combination of libraries 2 and 3, a combination of libraries 1 and 3, a combination of libraries 4 and 5, a combination of libraries 5 and 6, a combination of libraries 4 and 6, a combination of any one of libraries 1-3 and any one of libraries 4-6.
- the subject library includes a combination of any 3 of the libraries 1 to 6 depicted in FIG.
- the subject library includes a combination of all of libraries 1 to 6 depicted in FIG. 3 .
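- The combinations of libraries 1 to 6 described above can be enumerated with a short sketch. The library numbers come from the text; the variable names are hypothetical, and the sketch only lists the possible groupings without saying anything about their contents.

```python
from itertools import combinations, product

LIBRARIES = (1, 2, 3, 4, 5, 6)

# Every way of combining any 2, or any 3, of the six libraries.
pairs = list(combinations(LIBRARIES, 2))
triples = list(combinations(LIBRARIES, 3))

# Pairs made of one library from 1-3 and one library from 4-6.
cross_pairs = list(product((1, 2, 3), (4, 5, 6)))

print(len(pairs), len(triples), len(cross_pairs))
```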
- the subject library is bifunctional in the sense that the GB1 compounds of the library have two potential binding surfaces.
- Such libraries can be screened to identify compounds having specific binding properties for two target molecules.
- the compounds may include a first potential binding surface for a first target molecule and a second potential binding surface for a second target molecule.
- the first target molecule is a therapeutic target protein and the second target molecule is an endogenous protein or receptor (e.g., an IgG, FcRn, or serum albumin protein) that is capable of modulating the pharmacokinetic properties (e.g., in vivo half-life) of a GB1 compound upon recruitment.
- any convenient endogenous protein target may be selected as one of the targets to be screened.
- the compounds of the library include two potential binding surfaces for the same target molecule, where the overall binding affinity of the compound may be modulated via an avidity effect.
- GB1 has binding affinity for human IgG fragments, e.g., hFc binds to the α1 helix motif and hFab binds to the second beta-strand (β2) motif.
- the IgG-binding properties of the GB1 scaffold are utilized to provide one potential binding surface of the subject bifunctional libraries.
- the bifunctional library has an IgG binding surface that includes the α1 helix motif and a target binding surface, such as surface 5 or 6.
- any suitable combinations of potential binding surfaces may be utilized to produce the subject bifunctional libraries.
- the two potential binding surfaces of a bifunctional library are selected to minimize any potential steric interactions between the first and second target molecules, e.g., by binding the targets on opposite sides of the scaffold.
- a pair of potential binding surfaces of the subject bifunctional library is selected from surfaces 1 and 5, surfaces 3 and 4, surfaces 2 and 6, surfaces 1 and 6, surfaces 2 and 5, and surfaces 2 and 4, where the individual surfaces 1 to 6 are shown in FIGS. 2A and 2B , respectively.
- FIG. 13 illustrates exemplary pairs of potential binding surfaces for use in the subject bifunctional libraries.
- the subject bifunctional library may include one or more variable domains on each of the potential binding surfaces of the library. Any convenient variable domains as described herein for surfaces 1-6 may be employed in the subject bifunctional libraries.
- the subject bifunctional library includes 3 or more mutations, such as 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 10 or more, 12 or more or 14 or more mutations in the variable domain of a first surface, and 3 or more mutations, such as 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 10 or more, 12 or more or 14 or more mutations in the variable domain of a second surface. Any suitable mutations in the variable domains may be selected, as described above for the mutations of surfaces 1-6 (see e.g., FIG. 3 ).
- the subject bifunctional library may be screened for specific binding to first and second target molecules using a variety of strategies.
- the libraries can be screened for binding to first and second target molecules using simultaneous screening, consecutive screening or convergent screening strategies.
- the bifunctional library is screened for simultaneous binding of first and second targets to first and second surfaces, respectively.
- a first library is screened for binding of a first target to a first surface to produce a second generation library based on a scaffold that binds the first target.
- such binding of a first target protein to a first surface is inherent in the scaffold, and does not require screening, although affinity maturation optimization of the binding of the first target may be performed.
- the second generation library based on the scaffold that binds the first target is then screened for binding to a second target at a second surface.
- a convergent screening strategy is utilized where a first library is screened for binding to a first target and a second library is screened for binding to a second target. Utilizing the results of these screens, first and second binding surfaces are then incorporated into the same GB1 scaffold to produce bifunctional GB1 compounds.
- Such bifunctional compounds and libraries can be optimized by affinity maturation.
- affinity maturation libraries e.g., second generation GB1 peptidic libraries based on a parent GB1 peptidic compound that binds to a certain target molecule
- the libraries can be screened to optimize for binding affinity and specificity, or any desirable property, such as, protein folding, protease stability, thermostability, compatibility with a pharmaceutical formulation, etc.
- the affinity maturation library is a GB1 peptidic library as described above, except that a fraction of the variant amino acid positions are held as fixed positions while the remaining variant amino acid positions define the new library.
- the mutations of these variant amino acids that define the affinity maturation library may include substitution with all 20 naturally occurring amino acids.
- the variant amino acids that are held as fixed become part of a new scaffold domain.
- the affinity maturation library is a GB1 peptidic library described herein, where 70% or more of the variant amino acids, such as 75% or more, 80% or more, or 85% or more are held fixed.
- the affinity maturation library is a GB1 peptidic library described herein, where 8 or more of the variant amino acids, such as 9 or more, 10 or more, or 11 or more, or 12 or more are held fixed. In some cases, the affinity maturation library includes 6 or less, such as 5 or less, 4 or less, or 3 or less variant amino acids. In certain embodiments, the affinity maturation library includes 4 remaining variant amino acids. In certain embodiments, the remaining variant amino acids are contiguous. In certain embodiments, the remaining variant amino acids form a continuous sequence of residues in the GB1 scaffold domain. In certain embodiments, the affinity maturation library is based on one of the GB1 peptidic libraries 1 to 6 as described in FIGS.
- any GB1 scaffold domain may be substituted for the scaffold domain shown in FIG. 3 .
- the scaffold domain of an affinity maturation library may be selected based on an initial selection for binding to a target molecule.
- a GB1 peptidic compound that is identified after initial screening a subject library for binding to a certain target molecule may be selected as a scaffold for an affinity maturation library. Any convenient methods of affinity maturation may be used.
- a number of affinity maturation libraries are prepared that include mutations at limited subsets of possible variant positions (e.g., mutations at 4 of 15 variable positions), while the rest of the variant positions are held as fixed positions. The positions of the mutations may be tiled through the scaffold sequence to produce a series of libraries such that mutations at every variant position are represented and a diverse range of amino acids is substituted at every position (e.g., all 20 naturally occurring amino acids).
- Mutations that include deletion or insertion of one or more amino acids may also be included at variant positions of the affinity maturation libraries.
- An affinity maturation library may be prepared and screened using any convenient method, e.g., phage display library screening, to identify members of the library having an improved property, e.g., increased binding affinity for a target molecule, protein folding, protease stability, thermostability, compatibility with a pharmaceutical formulation, etc.
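- The tiling of mutated positions through the scaffold, as described above for affinity maturation libraries, can be sketched as follows. This is one possible tiling scheme, assumed here for illustration: contiguous windows of 4 positions stepped across 15 variant positions, with the last window shifted back so that it stays full-sized. The function name and the zero-based position indices are hypothetical.

```python
def tile_windows(n_positions: int, window: int) -> list[list[int]]:
    """Split n_positions variant positions into windows of `window` positions.

    Each window defines one affinity maturation library: positions inside the
    window are randomized, all other variant positions are held fixed.
    """
    if not 0 < window <= n_positions:
        raise ValueError("window must be between 1 and n_positions")
    windows = []
    for start in range(0, n_positions, window):
        # Shift the final window back so it still spans `window` positions.
        start = min(start, n_positions - window)
        windows.append(list(range(start, start + window)))
    return windows


libraries = tile_windows(n_positions=15, window=4)
for index, positions in enumerate(libraries, start=1):
    # With all 20 natural amino acids allowed, each library has 20 ** 4 members.
    print(index, positions, 20 ** len(positions))
```

- With this scheme every variant position is randomized in at least one library, and one position is covered twice because 15 is not a multiple of 4.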
- variable regions of the parent GB1 compound are held as fixed positions, and contiguous mutations are introduced at positions adjacent to these variable regions.
- Such mutations may be introduced at positions in the parent GB1 compound that were previously considered fixed positions in the original GB1 scaffold.
- Such mutations may be used to optimize the GB1 compound variants for any desirable property, such as protein folding, protease stability, thermostability, compatibility with a pharmaceutical formulation, etc.
- Fusion polypeptides including GB1 peptidic compounds can be displayed on the surface of a cell or virus in a variety of formats and multivalent forms.
- a bivalent moiety, for example, a hinge and dimerization sequence from a Fab template, an anti-MBP (maltose binding protein) Fab scaffold, is used for displaying GB1 peptidic compound variants on the surface of a phage particle.
- other sequences encoding polypeptide tags useful for purification or detection such as a FLAG tag, can be fused at the 3′ end of the nucleic acid sequence encoding the GB1 peptidic compound.
- each polynucleotide of the library encodes a distinct GB1 peptidic compound that includes three or more, such as four or more or five or more mutations at non-core positions in a region outside of the β1-β2 region.
- the subject library of polynucleotides is a library of replicable expression vectors that includes a nucleic acid sequence encoding a gene fusion, where the gene fusion encodes a fusion protein including the GB1 peptidic compound fused to all or a portion of a viral coat protein. Also included is a library of diverse replicable expression vectors comprising a plurality of gene fusions encoding a plurality of different fusion proteins including a plurality of the GB1 peptidic compounds generated with diverse sequences as described above.
- the vectors can include a variety of components and can be constructed to allow for movement of the GB1 domain between different vectors and/or to provide for display of the fusion proteins in different formats.
- examples of vectors include phage vectors and ribosome display vectors.
- the phage vector has a phage origin of replication allowing phage replication and phage particle formation.
- the phage is a filamentous bacteriophage, such as an M13, f1, fd, Pf3 phage or a derivative thereof, or a lambdoid phage, such as lambda, 21, phi80, phi81, 82, 424, 434, etc., or a derivative thereof.
- cell-based display techniques include phage display, bacterial display, yeast display and mammalian cell display.
- cell-free display techniques include mRNA display and ribosome display.
- the library of polynucleotides is a library that encodes 50 or more distinct compounds, such as 100 or more, 300 or more, 1×10³ or more, 1×10⁴ or more, 1×10⁵ or more, 1×10⁶ or more, 1×10⁷ or more, 1×10⁸ or more, 1×10⁹ or more, 1×10¹⁰ or more, 1×10¹¹ or more, or 1×10¹² or more, distinct compounds, where each polynucleotide of the library encodes a GB1 peptidic compound that comprises three or more, such as four or more or five or more different non-core mutations at positions in a region outside of the β1-β2 region.
- the library of polynucleotides is a library of replicable expression vectors.
- each polynucleotide of the library encodes a GB1 peptidic compound comprising ten or more variant amino acids at non-core positions, wherein each variant amino acid is encoded by a random codon.
- the random codon is selected from the group consisting of NNK and KHT.
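- To make the two random codons concrete, the sketch below expands NNK and KHT using the standard IUPAC nucleotide codes (N = A/C/G/T, K = G/T, H = A/C/T) and translates each resulting codon with the standard genetic code. The helper names are hypothetical. It shows that KHT encodes exactly the six residues A, D, F, S, V and Y, and that NNK covers all 20 amino acids plus the amber stop codon.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
# Only the IUPAC degeneracy codes needed for NNK and KHT.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "K": "GT", "H": "ACT"}


def expand(degenerate_codon: str) -> list[str]:
    """List every concrete codon matched by a degenerate codon."""
    return ["".join(p) for p in product(*(IUPAC[base] for base in degenerate_codon))]


def encoded_residues(degenerate_codon: str) -> set[str]:
    """Set of residues ('*' = stop) encoded by a degenerate codon."""
    return {CODON_TABLE[codon] for codon in expand(degenerate_codon)}


print(sorted(encoded_residues("KHT")))
print(len(expand("NNK")), sorted(encoded_residues("NNK")))
```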
- the subject libraries may be prepared using any convenient methods, such as, methods that find use in the preparation of libraries of peptidic compounds, for example, phage display methods.
- the subject library is a phage display library.
- a utility of phage display is that large libraries of randomized protein variants can be rapidly and efficiently sorted for those sequences that bind to a target protein. Display of polypeptide libraries on phage may be used for screening for polypeptides with specific binding properties. Polyvalent phage display methods may be used for displaying polypeptides through fusions to either gene III or gene VIII of filamentous phage. Wells and Lowman (1992) Curr. Opin. Struct. Biol B:355-362 and references cited therein.
- a polypeptide library is fused to a gene III or a portion thereof and expressed at low levels in the presence of wild type gene III protein so that phage particles display one copy or none of the fusion proteins.
- Avidity effects are reduced relative to polyvalent phage so that sorting is on the basis of intrinsic ligand affinity, and phagemid vectors are used, which simplify DNA manipulations.
- the phenotype of the phage particle, including the displayed polypeptide, corresponds to the genotype inside the phage particle, the DNA enclosed by the phage coat proteins.
- each GB1 peptidic compound of a subject library is fused to at least a portion of a viral coat protein.
- viral coat proteins include infectivity protein PIII, major coat protein PVIII, p3, Soc, Hoc, gpD (of bacteriophage lambda), minor bacteriophage coat protein 6 (pVI) (filamentous phage; J. Immunol. Methods, 1999, 231(1-2):39-51), variants of the M13 bacteriophage major coat protein (P8) (Protein Sci 2000 April; 9(4):647-54).
- the fusion protein can be displayed on the surface of a phage and suitable phage systems include M13KO7 helper phage, M13R408, M13-VCS, and Phi X 174, pJuFo phage system (J. Virol. 2001 August; 75(15):7107-13), hyperphage (Nat. Biotechnol. 2001 January; 19(1):75-8).
- the helper phage is M13KO7
- the coat protein is the M13 Phage gene III coat protein.
- the host is E. coli or protease deficient strains of E. coli .
- Vectors such as the fth1 vector (Nucleic Acids Res. 2001 May 15; 29(10):E50-0) can be useful for the expression of the fusion protein.
- Any convenient methods for displaying fusion polypeptides including GB1 peptidic compounds on the surface of bacteriophage may be used. For example methods as described in patent publication number WO 92/01047; WO 92/20791; WO 93/06213; WO 93/11236 and WO 93/19172.
- the expression vector also can have a secretory signal sequence fused to the DNA encoding each GB1 peptidic compound.
- This sequence may be located immediately 5′ to the gene encoding the fusion protein, and will thus be transcribed at the amino terminus of the fusion protein. However, in certain cases, the signal sequence has been demonstrated to be located at positions other than 5′ to the gene encoding the protein to be secreted. This sequence targets the protein to which it is attached across the inner membrane of the bacterial cell.
- the DNA encoding the signal sequence may be obtained as a restriction endonuclease fragment from any gene encoding a protein that has a signal sequence.
- Suitable prokaryotic signal sequences may be obtained from genes encoding, for example, LamB or OmpF (Wong et al., Gene, 68:1931 (1983)), MalE, PhoA and other genes.
- a prokaryotic signal sequence for practicing this invention is the E. coli heat-stable enterotoxin II (STII) signal sequence as described by Chang et al., Gene 55:189 (1987), and malE.
- the vector may also include a promoter to drive expression of the fusion protein.
- Promoters most commonly used in prokaryotic vectors include the lac Z promoter system, the alkaline phosphatase pho A promoter, the bacteriophage λ PL promoter (a temperature sensitive promoter), the tac promoter (a hybrid trp-lac promoter that is regulated by the lac repressor), the tryptophan promoter, and the bacteriophage T7 promoter. While these are the most commonly used promoters, other suitable microbial promoters may be used as well.
- the vector can also include other nucleic acid sequences, for example, sequences encoding gD tags, c-Myc epitopes, FLAG tags, poly-histidine tags, fluorescence proteins (e.g., GFP), or beta-galactosidase protein which can be useful for detection or purification of the fusion protein expressed on the surface of the phage or cell.
- Nucleic acid sequences encoding, for example, a gD tag also provide for positive or negative selection of cells or virus expressing the fusion protein.
- the gD tag is fused to a GB1 peptidic compound which is not fused to the viral coat protein.
- Nucleic acid sequences encoding, for example, a polyhistidine tag are useful for identifying fusion proteins including GB1 peptidic compounds that bind to a specific target using immunohistochemistry.
- Tags useful for detection of target binding can be fused to either a GB1 peptidic compound not fused to a viral coat protein or a GB1 peptidic compound fused to a viral coat protein.
- phenotypic selection genes are those encoding proteins that confer antibiotic resistance upon the host cell.
- examples include the ampicillin resistance gene (ampr) and the tetracycline resistance gene (tetr).
- the vector can also include nucleic acid sequences containing unique restriction sites and suppressible stop codons.
- the unique restriction sites are useful for moving GB1 peptidic compound domains between different vectors and expression systems.
- the suppressible stop codons are useful to control the level of expression of the fusion protein and to facilitate purification of GB1 peptidic compounds.
- an amber stop codon can be read as Gln in a supE host to enable phage display, while in a non-supE host it is read as a stop codon to produce soluble GB1 peptidic compounds without fusion to phage coat proteins.
- These synthetic sequences can be fused to GB1 peptidic compounds in the vector.
- vector systems that allow the nucleic acid encoding a GB1 peptidic compound of interest to be easily removed from the vector system and placed into another vector system, may be used.
- appropriate restriction sites can be engineered in a vector system to facilitate the removal of the nucleic acid sequence encoding the GB1 peptidic compounds.
- the restriction sequences are usually chosen to be unique in the vectors to facilitate efficient excision and ligation into new vectors.
- GB1 peptidic compound domains can then be expressed from vectors without extraneous fusion sequences, such as viral coat proteins or other sequence tags.
- DNA encoding a termination codon may be inserted, such termination codons including UAG (amber), UAA (ocher) and UGA (opal).
- the termination codon expressed in a wild type host cell results in the synthesis of the gene 1 protein product without the gene 2 protein attached.
- growth in a suppressor host cell results in the synthesis of detectable quantities of fused protein.
- Such suppressor host cells are well known and described, such as E. coli suppressor strain (Bullock et al., BioTechniques 5:376-379 (1987)). Any acceptable method may be used to place such a termination codon into the mRNA encoding the fusion polypeptide.
- the suppressible codon may be inserted between the first gene encoding the GB1 peptidic compounds, and a second gene encoding at least a portion of a phage coat protein.
- the suppressible termination codon may be inserted adjacent to the fusion site by replacing the last amino acid triplet in the GB1 peptidic compound domain or the first amino acid in the phage coat protein.
- When the plasmid is grown in a non-suppressor host cell, the GB1 peptidic compound domain is synthesized substantially without fusion to the phage coat protein due to termination at the inserted suppressible triplet UAG, UAA, or UGA. In the non-suppressor cell the GB1 peptidic compound domain is synthesized and secreted from the host cell due to the absence of the fused phage coat protein which otherwise anchored it to the host membrane.
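- The effect of a suppressible amber codon placed between the two genes can be sketched with a toy translation routine. The construct below is a made-up placeholder, not a sequence from this disclosure, and the function name is hypothetical. The sketch also treats suppression as complete, whereas in a real supE host only a fraction of ribosomes read through the amber codon.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str, amber_suppressor: bool = False) -> str:
    """Translate a coding sequence; optionally read amber (TAG) as Gln (Q)."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        residue = CODON_TABLE[codon]
        if residue == "*":
            if codon == "TAG" and amber_suppressor:
                residue = "Q"   # supE-type host inserts Gln at amber
            else:
                break           # translation terminates
        protein.append(residue)
    return "".join(protein)


# Placeholder construct: gene 1 stand-in, amber codon, gene 2 stand-in, ocher stop.
construct = "ATGGCT" + "TAG" + "GTTGAA" + "TAA"

print(translate(construct))                         # non-suppressor host
print(translate(construct, amber_suppressor=True))  # suppressor host
```

- In the non-suppressor case only the gene 1 product is made, while in the suppressor case the gene 1 and gene 2 products are joined through a glutamine at the amber position.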
- the libraries may be selected for improved binding affinity to a certain target protein, e.g., as described above, for the preparation and screening of affinity maturation libraries.
- the target proteins may include any type of protein of interest in research or therapeutic applications. Aspects of these screening methods may include determining whether a compound of the subject libraries specifically binds to a target protein of interest. Screening methods may include screening for inhibition of a biological activity. Such methods may include: (i) contacting a sample containing a target protein with a library of the invention; and (ii) determining whether a compound of the library specifically binds to the target protein.
- the determining step may be carried out by any one or more of a variety of protocols for characterizing the specific binding or the inhibition of binding.
- screening may be a cell-based assay, an enzyme assay, an ELISA assay or other related biological assay for assessing specific binding or the inhibition of binding, and the determining or assessment steps suitable for application in such assays are well known and involve routine protocols.
- Screening may also include in silico methods, in which one or more physical and/or chemical attributes of compounds of the library of interest are expressed in a computer-readable format and evaluated by any one or more of a variety of molecular modeling and/or analysis programs and algorithms suitable for this purpose.
- the in silico method includes inputting one or more parameters related to the D-target protein, such as but not limited to, the three-dimensional coordinates of a known X-ray crystal structure of the D-target protein.
- the in silico method includes inputting one or more parameters related to the compounds of the L-peptidic library, such as, but not limited to, the three-dimensional coordinates of a known X-ray crystal structure of a parent scaffold domain of the library.
- the in silico method includes generating one or more parameters for each compound in a peptidic library in a computer readable format, and evaluating the capabilities of the compounds to specifically bind to the target protein.
- the in silico methods include, but are not limited to, molecular modelling studies, biomolecular docking experiments, and virtual representations of molecular structures and/or processes, such as molecular interactions.
- the in silico methods may be performed as a pre-screen (e.g., prior to preparing an L-peptidic library and performing in vitro screening), or as a validation of binding compounds identified after in vitro screening.
- the screening methods of the invention can be carried out in vitro or in vivo.
- when the compound is in a cell, the cell may be in vitro or in vivo, and the determining of whether the compound is capable of specifically binding to a target protein in the cell includes: (i) contacting the cell with a library of the invention; and (ii) assessing whether a compound of the library specifically binds to the target protein.
- determining whether a GB1 peptidic compound of a subject library is capable of specifically binding a target protein may be carried out by any number of methods, as well as combinations thereof.
- the subject method includes:
- in the subject method, the target protein is a D-protein. In some embodiments, in the subject method, the target protein is an L-protein.
- a target protein can be attached with a detectable moiety, such as biotin.
- Phage that bind to the target molecule in solution can be separated from unbound phage by a molecule that binds to the detectable moiety, such as streptavidin-coated beads where biotin is the detectable moiety.
- Affinity of binders can be determined based on concentration of the target protein used, using any convenient formulas and criteria.
- the target protein may be attached to a suitable matrix such as agarose beads, acrylamide beads, glass beads, cellulose, various acrylic copolymers, hydroxyalkyl methacrylate gels, polyacrylic and polymethacrylic copolymers, nylon, neutral and ionic carriers, and the like. Attachment of the target protein to the matrix may be accomplished by any convenient methods, e.g., methods as described in Methods in Enzymology, 44 (1976). After attachment of the target protein to the matrix, the immobilized target is contacted with the library expressing the GB1 peptidic compound containing fusion polypeptides under conditions suitable for binding of at least a portion of the phage particles with the immobilized target.
- Binders to the immobilized target are separated from those particles that do not bind to the target by washing. Wash conditions can be adjusted to result in removal of all but the higher affinity binders. Binders may be dissociated from the immobilized target by a variety of methods. These methods include competitive dissociation using the wild-type ligand, altering pH and/or ionic strength, and methods known in the art. Selection of binders may involve elution from an affinity matrix with a ligand. Elution with increasing concentrations of ligand should elute displayed binding GB1 peptidic compounds of increasing affinity.
- the binders can be isolated and then reamplified or expressed in a host cell and subjected to another round of selection for binding of target molecules. Any number of rounds of selection or sorting can be utilized.
- One of the selection or sorting procedures can involve isolating binders that bind to an antibody to a polypeptide tag such as antibodies to the gD protein, FLAG or polyhistidine tags.
- Another selection or sorting procedure can involve multiple rounds of sorting for stability, such as binding to a target protein that specifically binds to folded GB1 peptidic compound containing polypeptide and does not bind to unfolded polypeptide followed by selecting or sorting the stable binders for binding to a target protein.
- suitable host cells are infected with the binders and helper phage, and the host cells are cultured under conditions suitable for amplification of the phagemid particles.
- the phagemid particles are then collected and the selection process is repeated one or more times until binders having the desired affinity for the target molecule are selected. In certain embodiments, two or more rounds of selection are conducted.
- the nucleic acid can be extracted. Extracted DNA can then be used directly to transform E. coli host cells or alternatively, the encoding sequences can be amplified, for example using PCR with suitable primers, and then inserted into a vector for expression.
- any convenient strategy may be used to select for high affinity binders to a target protein.
- the process of screening is carried out by automated systems to allow for high-throughput screening of library candidates.
- compounds of the subject peptidic library specifically bind to a target protein with high affinity, e.g., as determined by an SPR binding assay or an ELISA assay.
- the compounds of the subject peptidic library may exhibit an affinity for a target protein of 1 µM or less, such as 300 nM or less, 100 nM or less, 30 nM or less, 10 nM or less, 5 nM or less, 2 nM or less, 1 nM or less, 300 pM or less, or even less.
- the compounds of the subject peptidic libraries may exhibit a specificity for a target protein, e.g., as determined by comparing the affinity of the compound for the target protein with that for a reference protein (e.g., an albumin protein), that is 5:1 or more, 10:1 or more, such as 30:1 or more, 100:1 or more, 300:1 or more, 1000:1 or more, or even more.
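- The affinity and specificity criteria above can be expressed as a small check. The sketch assumes that affinity is reported as a dissociation constant (Kd, in nM) and that specificity is the ratio of the Kd for the reference protein to the Kd for the target protein; the text does not fix either convention. The function names, default thresholds and example values are hypothetical.

```python
def specificity_ratio(kd_target_nm: float, kd_reference_nm: float) -> float:
    """Ratio of reference Kd to target Kd; larger means more target-specific."""
    return kd_reference_nm / kd_target_nm


def meets_thresholds(
    kd_target_nm: float,
    kd_reference_nm: float,
    max_kd_nm: float = 1000.0,   # 1 µM, the loosest affinity bound in the text
    min_ratio: float = 5.0,      # 5:1, the loosest specificity bound in the text
) -> bool:
    """True if the compound passes both the affinity and specificity cut-offs."""
    ratio = specificity_ratio(kd_target_nm, kd_reference_nm)
    return kd_target_nm <= max_kd_nm and ratio >= min_ratio


# Made-up example: 10 nM for the target, 3000 nM for the reference protein.
print(specificity_ratio(10.0, 3000.0))
print(meets_thresholds(10.0, 3000.0))
```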
- the subject libraries can be selected and/or screened for binding to one or more target molecules.
- the libraries may be selected for improved binding affinity to a certain target molecule.
- the target molecules may be any type of protein-binding or antigenic molecule, such as proteins, nucleic acids, carbohydrates or small molecules.
- the target molecule is a therapeutic target molecule or a diagnostic target molecule, or a fragment thereof, or a mimic thereof.
- the target molecule is a hormone, a growth factor, a receptor, an enzyme, a cytokine, an osteoinductive factor, a colony stimulating factor or an immunoglobulin.
- the target molecule may be one or more of the following: growth hormone, bovine growth hormone, insulin-like growth factors, human growth hormone including n-methionyl human growth hormone, parathyroid hormone, thyroxine, insulin, proinsulin, amylin, relaxin, prorelaxin, glycoprotein hormones such as follicle stimulating hormone (FSH), luteinizing hormone (LH), hematopoietic growth factor, Her-2, fibroblast growth factor, prolactin, placental lactogen, tumor necrosis factors, Müllerian inhibiting substance, mouse gonadotropin-associated polypeptide, inhibin, activin, vascular endothelial growth factors, integrin, nerve growth factors such as NGF-beta, insulin-like growth factor-I and II, erythropoietin, osteoinductive factors, interferons, colony stimulating factors, interleukins (e.g., an IL-4 or an IL-8 protein), bone morphogenetic proteins, L
- the target molecule may be a therapeutic target protein for which structural information is known, such as, but not limited to: Raf kinase (a target for the treatment of melanoma), Rho kinase (a target in the prevention of pathogenesis of cardiovascular disease), nuclear factor kappaB (NF-κB, a target for the treatment of multiple myeloma), vascular endothelial growth factor (VEGF) receptor kinase (a target for action of anti-angiogenetic drugs), Janus kinase 3 (JAK-3, a target for the treatment of rheumatoid arthritis), cyclin dependent kinase (CDK) 2 (CDK2, a target for prevention of stroke), FMS-like tyrosine kinase (FLT) 3 (FLT-3; a target for the treatment of acute myelogenous leukemia (AML)), epidermal growth factor receptor (EGFR) kinase (a
- the target molecule is a target protein that is selected from the group consisting of a VEGF protein, a RANKL protein, a NGF protein, a TNF-alpha protein, an SH2 domain containing protein, an SH3 domain containing protein, an IgE protein, a BLyS protein (Oren et al., “Structural basis of BLyS receptor recognition”, Nature Structural Biology 9, 288-292, 2002), a PCSK9 protein (Ni et al., “A proprotein convertase subtilisin-like/kexin type 9 (PCSK9) C-terminal domain antibody antigen-binding fragment inhibits PCSK9 internalization and restores low density lipoprotein uptake”, J.
- the target protein is a VEGF protein.
- the target protein is an SH2 domain containing protein (e.g., a 3BP2 protein) or an SH3 domain containing protein (e.g., an ABL or a Src protein).
- the libraries of the invention find use in a variety of applications.
- Applications of interest include, but are not limited to, screening applications and research applications.
- the screening methods find use in a variety of applications, including selection and/or screening of the subject libraries in a wide range of research and therapeutic applications, such as therapeutic lead identification and affinity maturation, identification of diagnostic reagents, development of high throughput screening assays, development of drug delivery systems for the delivery of toxins or other therapeutic moieties.
- the subject screening methods may be exploited in multiple settings.
- the subject libraries may find use as research tools to analyze the roles of proteins of interest in modulating various biological processes, e.g., angiogenesis, inflammation, cellular growth, metabolism, regulation of transcription and regulation of phosphorylation.
- antibody libraries have been useful tools in many such areas of biological research and have led to the development of effective therapeutic agents; see Sidhu and Fellouse, “Synthetic therapeutic antibodies,” Nature Chemical Biology, 2006, 2(12), 682-688.
- the subject libraries may be exploited as research tools in the development of clinical diagnostics, e.g., in vitro diagnostics (e.g., for targeting various biomarkers), or in vivo tumor imaging agents.
- binding molecules e.g., aptamers and antibodies
- Jayasena “Aptamers: An Emerging Class of Molecules That Rival Antibodies in Diagnostics,” Clinical Chemistry. 1999; 45:1628-1650.
- This sequence was synthesized with NcoI and XbaI restriction sites at 5′ and 3′ respectively and cloned into a display vector as an N-terminal fusion to truncated protein 3 of M13 filamentous phage.
- the features of the vector include a ptac promoter and StII secretion leader sequence (MKKNIAFLLASMFVFSIATNAYA; SEQ ID NO: 45).
- This display version allows the display of GB1 in amber suppressor bacterial strains and is useful for expression of the protein in non-suppressor strains.
- oligonucleotides were prepared (Integrated DNA Technologies Inc.), for site-directed mutagenesis:
- The mutational and insertion tolerance of GB1 loops was tested by randomizing the loops and beta-turns and selecting for stably folded proteins.
- the loop lengths were varied from 4 to 6 residues and randomized with an NNK codon.
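To make the randomization scheme concrete, the following sketch expands a degenerate codon into its constituent codons and the residues they encode, using the standard IUPAC nucleotide codes and the standard genetic code. The helper names `expand` and `encoded` are illustrative and not part of the described method.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand(degenerate_codon):
    """List every concrete codon matched by a degenerate codon such as 'NNK'."""
    choices = (IUPAC[base] for base in degenerate_codon.upper())
    return ["".join(codon) for codon in product(*choices)]


def encoded(degenerate_codon):
    """Sorted one-letter residues (and '*' for stop) encoded by a degenerate codon."""
    return sorted({CODON_TABLE[codon] for codon in expand(degenerate_codon)})


if __name__ == "__main__":
    for codon in ("NNK", "KHT"):
        print(codon, len(expand(codon)), "".join(encoded(codon)))
```

An NNK codon expands to 32 codons covering all 20 amino acids plus the amber stop codon (TAG), whereas the KHT codon used in the library oligonucleotides expands to six codons encoding Ala, Asp, Phe, Ser, Val and Tyr with no stop codon.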
- the beta-turns and loop residues of GB1 are shown as underlined below:
- Regions B1 and L2 are contiguous and regions L1 and B2 are contiguous. These loop/turn regions were randomized together to produce libraries for screening. Site-directed mutagenesis (Kunkel 1987) was used to introduce triple stop codons in the loop pairs. Since the wild-type protein is more stable, it would have a selective advantage over the rest of the library. The following oligonucleotides were used to make the stop-templates (Integrated DNA Technologies, Inc.):
- the number of transformants was 1×10⁹ for Library B1-L2 and 1×10¹⁰ for Library L1-B2.
- the selections were performed using the methods described below except that the library was directly added to selection wells coated with anti-FLAG antibody (5 μg/ml diluted in PBT) and there was no preincubation step. Selections on anti-FLAG were performed to identify folded variants (misfolded proteins are cleaved, thereby losing the N-terminal FLAG tag). Three rounds of selection (8 washes/round) were performed as good enrichment was observed in Pool ELISA at Rounds 2 and 3.
- the solvent accessible surface area (SASA) for each residue in the Protein Data Bank (PDB) structure 3GB1 was estimated using the GETarea tool (Fraczkiewicz & Braun, “Exact and efficient analytical calculation of the accessible surface areas and their gradients for macromolecules,” J. Comput. Chem. 1998, 19, 319-333). This tool also calculates the ratio of SASA in the structure compared to SASA in a random coil. A ratio of 0.4 was used to select solvent accessible residues (shown in bold): TYKLILNGKTLKGETTTEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTE (SEQ ID NO: 1).
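The ratio-based selection described above can be sketched as a simple filter. The sketch assumes the ratios are expressed as fractions and that residues at or above the 0.4 cutoff are kept; the text does not say whether the comparison is inclusive. The per-residue values in the example are invented placeholders, not GETarea output for 3GB1.

```python
def exposed_positions(ratios, cutoff=0.4):
    """Return 1-based positions whose SASA ratio (folded / random coil) meets the cutoff.

    ratios: mapping of position -> ratio expressed as a fraction.
    Assumption: positions with ratio >= cutoff count as solvent accessible.
    """
    return sorted(pos for pos, ratio in ratios.items() if ratio >= cutoff)


if __name__ == "__main__":
    # Placeholder ratios for illustration only (hypothetical values).
    placeholder_ratios = {1: 0.85, 2: 0.05, 3: 0.55, 4: 0.02}
    print(exposed_positions(placeholder_ratios))
```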
- positions in the loops were selected for mutations that include insertion of 0, 1 or 2 additional amino acid residues in addition to substitution.
- Library 1 +0-2 insertions at position 38;
- Library 2 +0-2 insertions at position 19;
- Library 3 +2 insertions at position 1, +0-2 insertions at positions 19 and 47;
- Library 4 +0-2 insertions at positions 9 and 38, +1 insertion at position 55;
- Library 5 +0-2 insertions at position 9, +1 insertion at position 55;
- Library 6 +1 insertion at position 1, +0-2 insertions at position 47.
- SEQ ID NO: 200 5′- GGTGAAACCACGACC KHT KHT KHT KHT KHT KHT KHT KHT GCA KHT KHT KHT TTC KHT KHT KHT GCC KHT KHT AATGGCGTGGATGGT -3′
- SEQ ID NO: 201 5′- GGTGAAACCACGACC KHT KHT KHT KHT KHT KHT KHT KHT KHT KHT GCA KHT KHT KHT KHT KHT KHT GCC KHT KHT AATGGCGTGGATGGT -3′
- SEQ ID NO: 202 5′- GGTGAAACCACGACC KHT KHT KHT KHT KHT KHT KHT KHT KHT KHT KHT KHT GCA KHT KHT KHT KHT KHT KHT GCC KHT KHT KHT AATGGCGTGGATGGT -3′
- SEQ ID NOs: 200-202 include insertion mutations of +0, 1 or 2
- SEQ ID NO: 203 includes an insertion mutation of +2 variant amino acids at the position equivalent to position 1 of the scaffold.
- SEQ ID NOs: 204-206 include mutations of +0, 1 or 2 additional variant amino acids, respectively, at the position equivalent to position 19 of the scaffold.
- SEQ ID NOs: 207-209 include mutations of +0, 1 or 2 additional variant amino acids, respectively, at the position equivalent to position 47 of the scaffold.
- SEQ ID NOs: 210-212 include mutations of +0, 1 or 2 additional variant amino acids, respectively, at the position equivalent to position 9 of the scaffold.
- SEQ ID NOs: 213-215 include mutations of +0, 1 or 2 additional variant amino acids, respectively, at the position equivalent to position 38 of the scaffold.
- SEQ ID NO: 216 includes an insertion mutation of +2 variant amino acids at the position equivalent to position 55 of the scaffold.
- SEQ ID NOs: 217-219 include mutations of +0, 1 or 2 additional variant amino acids, respectively, at the position equivalent to position 9 of the scaffold.
- SEQ ID NO: 220 includes an insertion mutation of +2 variant amino acids at the position equivalent to position 55 of the scaffold.
- SEQ ID NO: 221 includes an insertion mutation of +1 variant amino acid at the position equivalent to position 1 of the scaffold.
- SEQ ID NOs: 222-224 include mutations of +0, 1 or 2 additional variant amino acids, respectively, at the position equivalent to position 47 of the scaffold.
- the libraries were prepared using the same method described above for the GB1 template with Fab dimerization sequence (Fellouse & Sidhu, 2007). Oligonucleotides with 0/1/2 insertions have the same homology regions and compete for binding the template. Therefore, they were pooled together (equimolar ratio) and treated as a single oligonucleotide for mutagenesis. The constructed libraries were pooled together for a total diversity of 3.5×10¹⁰ transformants. Selections were performed against L-VEGF and D-VEGF using a method as described below with the exception that 10 selection wells were used for Round 1.
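As a rough illustration of how the reported transformant count relates to theoretical sequence space, the sketch below multiplies the number of base choices at each position of a degenerate oligonucleotide. It uses SEQ ID NO: 200 as printed above; the function name `dna_diversity` is hypothetical. Note that the 3.5×10¹⁰ figure is the total for the pooled libraries, so comparing it with a single oligonucleotide only gives an order-of-magnitude feel.

```python
from math import prod

# Number of concrete bases represented by each IUPAC code.
IUPAC_SIZE = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3, "N": 4,
}


def dna_diversity(oligo):
    """Number of distinct DNA sequences encoded by a degenerate oligonucleotide."""
    return prod(IUPAC_SIZE[base] for base in oligo.upper() if not base.isspace())


# SEQ ID NO: 200 as printed in the text (spaces separate codons and flanks).
SEQ_ID_200 = (
    "GGTGAAACCACGACC "
    + "KHT " * 8 + "GCA "
    + "KHT " * 3 + "TTC "
    + "KHT " * 3 + "GCC "
    + "KHT " * 2
    + "AATGGCGTGGATGGT"
)

POOLED_TRANSFORMANTS = 3.5e10  # total reported for the pooled libraries

if __name__ == "__main__":
    theoretical = dna_diversity(SEQ_ID_200)
    print(f"theoretical sequences: {theoretical:.3e}")
    print(f"transformants / theoretical: {POOLED_TRANSFORMANTS / theoretical:.4f}")
```

Because each KHT codon encodes six distinct amino acids, the DNA-level count equals the protein-level count for this oligonucleotide. With 16 KHT codons the theoretical space (6¹⁶, about 2.8×10¹²) exceeds the number of transformants, so under these assumptions the constructed library samples only a fraction of the possible sequences.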
- the selection procedure is essentially the same as described in previous protocols (Fellouse & Sidhu, 2007) with some minor changes.
- although the method below is described for L-VEGF, it can be adapted to screen for binding to any target.
- the media and buffer recipes are the same as in the described protocol.
- a more stringent negative selection procedure is as follows.
- the selection process is essentially the same as described above except that:
- the selections were performed on the GB1 Loop libraries by a method similar to the one described above except that the library was directly added to selection wells coated with anti-FLAG antibody (5 μg/ml diluted in PBT) and there was no preincubation step. Only three rounds of selection were performed as good enrichment was observed in Pool ELISA at Rounds 2 and 3.
Abstract
GB1 peptidic libraries and methods of screening the same for specific binding to a target protein are provided. Libraries of polynucleotides that encode GB1 peptidic compounds are provided. These libraries find use in a variety of applications in which specific binding to target molecules, e.g., target proteins, is desired. Also provided are methods of screening the libraries for binding to a target.
Description
- Pursuant to 35 U.S.C. §119(e), this application claims priority to the filing date of U.S. provisional application Ser. No. 61/413,318, filed Nov. 12, 2010, the disclosure of which is herein incorporated by reference.
- This application is related to copending U.S. application entitled “GB1 peptidic compounds and methods for making and using the same” filed on Nov. 10, 2011 to Sidhu et al. (attorney reference number RFLX-003) and accorded Ser. No. ______, and U.S. provisional application Ser. No. 61/413,331 filed Nov. 12, 2010, which are entirely incorporated herein by reference.
- This application is related to copending U.S. application entitled “Methods and compositions for identifying D-peptidic compounds that specifically bind target proteins” filed on Nov. 10, 2011 to Ault-Riché et al. (attorney reference number RFLX-002) and accorded Ser. No. ______, and U.S. provisional application Ser. No. 61/413,316 filed Nov. 12, 2010, which are entirely incorporated herein by reference.
- Essentially all biological processes depend on molecular recognition mediated by proteins. The ability to manipulate the interactions of such proteins is of interest for both basic biological research and for the development of therapeutics and diagnostics.
- Libraries of polypeptides can be prepared, e.g., by manipulating the immune system or via chemical synthesis, from which specificity of binding to target molecules can be selected. Molecular diversity from which specificity can be selected is large for polypeptides having numerous possible sequence combinations of amino acids. In addition, proteins can form large binding surfaces with multiple contacts to a target molecule that leads to highly specific and high affinity binding events. For example, antibodies are a class of protein that has yielded highly specific and tight binding ligands for various target antigens.
- Because of the diversity of target molecules of interest and the binding properties of proteins, the screening of peptidic libraries to identify molecules with useful functions is of interest.
- GB1 peptidic libraries and methods of screening the same for specific binding to a target protein are provided. Libraries of polynucleotides that encode GB1 peptidic compounds are provided. These libraries find use in a variety of applications in which specific binding to target molecules, e.g., target proteins, is desired. Also provided are methods of screening the libraries for binding to a target.
- FIG. 1 depicts a ribbon structure of a GB1 protein that illustrates a 4β-1α motif (Mayo et al., Nature Structural Biology, 5(6), 1998, p. 470-475).
- FIGS. 2A and 2B depict six different libraries that include a GB1 scaffold, both in a ribbon representation (top) and a space filling representation (bottom). Amino acids at several positions of the GB1 scaffold that are selected for mutation are highlighted in darker shade (top). The space filling representations of Library 1 to Library 6 (bottom) illustrate six different potential binding surfaces (shown in darker shade) on the GB1 scaffold.
- FIG. 3 illustrates the underlying sequence of the GB1 scaffold domain (SEQ ID NO: 1) of FIGS. 2A-2B and the positions of the variant amino acids (shown in the grey blocks) in Libraries 1 to 6. The asterisks indicate positions at which mutations may include insertion of amino acids.
- FIG. 4A depicts the phage display of a GB1 peptidic compound fusion of coat protein p3 that includes a hinge and dimerization format. FIG. 4B illustrates display levels of various formats of the GB1 peptidic compound fusion on the phage particles.
- FIG. 5 illustrates the design of phage display Library 1 (SEQ ID NOs: 225, 226 and 197-199).
- FIG. 6 illustrates the design of phage display Library 2 (SEQ ID NOs: 225, 226 and 200-202).
- FIG. 7 illustrates the design of phage display Library 3 (SEQ ID NOs: 225, 226 and 203-209).
- FIG. 8 illustrates the design of phage display Library 4 (SEQ ID NOs: 225, 226 and 210-216).
- FIG. 9 illustrates the design of phage display Library 5 (SEQ ID NOs: 225, 226 and 217-220).
- FIG. 10 illustrates the design of phage display Library 6 (SEQ ID NOs: 225, 226 and 221-224).
- FIGS. 5 to 10 illustrate the design of phage display libraries based on Libraries 1 to 6 illustrated in FIGS. 2A-2B. Ribbon (left) and space filling (right) structural representations depict the variant amino acid positions in dark. Oligonucleotide and amino acid sequences (SEQ ID NOs: 225 and 226) show the GB1 scaffold in the context of the fusion protein with GGS linkers at the N- and C-termini of the scaffold. Also shown are the oligonucleotide sequences synthesized for use in preparation of the libraries by Kunkel mutagenesis that include KHT codons at variant amino acid positions to encode variable regions of GB1 peptidic compounds.
- FIG. 11 illustrates binding results from four rounds of phage display screening of Libraries 1 to 6 against L-VEGF and D-VEGF.
- FIG. 12 illustrates binding assay results of individual clones identified from phage display screening of subject libraries against VEGF proteins. 10 nM or 100 nM VEGF protein was added to binding solutions in a competition binding assay.
- FIG. 13 illustrates exemplary bifunctional libraries having two potential binding surfaces. A: Solvent exposed residues of surface 1 (S1) and surface 5 (S5) are shown in dark. B: Solvent exposed residues of surface 4 (S4) and surface 3 (S3) are shown in dark. C: Solvent exposed residues of surface 2 (S2) and surface 6 (S6) are shown in dark.
- As used herein, the term “peptidic” refers to a moiety that is composed of amino acid residues. The term “peptidic” includes compounds or libraries in which the conventional backbone has been replaced with non-naturally occurring or synthetic backbones, and peptides in which one or more naturally occurring amino acids have been replaced with one or more non-naturally occurring or synthetic amino acids, or a D-amino acid. Any of the depictions of sequences found herein (e.g., using one-letter or three-letter codes) may represent a L-amino acid or a D-amino acid version of the sequence. Unless noted otherwise, the capital and small letter codes of L- and D-amino acid residues are not utilized.
- As used herein, the terms “polypeptide” and “protein” are used interchangeably. The term “polypeptide” also includes post-translationally modified polypeptides or proteins. The term “polypeptide” includes polypeptides in which the conventional backbone has been replaced with non-naturally occurring or synthetic backbones, and peptides in which one or more of the conventional amino acids have been replaced with one or more non-naturally occurring or synthetic amino acids. In some instances, polypeptides may be of any length, e.g., 2 or more amino acids, 4 or more amino acids, 10 or more amino acids, 20 or more amino acids, 30 or more amino acids, 40 or more amino acids, 50 or more amino acids, 60 or more amino acids, 100 or more amino acids, 300 or more amino acids, 500 or more amino acids, or 1000 or more amino acids.
- As used herein, the term “scaffold” or “scaffold domain” refers to a peptidic framework from which a library of compounds arose, and against which the compounds are able to be compared. When a compound of a library arises from amino acid mutations at various positions within a scaffold, the amino acids at those positions are referred to as “variant amino acids.” Such variant amino acids may confer on the resulting peptidic compounds different functions, such as specific binding to a target protein.
- As used herein, the term “mutation” is a deletion, insertion, or substitution of an amino acid(s) residue or nucleotide(s) residue relative to a reference sequence or motif, such as a scaffold sequence or motif.
- As used herein, the terms “GB1 scaffold domain” and “GB1 scaffold” refer to a scaffold that has a structural motif similar to the B1 domain of Protein G (GB1), where the structural motif is characterized by a motif including a four stranded β-sheet packed against a helix (also referred to as a 4β-1α motif). The arrangement of four β-strands and one α-helix may form a hairpin-helix-hairpin motif. An exemplary GB1 scaffold domain is depicted in FIG. 1. GB1 scaffold domains include members of the family of IgG binding B domains, e.g., Protein L B1 domain. Amino acid sequences of exemplary B domains that may be employed herein as GB1 scaffold domains are found in the Wellcome Trust Sanger Institute Pfam database (The Pfam protein families database: Finn et al., Nucleic Acids Research (2010) Database Issue 38:D211-222), see, e.g., Family: IgG_binding_B (PF01378) (pfam.sanger.ac.uk/family/PF01378.10#tabview=tab0) or in NCBI's protein database. Exemplary GB1 scaffold domain sequences include those described by SEQ ID NOs: 227-261. A GB1 scaffold domain may be a native sequence of a member of the B domain protein family, a B domain sequence with pre-existing amino acid sequence modifications (such as additions, deletions and/or substitutions), or a fragment or analogue thereof. A GB1 scaffold domain may be L-peptidic, D-peptidic or a combination thereof. In some cases, a “GB1 scaffold domain” may also be referred to as a “parent amino acid sequence.”
- As used herein, the term “GB1 peptidic compound” refers to a compound composed of peptidic residues that has a parent GB1 scaffold domain.
- As used herein, the terms “parent amino acid sequence” and “parent polypeptide” refer to a polypeptide comprising an amino acid sequence from which a variant GB1 peptidic compound arose and against which the variant GB1 peptidic compound is being compared. In some cases, the parent polypeptide lacks one or more of the modifications disclosed herein and differs in function compared to a variant GB1 peptidic compound as disclosed herein. The parent polypeptide may comprise a native GB1 sequence or GB1 scaffold sequence with pre-existing amino acid sequence modifications (such as additions, deletions and/or substitutions).
- As used herein, the term “variable region” refers to a continuous sequence of residues that includes one or more variant amino acids. A variable region may also include one or more conserved amino acids at fixed positions. As used herein, the term “fixed region” refers to a continuous sequence of residues that does not include any mutations or variant amino acids, and is conserved across a library of compounds.
- As used herein, the term “variable domain” refers to a domain that includes all of the variant amino acids of a GB1 scaffold. The variable domain may include one or more variable regions, and may encompass a continuous or a discontinuous sequence of residues. The variable domain may be part of the scaffold domain.
- As used herein, the term “discontinuous sequence of residues” refers to a sequence of residues that is not continuous with respect to the primary sequence of a peptidic compound. A peptidic compound may fold to form a secondary or tertiary structure, e.g., a 4β-1α motif, where the amino acids of a discontinuous sequence of residues are adjacent to each other in space, i.e., contiguous. As used herein, the term “continuous sequence of residues” refers to a sequence of residues that is continuous in terms of the primary sequence of a peptidic compound.
- As used herein, the term “non-core mutation” refers to an amino acid mutation of a GB1 peptidic compound that is located at a position in the 4β-1α structure that is not part of the hydrophobic core of the structure. Amino acid residues in the hydrophobic core of a GB1 peptidic compound are not significantly solvent exposed but rather tend to form intramolecular hydrophobic contacts. Unless explicitly defined otherwise, a hydrophobic core residue or core position, as described herein, of a GB1 scaffold domain that is described by SEQ ID NO: 1 is defined by one of positions 2, 4, 6, 19, 25, 29, 33, 38, 42, 51 and 53 of the GB1 scaffold. The methodology used to specify hydrophobic core residues in GB1 is described by Dahiyat et al., (“Probing the role of packing specificity in protein design,” Proc. Natl. Acad. Sci. USA, 1997, 94, 10172-10177) where a PDB structure was used to calculate which side chains expose less than 10% of their surface area to solvent. Such methods can be modified for use with the GB1 scaffold domain.
- As used herein, the term “surface mutation” refers to an amino acid mutation in a GB1 scaffold that is located at a position in the 4β-1α structure that is solvent exposed. Such variant amino acid residues at surface positions of a GB1 peptidic compound are capable of interacting directly with a target molecule, whether or not such an interaction occurs.
- As used herein, the term “boundary mutation” refers to an amino acid mutation of a GB1 scaffold that is located at a position in the 4β-1α structure that is at the boundary between the hydrophobic core and the solvent exposed surface. Such variant amino acid residues at boundary positions of a GB1 peptidic compound may be in part contacting hydrophobic core residues and/or in part solvent exposed and capable of some interaction with a target molecule, whether or not such an interaction occurs. One set of criteria for describing core, surface and boundary residues of a GB1 peptidic structure is described by Mayo et al., Nature Structural Biology, 5(6), 1998, 470-475. Such methods and criteria can be modified for use with the GB1 scaffold domain.
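The hydrophobic core positions listed in the definition of a non-core mutation can be mapped onto the scaffold sequence with a short sketch. It assumes 1-based numbering over SEQ ID NO: 1 as given in this document; the helper names are illustrative only, and the sketch does not distinguish surface from boundary positions.

```python
# GB1 scaffold sequence (SEQ ID NO: 1) as given in the document.
GB1_SCAFFOLD = "TYKLILNGKTLKGETTTEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTE"

# Hydrophobic core positions (1-based) listed for SEQ ID NO: 1.
CORE_POSITIONS = (2, 4, 6, 19, 25, 29, 33, 38, 42, 51, 53)


def residue_at(position):
    """One-letter residue at a 1-based scaffold position."""
    return GB1_SCAFFOLD[position - 1]


def is_core(position):
    """True if the 1-based position is one of the listed core positions."""
    return position in CORE_POSITIONS


def non_core_positions():
    """All 1-based scaffold positions that are not listed as core."""
    return [p for p in range(1, len(GB1_SCAFFOLD) + 1) if not is_core(p)]


if __name__ == "__main__":
    print("core residues:", "".join(residue_at(p) for p in CORE_POSITIONS))
    print("non-core positions:", len(non_core_positions()))
```

The listed positions fall on hydrophobic or aromatic side chains (Tyr, Leu, Ala, Phe, Val and Trp), which is consistent with the stated definition of the core.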
- As used herein, the term “linking sequence” refers to a continuous sequence of amino acid residues, or analogs thereof, that connect two peptidic motifs. In certain embodiments, a linking sequence is the loop connecting two β-strands in a β-hairpin motif.
- As used herein, the term “phage display” refers to a technique by which variant peptidic compounds are displayed as fusion proteins to a coat protein on the surface of phage, e.g., filamentous phage particles. The term “phagemid” refers to a plasmid vector having a bacterial origin of replication, e.g., ColE1, and a copy of an intergenic region of a bacteriophage. The phagemid may be based on any known bacteriophage, including filamentous bacteriophage. In some instances, the plasmid will also contain a selectable marker for antibiotic resistance. Segments of DNA cloned into these vectors can be propagated as plasmids. When cells harboring these vectors are provided with all genes necessary for the production of phage particles, the mode of replication of the plasmid changes to rolling circle replication to generate copies of one strand of the plasmid DNA and package phage particles. The phagemid may form infectious or non-infectious phage particles. This term includes phagemids which contain a phage coat protein gene or fragment thereof linked to a heterologous polypeptide gene as a gene fusion such that the heterologous polypeptide is displayed on the surface of the phage particle.
- As used herein, the term “phage vector” refers to a double stranded replicative form of a bacteriophage that contains a heterologous gene and is capable of replication. The phage vector has a phage origin of replication allowing phage replication and phage particle formation. In some cases, the phage is a filamentous bacteriophage, such as an M13, f1, fd, Pf3 phage or a derivative thereof, a lambdoid phage, such as lambda, 21, phi80, phi81, 82, 424, 434, etc., or a derivative thereof, a Baculovirus or a derivative thereof, a T4 phage or a derivative thereof, a T7 phage virus or a derivative thereof.
- As used herein, the term “stable” refers to a compound that is able to maintain a folded state under physiological conditions at a certain temperature, such that it retains at least one of its normal functional activities, for example binding to a target protein. The stability of the compound can be determined using standard methods. For example, the “thermostability” of a compound can be determined by measuring the thermal melt (“Tm”) temperature. The Tm is the temperature in degrees Celsius at which half of the compounds become unfolded. In some instances, the higher the Tm, the more stable the compound.
- The compounds of the subject libraries may contain one or more asymmetric centers and may thus give rise to enantiomers, diastereomers, and other stereoisomeric forms that may be defined, in terms of absolute stereochemistry, as (R)- or (S)- or, as (D)- or (L)- for amino acids and polypeptides. The present invention is meant to include all such possible isomers, as well as their racemic and optically pure forms. When the compounds described herein contain olefinic double bonds or other centers of geometric asymmetry, and unless specified otherwise, it is intended that the compounds include both E and Z geometric isomers. Likewise, all tautomeric forms are also intended to be included.
- As used herein, the term “a target protein” refers to all members of the target family, and fragments and enantiomers thereof, and protein mimics thereof. The target proteins of interest that are described herein are intended to include all members of the target family, and fragments and enantiomers thereof, and protein mimics thereof, unless explicitly described otherwise. The target protein may be any protein of interest, such as a therapeutic or diagnostic target, including but not limited to: hormones, growth factors, receptors, enzymes, cytokines, osteoinductive factors, colony stimulating factors and immunoglobulins. The term “target protein” is intended to include recombinant and synthetic molecules, which can be prepared using any convenient recombinant expression methods or using any convenient synthetic methods, or purchased commercially, as well as fusion proteins containing a target molecule, as well as synthetic L- or D-proteins.
- As used herein, the term “protein mimic” refers to a peptidic compound that mimics a binding property of a protein of interest, e.g., a target protein. In general terms, the target protein mimic includes an essential part of the original target protein (e.g., an epitope or essential residues thereof) that is necessary for forming a potential binding surface, such that the target protein mimic and the original target protein are each capable of binding specifically to a binding moiety of interest, e.g., an antibody or a D-peptidic compound. In some embodiments, the part(s) of the original target protein that is essential for binding is displayed on a scaffold such that potential binding surface of the original target protein is mimicked. Any suitable scaffold for displaying the minimal essential part of the target protein may be used, including but not limited to antibody scaffolds, scFv, anticalins, non-antibody scaffolds, mimetics of protein secondary and tertiary structures. In some embodiments, a target protein mimic includes residues or fragments of the original target protein that are incorporated into a protein scaffold, where the scaffold mimics a structural motif of the target protein. For example, by incorporating residues of the target protein at desirable positions of a convenient scaffold, the protein mimic may present a potential binding surface that mimics that of the original target protein. In some embodiments, the native structure of the fragments of the original target protein are retained using methods of conformational constraint. Any convenient methods of conformationally constraining a peptidic compound may be used, such as but not limited to, bioconjugation, dimerization (e.g., via a linker), multimerization, or cyclization.
- GB1 peptidic libraries and methods of screening the same for the identification of compounds that specifically bind to target proteins are provided. The subject libraries include a plurality of GB1 peptidic compounds, where each GB1 peptidic compound has a scaffold domain of the same structural motif as the B1 domain of Protein G (GB1), where the structural motif of GB1 is characterized by a motif that includes an arrangement of four β-strands and one α-helix around a hydrophobic core (also referred to as a 4β-1α motif). The GB1 peptidic compounds of the subject libraries include mutations at non-core positions, e.g., variant amino acids at positions within a GB1 scaffold domain that are not part of the hydrophobic core of the structure. A 4β-1α motif is depicted in FIG. 1.
- A variety of libraries of GB1 peptidic compounds are provided. For library diversity, both the positions of the mutations and the nature of the mutation at each variable position of the scaffold may be varied. In some instances, the mutations are included at non-core positions, although mutations at core positions may also be included. The mutations may confer different functions on the resulting GB1 peptidic compounds, such as specific binding to a target molecule. The mutations may be selected at positions of a GB1 scaffold domain that are solvent exposed such that the variant amino acids at these positions can form part of a potential target molecule binding surface, although mutations at selected core and/or boundary positions may also be included. In a subject library, the mutations may be concentrated in a variable domain that defines one of several distinct potential binding surfaces of the GB1 scaffold domain. Libraries of GB1 peptidic compounds are provided that include distinct arrangements of mutations concentrated at various surfaces of the 4β-1α motif, for example, as depicted in FIGS. 2A-2B. The subject libraries may include compounds that specifically bind to a target molecule via one of the several potential binding sites of the GB1 scaffold domain. Mutations may be included at the potential binding surface to provide for specific binding to a target molecule without significantly disrupting the GB1 peptidic structure.
- In the subject methods, a GB1 peptidic library is contacted with a target molecule to screen for a compound of the library that specifically binds to the target with high affinity. The subject methods and libraries find use in a variety of applications, including screening applications.
- Before certain embodiments are described in greater detail, it is to be understood that this invention is not limited to certain embodiments described, as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing certain embodiments only, and is not intended to be limiting, since the scope of the present invention will be limited only by the appended claims.
- Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or intervening value in that stated range, is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.
- Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present invention, representative illustrative methods and materials are now described.
- All publications and patents cited in this specification are herein incorporated by reference as if each individual publication or patent were specifically and individually indicated to be incorporated by reference and are incorporated herein by reference to disclose and describe the methods and/or materials in connection with which the publications are cited. The citation of any publication is for its disclosure prior to the filing date and should not be construed as an admission that the present invention is not entitled to antedate such publication by virtue of prior invention. Further, the dates of publication provided may be different from the actual publication dates which may need to be independently confirmed.
- It is noted that, as used herein and in the appended claims, the singular forms “a”, “an”, and “the” include plural referents unless the context clearly dictates otherwise. It is further noted that the claims may be drafted to exclude any optional element. As such, this statement is intended to serve as antecedent basis for use of such exclusive terminology as “solely,” “only” and the like in connection with the recitation of claim elements, or use of a “negative” limitation.
- Each of the individual embodiments described and illustrated herein has discrete components and features which may be readily separated from or combined with the features of any of the other several embodiments without departing from the scope or spirit of the present invention. Any recited method can be carried out in the order of events recited or in any other order which is logically possible.
- In further describing the various aspects of the invention, the structures and sequences of members of the various libraries are described first in greater detail, followed by a description of methods of screening and applications in which the libraries find use.
- As summarized above, aspects of the invention include libraries of GB1 peptidic compounds where each GB1 peptidic compound has a scaffold domain of the same structural motif as the B1 domain of Protein G (GB1), where the structural motif of GB1 is characterized by a motif that includes an arrangement of four β-strands and one α-helix (also referred to as a 4β-1α motif) around a hydrophobic core. The GB1 peptidic compounds of the subject libraries include mutations at various non-core positions of the 4β-1α motif, e.g., variant amino acids at non-core positions within a GB1 scaffold domain. In many embodiments, the four β-strands and one α-helix motifs of the structure are arranged in a hairpin-helix-hairpin motif, e.g., β1-β2-α1-β3-β4 where β1-β4 are β-strand motifs and α1 is a helix motif. A GB1 peptidic hairpin-helix-hairpin motif is depicted in FIG. 1.
- A GB1 scaffold domain may be any polypeptide, or fragment thereof, that includes the 4β-1α motif, whether naturally occurring or synthetic. The GB1 scaffold domain may be a native sequence of a member of the IgG binding B domain protein family, an IgG binding B domain sequence with pre-existing amino acid sequence modifications (such as additions, deletions and/or substitutions), or a fragment or analogue thereof. GB1 scaffold domains include those described in the following references: Gronenborn et al., FEBS Letters 398 (1996), 312-316; Kotz et al., Eur. J. Biochem. 271, 1623-1629 (2004); Malakauskas et al., Nature Structural Biology, 5(6), 1998, p. 470-475; Minor Jr. et al., Nature, 367, 1994, 660-663; Nauli et al. Nature Structural Biology, 8(7), 2001, 602-605; Smith et al., Biochemistry, 1994, 33, 5510-5517; Wunderlich et al. J. Mol. Biol. (2006) 363, 545-557; and analogs or fragments thereof; and those scaffolds described in the definitions section above. In certain embodiments, a GB1 scaffold domain has an amino acid sequence as set forth in one of SEQ ID NOs: 1 and 227-261. In certain embodiments, a GB1 scaffold domain includes a sequence having 60% or more amino acid sequence identity, such as 70% or more, 80% or more, 90% or more, 95% or more or 98% or more amino acid sequence identity to an amino acid sequence set forth in one of SEQ ID NOs: 1 and 227-261. A GB1 scaffold domain sequence may include 1 or more, such as 2 or more, 3 or more, 4 or more, 5 or more, 10 or more, 15 or more, or even 20 or more additional peptidic residues compared to a native IgG binding B domain sequence. Alternatively, a GB1 scaffold domain sequence may include fewer peptidic residues compared to a native IgG binding B domain sequence, such as 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10, or even fewer residues.
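By way of illustration only, the sequence identity thresholds recited above can be checked with a minimal sketch such as the following. The function name `percent_identity` is hypothetical, and the sketch assumes that the two sequences have already been aligned to equal length and that identity is counted over all aligned columns; other conventions for the denominator exist and this specification does not mandate one here. The two sequences used are SEQ ID NO: 1 and SEQ ID NO: 258 as listed in this specification, with the alignment gap characters removed.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned columns that hold the same residue.

    Assumptions (not taken from the specification): both sequences are
    pre-aligned to equal length, and any column containing a gap
    character ('.' or '-') is counted as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    gaps = {".", "-"}
    matches = sum(
        1 for a, b in zip(seq_a, seq_b)
        if a == b and a not in gaps
    )
    return 100.0 * matches / len(seq_a)


# SEQ ID NO: 1 and SEQ ID NO: 258, gap characters removed.
SEQ_1 = "TYKLILNGKTLKGETTTEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTE"
SEQ_258 = "TYKLVINGKTLKGETTTKAVDAETAEKAFKQYANDNGVDGVWTYDDATKTFTVTE"

identity = percent_identity(SEQ_1, SEQ_258)
meets_60_percent = identity >= 60.0  # lowest threshold recited above
```

The same function may be applied to any of the recited thresholds (70%, 80%, 90%, 95% or 98%) by changing the comparison value.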
- Exemplary GB1 scaffold domain sequences from the Wellcome Trust Sanger Institute Pfam database are shown in the following sequence alignments:
-
B4U242_STREM/244-298 (SEQ ID NO: 227) ...S.YKLVIKGATFSGETATKAVDAAVAEQ.TFRDYANKNGVDGVWAYDAATKTFTVTE...
B4U242_STREM/316-370 (SEQ ID NO: 228) ....TYRLVIKGVTFSGETATKAVDAATAEQ.TFRQYANDNGITGEWAYDTATKTFTVTE...
C0MA37_STRE4/228-282 (SEQ ID NO: 229) ...S.YKLVIKGATFSGETATKAVDAAVAEQ.TFRDYANKNGVDGVWAYDAATKTFTVTE...
C0MA37_STRE4/300-354 (SEQ ID NO: 230) ....TYRLVIKGVTFSGETATKAVDAATAEQ.TFRQYANDNGVTGEWAYDAATKTFTVTE...
C0MCK9_STRS7/228-282 (SEQ ID NO: 231) ...S.YKLVIKGATFSGETATKAVDAAVAEQ.TFRDYANKNGVDGVWAYDAATKTFTVTE...
C0MCK9_STRS7/300-354 (SEQ ID NO: 232) ....TYRLVIKGVTFSGETSTKAVDAATAEQ.TFRQYANDNGVTGEWAYDAATKTFTVTE...
Q1JGB6_STRPD/117-137 (SEQ ID NO: 233) ANIP........................AEK.AFRQYANDNGVDGV.................
Q53291_PEPMA/330-384 (SEQ ID NO: 234) ....TYKLILNGKTLKGETTTEAVDAATAEK.VFKQYANDNGVDGEWTYDDATKTFTVTE...
Q53291_PEPMA/400-454 (SEQ ID NO: 235) ....TYKLVINGKTLKGETTTKAVDAETAEK.AFKQYANDNGVDGVWTYDDATKTFTVTE...
Q53337_9STRE/3-57 (SEQ ID NO: 236) ....TYKLVINGKTLKGETTTKTVDAETAEK.AFKQYANDNGVDGVWTYDDATKTFTVTE...
Q53974_STRDY/258-312 (SEQ ID NO: 237) ....TYKLVINGKTLKGETTTKAVDAETAEK.AFKQYANENGVDGVWTYDDATKTFTVTE...
Q53975_STRDY/224-278 (SEQ ID NO: 238) ....TYKLVVKGNTFSGETTTKAIDTATAEK.EFKQYATANNVDGEWSYDDATKTFTVTE...
Q53975_STRDY/294-348 (SEQ ID NO: 239) ....TYKLIVKGNTFSGETTTKAVDAETAEK.AFKQYATANNVDGEWSYDDATKTFTVTE...
Q53975_STRDY/364-418 (SEQ ID NO: 240) ....TYKLIVKGNTFSGETTTKAIDAATAEK.EFKQYATANGVDGEWSYDDATKTFTVTE...
Q53975_STRDY/434-488 (SEQ ID NO: 241) ....TYKLIVKGNTFSGETTTKAVDAETAEK.AFKQYANENGVYGEWSYDDATKTFTVTE...
Q53975_STRDY/504-558 (SEQ ID NO: 242) ....TYKLVINGKTLKGETTTKAVDAETAEK.AFKQYANENGVDGVWTYDDATKTFTVTE...
Q54181_STRSG/1-45 (SEQ ID NO: 243) ..............MKGETTTEAVDAATAEK.VFKQYANDNGVDGEWTYDDATKTFTVTE...
Q54181_STRSG/131-185 (SEQ ID NO: 244) ....TYKLVINGKTLKGETTTKAVDAETAEK.AFKQYANDNGVDGVWTYDDATKTFTVTE...
Q54181_STRSG/61-115 (SEQ ID NO: 245) ....TYKLVINGKTLKGETTTEAVDAATAEK.VFKQYANDNGVDGEWTYDDATKTFTVTE...
Q56192_STAXY/238-290 (SEQ ID NO: 246) ....TYKLILNGKTLKGETTTEAVDAATARSFNFPILENSSSVPGDPLESTCMH......VEH
Q56193_STAXY/238-293 (SEQ ID NO: 247) ....TYKLILNGKTLKGETTTEAVDAATARSFNFPILENSSSVPGDPLESTCRHASFAQA...
Q56212_STRSZ/228-282 (SEQ ID NO: 248) ...S.YKLVIKGATFSGETATKAVDAAVAEQ.TFRDYANKNGVDGVWAYDAATKTFTVTE...
Q56212_STRSZ/300-354 (SEQ ID NO: 249) ....TYRLVIKGVTFSGETATKAVDAATAEQ.AFRQYANDNGVTGEWAYDAATKTFTVTE...
Q76K19_STRSZ/232-286 (SEQ ID NO: 250) ...S.YKLVIKGATFSGETATKAVDAAVAEQ.TFRDYANKNGVDGVWAYDAATKTFTVTE...
Q76K19_STRSZ/304-358 (SEQ ID NO: 251) ....TYRLVIKGVTFSGETATKAVDAATAEQ.TFRQYANDNGITGEWAYDTATKTFTVTE...
Q93EM8_STRDY/224-278 (SEQ ID NO: 252) ....TYKLVVKGNTFSGETTTKAIDTATAEK.EFKQYATANNVDGEWSYDDATKTFTVTE...
Q93EM8_STRDY/294-348 (SEQ ID NO: 253) ....TYKLIVKGNTFSGETTTKAIDAATAEK.EFKQYATANNVDGEWSYDYATKTFTVTE...
Q93EM8_STRDY/364-418 (SEQ ID NO: 254) ....TYKLIVKGNTFSGETTTKAIDAATAEK.EFKQYATANNVDGEWSYDDATKTFTVTE...
Q93EM8_STRDY/434-488 (SEQ ID NO: 255) ....TYKLIVKGNTFSGETTTKAVDAETAEK.AFKQYATANNVDGEWSYDDATKTFTVTE...
Q93EM8_STRDY/504-558 (SEQ ID NO: 256) ....TYKLVINGKTLKGETTTKAVDVETAEK.AFKQYANENGVDGVWTYDDATKTFTVTE...
SPG1_STRSG/228-282 (SEQ ID NO: 257) ....TYKLILNGKTLKGETTTEAVDAATAEK.VFKQYANDNGVDGEWTYDDATKTFTVTE...
SPG1_STRSG/298-352 (SEQ ID NO: 258) ....TYKLVINGKTLKGETTTKAVDAETAEK.AFKQYANDNGVDGVWTYDDATKTFTVTE...
SPG2_STRSG/303-357 (SEQ ID NO: 259) ....TYKLILNGKTLKGETTTEAVDAATAEK.VFKQYANDNGVDGEWTYDDATKTFTVTE...
SPG2_STRSG/373-427 (SEQ ID NO: 260) ....TYKLVINGKTLKGETTTEAVDAATAEK.VFKQYANDNGVDGEWTYDDATKTFTVTE...
SPG2_STRSG/443-497 (SEQ ID NO: 261) ....TYKLVINGKTLKGETTTKAVDAETAEK.AFKQYANDNGVDGVWTYDDATKTFTVTE...
- In some embodiments, the GB1 scaffold domain is described by the following sequence: (T/S)Y(K/R)L(Z1)(Z1)(N/K)G(K/N/V/A)T(L/F)(K/S)GET(T/A/S)T(K/E)(A/T)(V/I)D(A/T/V)(A/E)(T/V)AE(K/Q)(A/E/T/V)F(K/R)(Q/D)YA(N/T)(A/D/E/K)N(G/N)(Z3)(D/T)G(E/V)W(A/T/S)YD(D/A/Y/T)ATKT(Z1)T(Z1)TE (SEQ ID NO:262) where each Z1 is independently a hydrophobic residue.
In some embodiments, the GB1 scaffold domain is described by the following sequence: (T/S)Y(K/R)L(I/V)(L/I/V)(N/K)G(K/N/V/A)T(L/F)(K/S)GET(T/A/S)T(K/E)(A/T)(V/I)D(A/T/V)(A/E)(T/V)AE(K/Q)(A/E/T/V)F(K/R)(Q/D)YA(N/T)(A/D/E/K)N(G/N)(V/I)(D/T)G(E/V)W(A/T/S)YD(D/A/Y/T)ATKTFTVTE (SEQ ID NO:263). In certain embodiments, the GB1 scaffold domain is described by the following sequence: TYKL(I/V)(L/I/V)(N/K)G(K/N)T(L/F)(K/S)GET(T/A)T(K/E)AVD(A/T/V)(A/E)TAE(K/Q)(A/E/T/V)F(K/R)QYA(N/T)(A/D/E/K)N(G/N)VDG(E/V)W(A/T/S)YD(D/A)ATKTFTVTE (SEQ ID NO:264). A mutation in a scaffold domain may include a deletion, insertion, or substitution of an amino acid residue at any convenient position to produce a sequence that is distinct from the reference scaffold domain sequence.
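The parenthesized notation used in the foregoing sequences, in which (A/B/C) denotes alternative residues at a single position, may be read mechanically. The following is a minimal sketch, assuming that each parenthesized group maps to one position; the helper name `motif_to_regex` is hypothetical and the sketch does not handle the Z1, Z2 and Z3 placeholders. It converts SEQ ID NO:264 into a regular expression and tests SEQ ID NO: 1 against it.

```python
import re


def motif_to_regex(motif: str) -> re.Pattern:
    """Turn '(A/B/C)' alternatives into regex character classes.

    Hypothetical helper for illustration. Whitespace is dropped first.
    Placeholder symbols such as Z1 are rejected rather than guessed at.
    """
    compact = "".join(motif.split())
    if re.search(r"Z\d", compact):
        raise ValueError("placeholder residues (Z1, Z2, Z3) are not supported")
    body = re.sub(
        r"\(([A-Z](?:/[A-Z])*)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        compact,
    )
    return re.compile(body)


# SEQ ID NO:264 as recited above, split only for line length.
SEQ_264_MOTIF = (
    "TYKL(I/V)(L/I/V)(N/K)G(K/N)T(L/F)(K/S)GET(T/A)T(K/E)AVD(A/T/V)(A/E)TAE"
    "(K/Q)(A/E/T/V)F(K/R)QYA(N/T)(A/D/E/K)N(G/N)VDG(E/V)W(A/T/S)YD(D/A)"
    "ATKTFTVTE"
)
SEQ_1 = "TYKLILNGKTLKGETTTEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTE"

pattern = motif_to_regex(SEQ_264_MOTIF)
seq_1_is_described = pattern.fullmatch(SEQ_1) is not None
```

A full match is required so that sequences with additional or missing residues are not reported as described by the motif.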
- In some embodiments, the GB1 scaffold domain is described by the following sequence: T(Z2)K(Z1)(Z1)(Z1)(N/V)(G/L/I)(K/G)(Q/T/D)(L/A/R)(K/V)(G/E/V)(E/V)(A/T/R/I/P/V)(T/I)(R/W/L/K/V/T/I)E(A/L/I)VDA(A/G)(T/E)(A/V/F)EK(V/I/Y)(F/L/W/I/A)K(L/Q)(Z1)(Z3)N(A/D)(K/N)(T/G)(V/I)(E/D)G(V/E)(W/F)TY(D/K)D(E/A)(T/I)KT(Z1)T(Z1)TE (SEQ ID NO:265), where each Z1 is independently a hydrophobic residue, Z2 is an aromatic hydrophobic residue, and Z3 is a non-aromatic hydrophobic residue.
- In some embodiments, the GB1 scaffold domain is described by the following sequence:
-
(SEQ ID NO: 266) T(Y/F/W/A)K(L/V/I/M/F/Y/A)(L/V/I/F/M)(L/V/I/F/M/A/Y/S)(N/V)(G/L/I)(K/G)(Q/T/D)(L/A/R)(K/V)(G/E/V)(E/V)(A/T/R/I/P/V)(T/I)(R/W/L/K/V/T/I)E(A/L/I)VDA(A/G)(T/E)(A/V/F)EK(V/I/Y)(F/L/W/I/A)K(L/Q)(W/F/L/M/Y/I)(L/V/I/A)N(A/D)(K/N)(T/G)(V/I)(E/D)G(V/E)(W/F)TY(D/K)D(E/A)(T/I)KT(L/V/I/F/M/W)T(L/V/I/F/M)TE.
- The subject libraries are designed to maximize diversity while minimizing structural perturbations of the GB1 scaffold domain. The positions to be mutated are selected to ensure that the GB1 peptidic compounds of the subject libraries can maintain a folded state under physiological conditions. Another aspect of generating diversity in the subject libraries is the selection of amino acid positions to be mutated such that the amino acids can form a potential binding surface in the GB1 scaffold domain, whether or not the residues actually contact a target protein. One way of determining whether an amino acid position is part of a potential binding surface involves examining the three dimensional structure of the GB1 scaffold domain, using a computer program such as the UCSF Chimera program. Other ways include crystallographic and genetic mutational analysis. Any convenient method may be used to determine whether an amino acid position is part of a potential binding surface.
- The mutations may be found at positions in the GB1 scaffold domain where the amino acid residue is at least in part solvent exposed. Solvent exposed positions can be determined using software suitable for protein modeling and three-dimensional structural information obtained from a crystal structure. For example, solvent exposed residues may be determined using the Protein Data Bank (PDB) structure 3GB1 and estimating the solvent accessible surface area (SASA) for each residue using the GETarea tool (Fraczkiewicz & Braun, “Exact and efficient analytical calculation of the accessible surface areas and their gradients for macromolecules,” J. Comput. Chem. 1998, 19, 319-333). This tool calculates the ratio of the SASA in the structure to the SASA in a random coil. A ratio of 0.4 was used in selecting the following solvent accessible residues (shown in bold): TYKLILNGKTLKGETTTEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTE (SEQ ID NO:1).
- The mutations of the GB1 scaffold domain may be concentrated at one of several different potential binding surfaces of the scaffold domain. Several distinct arrangements of mutations of the GB1 scaffold domain at non-core positions of the hairpin-helix-hairpin scaffold domain are provided. In some instances, the majority of the mutations are at non-core positions of the GB1 scaffold domain (e.g., solvent exposed or boundary positions); however, in some cases one or more mutations may be located at hydrophobic core positions. In certain embodiments, mutations at hydrophobic core positions may be tolerated without significantly disrupting the GB1 scaffold structure, such as when those core mutations are selected in a loop region. In such cases the loop region may form a structure or conformation that is different from that of the parent scaffold.
- In certain embodiments, the GB1 scaffold may have loop regions that are independently selected from any one of the loop sequences set forth in Table 1: (SEQ ID NOs: 67-196 and 267-272). Any of the loop sequences 1-4 of Table 1 may be incorporated at the positions indicated in Table 1 into any convenient GB1 scaffold domain (e.g., SEQ ID NO: 1) to produce another GB1 scaffold domain.
- In certain embodiments, mutations at boundary positions may also be tolerated without significantly disrupting the GB1 scaffold structure. Mutations at such positions may confer desirable properties upon the resulting GB1 compound variants, such as stability, a certain structural property, or specific binding to a target molecule.
- The positions of the mutations in the GB1 scaffold domain may be described herein either by reference to a structural motif or region, or by reference to a position number in the primary sequence of the scaffold domain. FIG. 3 illustrates the alignment of the position numbering scheme for a GB1 scaffold domain relative to its β1, β2, α1, β3 and β4 motifs, and relative to the mutations of certain libraries of the invention. Positions marked with an asterisk are exemplary positions at which mutations that include the insertion of one or more amino acids may be included. Any GB1 scaffold domain sequence may be substituted for the scaffold sequence depicted in FIG. 3, and the positions of the mutations that define a subject library may be transferred from one scaffold to another by any convenient method. For example, a sequence alignment method may be used to place any GB1 scaffold domain sequence within the framework of the position numbering scheme illustrated in FIG. 3. Alignment methods based on structural motifs such as β-strands and α-helices may also be used to place a GB1 scaffold domain sequence within the framework of the position numbering scheme illustrated in FIG. 3.
- In some cases, a first GB1 scaffold domain sequence may be aligned with a second GB1 scaffold domain sequence that is one or more amino acids longer or shorter. For example, the second GB1 scaffold domain may have one or more additional amino acids at the N-terminal or C-terminal relative to the first GB1 scaffold, or may have one or more additional amino acids in one of the loop regions of the structure. In such cases, a numbering scheme such as is described below for insertion mutations may be used to relate two scaffold domain sequences.
- Another aspect of the diversity of the subject libraries is the size of the library, i.e., the number of distinct compounds of the library. In some embodiments, a subject library includes 50 or more distinct compounds, such as 100 or more, 300 or more, 1×10³ or more, 1×10⁴ or more, 1×10⁵ or more, 1×10⁶ or more, 1×10⁷ or more, 1×10⁸ or more, 1×10⁹ or more, 1×10¹⁰ or more, 1×10¹¹ or more, or 1×10¹² or more, distinct compounds.
- A subject library may include GB1 peptidic compounds each having a hairpin-helix-hairpin scaffold domain described by formula (I):
-
P1-α1-P2 (I)
- where P1 and P2 are independently beta-hairpin domains and α1 is a helix domain and P1, α1 and P2 are connected independently by linking sequences of between 1 and 10 residues in length. In some embodiments, in formula (I), P1 is β1-β2 and P2 is β3-β4 such that the compounds are described by formula (II):
-
β1-β2-α1-β3-β4 (II)
- where β1, β2, β3 and β4 are independently beta-strand domains and α1 is a helix domain, and β1, β2, α1, β3 and β4 are connected independently by linking sequences of between 1 and 10 residues in length, such as, between 2 and 8 residues, or between 3 and 6 residues in length. In certain embodiments, each linking sequence is independently of 3, 4, 5, 6, 7 or 8 residues in length, such as 4 or 5 residues in length.
- In certain embodiments, the linking sequences may form a loop or a turn structure. For example, the two antiparallel β-strands of a hairpin motif may be connected via a loop. Mutations in a linking sequence that include insertion or deletion of one or more amino acid residues may be tolerated without significantly disrupting the GB1 scaffold structure. In some embodiments, in formulas (I) and (II), each compound of the subject library includes mutations in one or more linking sequences. In certain embodiments, 80% or more, 90% or more, 95% or more, or even 100% of the mutations are at positions within the regions of the linking sequences. In certain embodiments, in formulas (I) and (II), at least one of the linking sequences is one or more (e.g., 2 or more) residues longer than the corresponding linking sequence of the GB1 scaffold. In certain embodiments, in formulas (I) and (II), at least one of the linking sequences is one or more residues shorter than the corresponding linking sequence of the GB1 scaffold.
- In some embodiments, one or more positions in the scaffold may be selected as positions at which to include insertion mutations, e.g., mutations that include the insertion of 1 or 2 additional amino acid residues in addition to the amino acid residue being substituted. In certain embodiments, the insertion mutations are selected for inclusion in one or more loop regions, or at the N-terminal or C-terminal of the scaffold. The positions of the variant amino acids that are inserted may be referred to using a letter designation with respect to the numbered position of the mutation, e.g., an insertion mutation of 2 amino acids at position 38 may be referred to as positions 38a and 38b.
- In certain embodiments, the subject library includes a mutation at position 38 that includes insertion of 0, 1 or 2 variant amino acids. In certain embodiments, the subject library includes a mutation at position 19 that includes insertion of 0, 1 or 2 variant amino acids. In certain embodiments, the subject library includes a mutation at position 1 that includes insertion of 2 variant amino acids, and at positions 19 and 47 that each include insertion of 0, 1 or 2 variant amino acids. In certain embodiments, the subject library includes mutations at positions 9 and 38 that each include insertion of 0, 1 or 2 variant amino acids, and at position 55 that includes insertion of 1 variant amino acid. In certain embodiments, the subject library includes a mutation at position 9 that includes insertion of 0, 1 or 2 variant amino acids, and at position 55 that includes insertion of 1 variant amino acid. In certain embodiments, the subject library includes a mutation at position 1 that includes insertion of 1 variant amino acid and at position 47 that includes insertion of 0, 1 or 2 variant amino acids.
- In some cases, when an insertion mutation (e.g., insertion of one or more additional variant amino acids) is made in a GB1 scaffold, the resulting GB1 compound variants may be aligned with the parent GB1 scaffold in different ways. For example, an insertion mutation including 2 additional variant amino acids at position 38 of the GB1 scaffold may lead to GB1 compound variants where the loop regions between the α1 and β3 regions can be aligned with the GB1 scaffold domain in two or more distinct ways. In other words, the resulting GB1 compounds may encompass various distinct loop sequences and/or structures that align differently with the parent GB1 scaffold domain. In some cases, the various distinct loop sequences are produced when the insertion mutation is in a variable loop region (e.g., where most of the loop region is being mutated).
- In some embodiments, each compound of a subject library includes 4 or more, such as, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 11 or more, 12 or more, 13 or more, 14 or more, or 15 or more mutations at different positions of a hairpin-helix-hairpin scaffold domain. The mutations may involve the deletion, insertion, or substitution of the amino acid residue at the position of the scaffold being mutated. The mutations may include substitution with any naturally or non-naturally occurring amino acid, or an analog thereof.
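The letter designation described above for inserted residues (e.g., positions 38a and 38b for an insertion of 2 amino acids at position 38) can be illustrated with a minimal sketch. The function name `insertion_labels` is hypothetical, and listing the substituted residue itself under the unlettered position number is an assumption made here for illustration only.

```python
from typing import List


def insertion_labels(position: int, n_inserted: int) -> List[str]:
    """Position labels for a mutated position plus its inserted residues.

    Assumption for illustration: the substituted residue keeps the plain
    position number, and each inserted residue takes the next letter.
    """
    if not 0 <= n_inserted <= 26:
        raise ValueError("n_inserted must be between 0 and 26")
    letters = [chr(ord("a") + i) for i in range(n_inserted)]
    return [str(position)] + [f"{position}{letter}" for letter in letters]


# A library with insertion of 0, 1 or 2 variant amino acids at position 38
# of a 55-residue scaffold contains members of three different lengths.
SCAFFOLD_LENGTH = 55
labels_by_insert = {n: insertion_labels(38, n) for n in (0, 1, 2)}
member_lengths = [SCAFFOLD_LENGTH + n for n in (0, 1, 2)]
```

Keeping the parent numbering and adding letters in this way leaves the position numbers of all other residues unchanged, so members of different lengths can still be related to the same scaffold.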
- In some embodiments, each compound of a subject library includes 3 or more different non-core mutations, such as, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 11 or more, or 12 or more different non-core mutations in a region outside of the β1-β2 region.
- In some embodiments, each compound of a subject library includes 3 or more different non-core mutations, such as, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more or 11 or more different non-core mutations in the α1 region.
- In some embodiments, each compound of a subject library includes 3 or more different non-core mutations, such as 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more or 10 or more different non-core mutations in the β3-β4 region.
- In some embodiments, each compound of a subject library includes 5 or more different non-core mutations, such as 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 11 or more, or 12 or more different non-core mutations in the α1-β3 region.
- In certain embodiments, each compound of a subject library includes ten or more different mutations, where the ten or more different mutations are located at positions selected from the group consisting of positions 21-24, 26, 27, 30, 31, 34, 35, 37-41.
- In certain embodiments, each compound of a subject library includes ten or more different mutations, where the ten or more different mutations are located at positions selected from the group consisting of positions 18-24, 26-28, 30-32, 34 and 35.
- In certain embodiments, each compound of a subject library includes ten or more different mutations, where the ten or more different mutations are located at positions selected from the group consisting of positions 1, 18-24 and 45-49. In certain embodiments, each compound of a subject library includes ten or more different mutations, where the ten or more different mutations are located at positions selected from the group consisting of positions 7-12, 36-41, 54 and 55.
- In certain embodiments, each compound of a subject library includes ten or more different mutations, where the ten or more different mutations are located at positions selected from the group consisting of positions 3, 5, 7-14, 16, 52, 54 and 55.
- In certain embodiments, each compound of a subject library includes ten or more different mutations, where the ten or more different mutations are located at positions selected from the group consisting of positions 1, 3, 5, 7, 41, 43, 45-50, 52 and 54.
- In certain embodiments, each compound of a subject library includes five or more different mutations in the α1 region. In certain embodiments, the five or more different mutations are located at positions selected from the group consisting of positions 22-24, 26, 27, 30, 31, 34 and 35.
- In certain embodiments, each compound of a subject library includes ten or more different mutations in the α1 region. In certain embodiments, the ten or more different mutations are located at positions selected from the group consisting of positions 22-24, 26, 27, 28, 30, 31, 32, 34 and 35.
- In certain embodiments, each compound of a subject library includes three or more different mutations in the β3-β4 region. In certain embodiments, the three or more different mutations are located at positions selected from the group consisting of positions 41, 54 and 55. In certain embodiments, the three or more different mutations are located at positions selected from the group consisting of positions 52, 54 and 55.
- In certain embodiments, each compound of a subject library includes five or more different mutations in the β3-β4 region. In certain embodiments, the five or more different mutations are located at positions selected from the group consisting of positions 45-49. In certain embodiments, each compound of a subject library includes nine or more different mutations in the β3-β4 region. In certain embodiments, the nine or more different mutations are located at positions selected from the group consisting of positions 41, 43, 45-50, 52 and 54.
- In certain embodiments, each compound of a subject library includes two or more different mutations in the region between the α1 and β3 regions, e.g., mutations in the linking sequence between α1 and β3. In certain embodiments, the two or more different mutations are located at positions selected from the group consisting of positions 37-40.
- In certain embodiments, each compound of a subject library includes three or more, four or more, five or more, six or more, or ten or more different mutations in the β1-β2 region. In certain embodiments, the ten or more different mutations in the β1-β2 region are located at positions selected from the group consisting of positions 3, 5, 7-14 and 16.
- In some embodiments, each compound of a subject library is described by a formula independently selected from the group consisting of:
-
F1-V1-F2 (III);
-
F3-V2-F4 (IV);
-
V3-F5-V4-F6-V5-F7 (V);
-
F8-V6-F9-V7-F10-V8 (VI);
-
V9-F11-V10 (VII); and
-
V11-F12-V12 (VIII)
- where F1, F2, F3, F4, F5, F6, F7, F8, F9, F10, F11 and F12 are fixed regions and V1, V2, V3, V4, V5, V6, V7, V8, V9, V10, V11 and V12 are variable regions;
- where each fixed region is common to all compounds of the same formula and each compound of the library has a distinct variable region.
- In certain embodiments, each compound of a subject library is described by formula (III), where:
- F1 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to the amino acid sequence TYKLILNGKTLKGETTTEA (SEQ ID NO: 2);
- F2 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence TYDDATKTFTVTE (SEQ ID NO: 3); and
- V1 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to the amino acid sequence VDAATAEKVFKQYANDNGVDGEW (SEQ ID NO: 4), where each compound of the library comprises 10 or more mutations (e.g., 11, 12, 13, 14 or 15 or more mutations) in the V1 variable region.
- In certain embodiments, in formula (III), V1 comprises a sequence of the following formula: VXXXXAXXVFXXYAXXNXXXXXW (SEQ ID NO: 5), where each X is a variant amino acid.
- In certain embodiments, in formula (III), F1 comprises the sequence TYKLILNGKTLKGETTTEA (SEQ ID NO: 2), F2 comprises the sequence TYDDATKTFTVTE (SEQ ID NO: 3), and V1 comprises a sequence of the following formula: VXXXXAXXVFXXYAXXNXXXXXW (SEQ ID NO: 6) where each X is independently selected from the group consisting of A, D, F, S, V and Y.
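As a non-limiting illustration of formula (III), the following minimal sketch locates the variable positions of the V1 template (SEQ ID NO: 6) in scaffold numbering and counts the number of distinct V1 sequences the template can encode. It assumes that V1 begins immediately after the 19 residues of F1 (i.e., at position 20), that each X is chosen independently from the six recited residues, and that no insertions are made; the function names are hypothetical. The count is a theoretical upper bound on sequence diversity and says nothing about the size of any library actually prepared.

```python
from typing import List

F1 = "TYKLILNGKTLKGETTTEA"               # SEQ ID NO: 2
V1_TEMPLATE = "VXXXXAXXVFXXYAXXNXXXXXW"  # SEQ ID NO: 6
ALPHABET = "ADFSVY"                      # residues allowed at each X


def variable_positions(template: str, first_position: int) -> List[int]:
    """Scaffold position numbers of every X in the template."""
    return [first_position + i for i, c in enumerate(template) if c == "X"]


def theoretical_diversity(template: str, alphabet_size: int) -> int:
    """Number of distinct sequences if each X varies independently."""
    return alphabet_size ** template.count("X")


# Assumption: V1 starts right after F1, so its first residue is position 20.
positions = variable_positions(V1_TEMPLATE, len(F1) + 1)
diversity = theoretical_diversity(V1_TEMPLATE, len(ALPHABET))
```

Under these assumptions the X positions fall at positions 21-24, 26, 27, 30, 31, 34, 35 and 37-41, which agrees with one of the position groups recited earlier in this description.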
- In certain embodiments, in formula (III), the mutation at position 19 of V1 includes insertion of 0, 1 or 2 variant amino acids.
- In certain embodiments, each compound of a subject library is described by formula (IV), where:
- F3 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to the amino acid sequence TYKLILNGKTLKGETT (SEQ ID NO: 7);
- F4 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence GVDGEWTYDDATKTFTVTE (SEQ ID NO: 8); and
- V2 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to the amino acid sequence TEAVDAATAEKVFKQYANDN (SEQ ID NO: 9), where each compound of the library comprises 10 or more mutations (e.g., 11, 12, 13, 14 or 15 or more mutations) in the V2 variable region.
- In certain embodiments, in formula (IV), V2 comprises a sequence of the formula: TXXXXXXXAXXXFXXXAXXN (SEQ ID NO: 10), where each X is a variant amino acid.
- In certain embodiments, in formula (IV), F3 comprises the sequence TYKLILNGKTLKGETT (SEQ ID NO: 7), F4 comprises the sequence GVDGEWTYDDATKTFTVTE (SEQ ID NO: 8), and V2 comprises a sequence of the formula: TXXXXXXXAXXXFXXXAXXN (SEQ ID NO: 11) where each X is independently selected from the group consisting of A, D, F, S, V and Y.
- In certain embodiments, in formula (IV), the mutation at position 3 of V2 includes insertion of 0, 1 or 2 variant amino acids.
- In certain embodiments, each compound of a subject library is described by formula (V), where:
- F5 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to the amino acid sequence KLILNGKTLKGETT (SEQ ID NO: 12);
- F6 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence EKVFKQYANDNGVDGEWT (SEQ ID NO: 13);
- F7 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence FTVTE (SEQ ID NO: 14);
- V3 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence TY;
- V4 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence TEAVDAATA (SEQ ID NO: 15); and
- V5 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence YDDATKT (SEQ ID NO: 16);
- where each compound of the library comprises one or more mutations in the V3 variable region, 3 or more mutations (e.g., 4, 5, 6 or 7 or more mutations) in the V4 variable region, and 3 or more mutations (e.g., 4 or 5 or more mutations) in the V5 variable region.
- In certain embodiments, in formula (V), V3 comprises a sequence of the formula XY, V4 comprises a sequence of the formula TXXXXXXXA (SEQ ID NO: 17), and V5 comprises a sequence of the formula YXXXXXT (SEQ ID NO: 18) where each X is a variant amino acid.
- In certain embodiments, in formula (V), F5 comprises the sequence KLILNGKTLKGETT (SEQ ID NO: 12), F6 comprises the sequence EKVFKQYANDNGVDGEWT (SEQ ID NO: 13), F7 comprises the sequence FTVTE (SEQ ID NO: 14), V3 comprises a sequence of the formula XY, V4 comprises a sequence of the formula TXXXXXXXA (SEQ ID NO: 19), and V5 comprises a sequence of the formula YXXXXXT (SEQ ID NO: 20) where each X is independently selected from the group consisting of A, D, F, S, V and Y.
- In certain embodiments, in formula (V), the mutation at position 1 of V3 includes insertion of 2 variant amino acids, and the mutations at positions 3 and 4 of V4 and V5, respectively, each include insertion of 0, 1 or 2 variant amino acids.
- In certain embodiments, each compound of a subject library is described by formula (VI), where:
- F8 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to the amino acid sequence TYKLI (SEQ ID NO: 21);
- F9 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to the amino acid sequence ETTTEAVDAATAEKVFKQYAN (SEQ ID NO: 22);
- F10 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to the amino acid sequence TYDDATKTFT (SEQ ID NO: 23);
- V6 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence LNGKTLKG (SEQ ID NO: 24);
- V7 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence DNGVDGEW (SEQ ID NO: 25);
- V8 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence VTE;
- where each compound of the library comprises 3 or more mutations (e.g., 4, 5 or 6 or more mutations) in the V6 variable region, 3 or more mutations (e.g., 4, 5 or 6 or more mutations) in the V7 variable region, and one or more mutations (e.g., 2 or more mutations) in the V8 variable region.
- In certain embodiments, in formula (VI), V6 comprises a sequence of the formula LXXXXXXG (SEQ ID NO: 26), V7 comprises a sequence of the formula DXXXXXXW (SEQ ID NO: 27), and V8 comprises a sequence of the formula VXX where each X is a variant amino acid.
- In certain embodiments, in formula (VI), F8 comprises the sequence TYKLI (SEQ ID NO: 21), F9 comprises the sequence ETTTEAVDAATAEKVFKQYAN (SEQ ID NO: 22), F10 comprises the sequence TYDDATKTFT (SEQ ID NO: 23), V6 comprises a sequence of the formula LXXXXXXG (SEQ ID NO: 28), V7 comprises a sequence of the formula DXXXXXXW (SEQ ID NO: 29), and V8 comprises a sequence of the formula VXX where each X is independently selected from the group consisting of A, D, F, S, V and Y.
- In certain embodiments, in formula (VI), the mutations at position 4 of V6 and V7 each include insertion of 0, 1 or 2 variant amino acids, and the mutation at position 3 of V8 includes insertion of 1 variant amino acid.
- In certain embodiments, each compound of a subject library is described by formula (VII), where:
- F11 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence EAVDAATAEKVFKQYANDNGVDGEWTYDDATKT (SEQ ID NO: 30);
- V9 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence TYKLILNGKTLKGETTT (SEQ ID NO: 31); and
- V10 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to an amino acid sequence FTVTE (SEQ ID NO: 32);
- where each compound of the library comprises 6 or more mutations (e.g., 7, 8, 9, 10 or 11 or more mutations) in the V9 variable region, and 2 or more mutations (e.g., 3 or more mutations) in the V10 variable region.
- In certain embodiments, in formula (VII), V9 comprises a sequence of the formula TYXLXLXXXXXXXXTXT (SEQ ID NO: 33), and V10 comprises a sequence of the formula FXVXX (SEQ ID NO: 34), where each X is a variant amino acid.
- In certain embodiments, in formula (VII), F11 comprises the sequence EAVDAATAEKVFKQYANDNGVDGEWTYDDATKT (SEQ ID NO: 30); V9 comprises a sequence of the formula TYXLXLXXXXXXXXTXT (SEQ ID NO: 35), and V10 comprises a sequence of the formula FXVXX (SEQ ID NO: 36), where each X is independently selected from the group consisting of A, D, F, S, V and Y.
- In certain embodiments, in formula (VII), the mutation at position 9 of V9 includes insertion of 0, 1 or 2 variant amino acids, and the mutation at position 5 of V10 includes insertion of 1 variant amino acid.
- In certain embodiments, each compound of a subject library is described by formula (VIII), where:
- F12 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to the amino acid sequence KTLKGETTTEAVDAATAEKVFKQYANDNGVD (SEQ ID NO: 37);
- V11 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to the amino acid sequence TYKLILNG (SEQ ID NO: 38);
- V12 comprises a sequence having 60% or more (e.g., 70% or more, 80% or more, 90% or more, or 95% or more) amino acid sequence identity to the amino acid sequence GEWTYDDATKTFTVTE (SEQ ID NO: 39);
- where each compound of the library comprises 3 or more mutations (e.g., 4 or more mutations) in the V11 variable region, and 5 or more mutations (e.g., 6, 7, 8, 9 or 10 or more mutations) in the V12 variable region.
- In certain embodiments, in formula (VIII), V11 comprises a sequence of the formula XYXLXLXG (SEQ ID NO: 40), and V12 comprises a sequence of the formula GXWXYXXXXXXFXVXE (SEQ ID NO: 41), where each X is a variant amino acid.
- In certain embodiments, in formula (VIII), F12 comprises the sequence KTLKGETTTEAVDAATAEKVFKQYANDNGVD (SEQ ID NO: 37), V11 comprises a sequence of the formula XYXLXLXG (SEQ ID NO: 42), and V12 comprises a sequence of the formula GXWXYXXXXXXFXVXE (SEQ ID NO: 43), where each X is independently selected from the group consisting of A, D, F, S, V and Y.
- In certain embodiments, in formula (VIII), the mutation at position 8 of V12 includes insertion of 0, 1 or 2 variant amino acids, and the mutation at position 1 of V11 includes insertion of 2 variant amino acids.
- In some embodiments, each compound of the subject library includes a peptidic sequence of between 30 and 80 residues, such as between 40 and 70, between 45 and 60 residues, or between 52 and 58 residues. In certain embodiments, each compound of the subject library includes a peptidic sequence of 52, 53, 54, 55, 56, 57 or 58 residues. In certain embodiments, the peptidic sequence is of 55, 56, or 57 residues, such as 56 residues.
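- As a consistency check on formulas (III) to (VIII), the sketch below concatenates the reference (unmutated) fixed and variable segments quoted above, in the order given by each formula, and compares the results. The dictionary layout and variable names are hypothetical; only the segment sequences come from the text.

```python
# Reference segments in the order given by each formula (F = fixed, V = variable).
REFERENCE_SEGMENTS = {
    "III":  ["TYKLILNGKTLKGETTTEA", "VDAATAEKVFKQYANDNGVDGEW", "TYDDATKTFTVTE"],
    "IV":   ["TYKLILNGKTLKGETT", "TEAVDAATAEKVFKQYANDN", "GVDGEWTYDDATKTFTVTE"],
    "V":    ["TY", "KLILNGKTLKGETT", "TEAVDAATA", "EKVFKQYANDNGVDGEWT",
             "YDDATKT", "FTVTE"],
    "VI":   ["TYKLI", "LNGKTLKG", "ETTTEAVDAATAEKVFKQYAN", "DNGVDGEW",
             "TYDDATKTFT", "VTE"],
    "VII":  ["TYKLILNGKTLKGETTT", "EAVDAATAEKVFKQYANDNGVDGEWTYDDATKT", "FTVTE"],
    "VIII": ["TYKLILNG", "KTLKGETTTEAVDAATAEKVFKQYANDNGVD", "GEWTYDDATKTFTVTE"],
}

# Join the segments of each formula into one full-length sequence.
assembled = {name: "".join(parts) for name, parts in REFERENCE_SEGMENTS.items()}

for name, sequence in assembled.items():
    print(f"formula ({name}): {len(sequence)} residues")

same_parent = len(set(assembled.values())) == 1
print("all formulas tile the same parent sequence:", same_parent)
```

Under this reading, the six formulas partition one 55-residue parent sequence in different ways, which falls within the 52 to 58 residue range recited above; insertions at the indicated positions would lengthen individual members.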
- In certain embodiments, each compound of the subject library includes a GB1 scaffold domain and a variable domain. The variable domain may be a part of the GB1 scaffold domain and may be either a continuous or a discontinuous sequence of residues. A variable domain that is defined by a discontinuous sequence of residues may include contiguous variant amino acids at positions that are arranged close in space relative to each other in the structure of the compound. The variable domain may form a potential binding interface of the compounds. The variable domain may define a binding surface area of a suitable size for forming protein-protein interactions. The variable domain may include a surface area of between 600 and 1800 Å², such as between 800 and 1600 Å², between 1000 and 1400 Å², between 1100 and 1300 Å², or about 1200 Å².
- The individual sequences of the members of any one of the subject libraries can be determined as follows. Any GB1 scaffold as defined herein may be selected as a scaffold for a subject library. The positions of the mutations in the GB1 scaffold domain may be selected as described herein, e.g., as depicted in FIG. 3 for Libraries 1 to 6, where the GB1 scaffold domain may be aligned with the framework of FIG. 3 as described above. The nature of the mutation at each variant amino acid position may be selected, e.g., substitution with any naturally occurring amino acid, or substitution with a limited number of representative amino acids that provide a reasonable diversity of physicochemical properties (e.g., hydrophobicity, hydrophilicity, size, solubility). Certain variant amino acid positions may be selected as positions where mutations can include the insertion or deletion of amino acids, e.g., the insertion of 1 or 2 amino acids where the variant amino acid position occurs in a loop or turn region of the scaffold. In certain embodiments, the mutations can include the insertion of amino acids at one or more positions selected from positions 1, 9, 19, 38, 47 and 55. After selection of the GB1 scaffold, selection of the positions of variant amino acids, and selection of the nature of the mutations at each position, the individual sequences of the members of the library can be determined.
- In some embodiments, two or more of the subject libraries may be combined to produce a larger library. The combination library may include members that have any one of two or more distinct arrangements of mutations that define two or more potential binding surfaces of the GB1 scaffold. In some embodiments, each compound of the library is described by one of formulas (III) to (VIII), as defined above, and the library includes at least one member described by formula (III), at least one member described by formula (IV), at least one member described by formula (V), at least one member described by formula (VI), at least one member described by formula (VII), and at least one member described by formula (VIII).
- In certain embodiments, each compound of the library is described by one of formulas (III) to (VIII), where only two of the formulas (III) to (VIII) are represented by the members of the library. In certain embodiments, each compound of the library is described by one of formulas (III) to (VIII), where 5 or less, such as 4 or less or 3 or less of the formulas (III) to (VIII) are represented by the members of the library.
- In some embodiments, the subject library includes a combination of libraries 1 to 6 depicted in FIG. 3, e.g., a combination of 2 or more, such as 3 or more, 4 or more, or 5 or more of libraries 1 to 6. In some embodiments, the subject library includes a combination of any 2 of the libraries 1 to 6 depicted in FIG. 3, e.g., a combination of libraries 1 and 2, a combination of libraries 2 and 3, a combination of libraries 1 and 3, a combination of libraries 4 and 5, a combination of libraries 5 and 6, a combination of libraries 4 and 6, or a combination of any one of libraries 1-3 and any one of libraries 4-6. In some embodiments, the subject library includes a combination of any 3 of the libraries 1 to 6 depicted in FIG. 3, e.g., a combination of libraries 1-3, a combination of libraries 4-6, a combination of any 2 libraries of 1-3 and any one library of 4-6, or a combination of any one library of 1-3 and any 2 libraries of 4-6. In some embodiments, the subject library includes a combination of all of libraries 1 to 6 depicted in FIG. 3.
- In some embodiments, the subject library is bifunctional in the sense that the GB1 compounds of the library have two potential binding surfaces. Such libraries can be screened to identify compounds having specific binding properties for two target molecules. In certain embodiments, the compounds may include a first potential binding surface for a first target molecule and a second potential binding surface for a second target molecule. In certain embodiments, the first target molecule is a therapeutic target protein and the second target molecule is an endogenous protein or receptor (e.g., an IgG, FcRn, or serum albumin protein) that is capable of modulating the pharmacokinetic properties (e.g., in vivo half-life) of a GB1 compound upon recruitment. In some embodiments, any convenient endogenous protein target may be selected as one of the targets to be screened. In certain embodiments, the compounds of the library include two potential binding surfaces for the same target molecule, where the overall binding affinity of the compound may be modulated via an avidity effect.
- GB1 has binding affinity for human IgG fragments, e.g., hFc binds to the α1 helix motif and hFab binds to the second beta-strand (β2) motif. In some embodiments, the IgG-binding properties of the GB1 scaffold are utilized to provide one potential binding surface of the subject bifunctional libraries. In certain embodiments, the bifunctional library has an IgG binding surface that includes the α1 helix motif and a target binding surface, such as surface 5 or 6.
- Any suitable combinations of potential binding surfaces may be utilized to produce the subject bifunctional libraries. In some cases, the two potential binding surfaces of a bifunctional library are selected to minimize any potential steric interactions between the first and second target molecules, e.g., by binding the targets on opposite sides of the scaffold. In some embodiments, a pair of potential binding surfaces of the subject bifunctional library is selected from surfaces 1 and 5, surfaces 3 and 4, surfaces 2 and 6, surfaces 1 and 6, surfaces 2 and 5, and surfaces 2 and 4, where the individual surfaces 1 to 6 are shown in FIGS. 2A and 2B, respectively. FIG. 13 illustrates exemplary pairs of potential binding surfaces for use in the subject bifunctional libraries.
- The subject bifunctional library may include one or more variable domains on each of the potential binding surfaces of the library. Any convenient variable domains as described herein for surfaces 1-6 may be employed in the subject bifunctional libraries. In some embodiments, the subject bifunctional library includes 3 or more mutations, such as 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 10 or more, 12 or more or 14 or more mutations in the variable domain of a first surface, and 3 or more mutations, such as 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 10 or more, 12 or more or 14 or more mutations in the variable domain of a second surface. Any suitable mutations in the variable domains may be selected, as described above for the mutations of surfaces 1-6 (see, e.g., FIG. 3).
- The subject bifunctional library may be screened for specific binding to first and second target molecules using a variety of strategies. For example, the libraries can be screened for binding to first and second target molecules using simultaneous screening, consecutive screening or convergent screening strategies. In some embodiments, the bifunctional library is screened for simultaneous binding of first and second targets to first and second surfaces, respectively. In some embodiments, a first library is screened for binding of a first target to a first surface to produce a second generation library based on a scaffold that binds the first target. In certain embodiments, such binding of a first target protein to a first surface is inherent in the scaffold, and does not require screening, although affinity maturation optimization of the binding of the first target may be performed. The second generation library based on the scaffold that binds the first target is then screened for binding to a second target at a second surface. In some embodiments, a convergent screening strategy is utilized where a first library is screened for binding to a first target and a second library is screened for binding to a second target. Utilizing the results of these screens, first and second binding surfaces are then incorporated into the same GB1 scaffold to produce bifunctional GB1 compounds. Such bifunctional compounds and libraries can be optimized by affinity maturation.
- Also provided are affinity maturation libraries, e.g., second generation GB1 peptidic libraries based on a parent GB1 peptidic compound that binds to a certain target molecule, where the libraries can be screened to optimize for binding affinity and specificity, or any desirable property, such as, protein folding, protease stability, thermostability, compatibility with a pharmaceutical formulation, etc.
- In some embodiments, the affinity maturation library is a GB1 peptidic library as described above, except that a fraction of the variant amino acid positions are held as fixed positions while the remaining variant amino acid positions define the new library. The mutations of these variant amino acids that define the affinity maturation library may include substitution with all 20 naturally occurring amino acids. The variant amino acids that are held as fixed become part of a new scaffold domain. In certain embodiments, the affinity maturation library is a GB1 peptidic library described herein, where 70% or more of the variant amino acids, such as 75% or more, 80% or more, or 85% or more are held fixed. In certain embodiments, the affinity maturation library is a GB1 peptidic library described herein, where 8 or more of the variant amino acids, such as 9 or more, 10 or more, or 11 or more, or 12 or more are held fixed. In some cases, the affinity maturation library includes 6 or less, such as 5 or less, 4 or less, or 3 or less variant amino acids. In certain embodiments, the affinity maturation library includes 4 remaining variant amino acids. In certain embodiments, the remaining variant amino acids are contiguous. In certain embodiments, the remaining variant amino acids form a continuous sequence of residues in the GB1 scaffold domain. In certain embodiments, the affinity maturation library is based on one of the GB1 peptidic libraries 1 to 6 as described in FIGS. 2-3, where a fraction of the variant amino acid positions are held as fixed positions while the remaining variant amino acid positions define the new library. In some cases, one or more of the variant amino acids that are held fixed may be different from the amino acids of the GB1 scaffold shown in FIG. 3. Further, any GB1 scaffold domain may be substituted for the scaffold domain shown in FIG. 3. The scaffold domain of an affinity maturation library may be selected based on an initial selection for binding to a target molecule.
- In some instances, a GB1 peptidic compound that is identified after initial screening of a subject library for binding to a certain target molecule may be selected as a scaffold for an affinity maturation library. Any convenient methods of affinity maturation may be used. In some cases, a number of affinity maturation libraries are prepared that include mutations at limited subsets of possible variant positions (e.g., mutations at 4 of 15 variable positions), while the rest of the variant positions are held as fixed positions. The positions of the mutations may be tiled through the scaffold sequence to produce a series of libraries such that mutations at every variant position are represented and a diverse range of amino acids are substituted at every position (e.g., all 20 naturally occurring amino acids). Mutations that include deletion or insertion of one or more amino acids may also be included at variant positions of the affinity maturation libraries. An affinity maturation library may be prepared and screened using any convenient method, e.g., phage display library screening, to identify members of the library having an improved property, e.g., increased binding affinity for a target molecule, protein folding, protease stability, thermostability, compatibility with a pharmaceutical formulation, etc.
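- The tiling of mutations through the variant positions can be sketched as follows. The example uses the V1 formula of formula (III) purely as an illustration, with 4 positions randomized per sub-library and all 20 naturally occurring amino acids allowed at each. The windowing scheme, in which the final window is shifted back so that every sub-library randomizes exactly 4 positions, is an assumption of this sketch rather than a method stated in the text, and the function names are hypothetical.

```python
V1_PATTERN = "VXXXXAXXVFXXYAXXNXXXXXW"  # V1 formula from the text, used as an example


def variant_positions(pattern: str) -> list[int]:
    """1-based positions of the X (variant) residues in a pattern."""
    return [i for i, ch in enumerate(pattern, start=1) if ch == "X"]


def tile(positions: list[int], window: int, step: int) -> list[list[int]]:
    """Windows over the list of variant positions that together cover all of them.

    Windows are consecutive in the list of variant positions, which need not
    be contiguous in the sequence itself.
    """
    if not 1 <= step <= window:
        raise ValueError("step must be between 1 and window to cover every position")
    if window >= len(positions):
        return [list(positions)]
    starts = list(range(0, len(positions) - window + 1, step))
    last_start = len(positions) - window
    if starts[-1] != last_start:
        starts.append(last_start)  # shift the final window back so it stays full-size
    return [positions[s:s + window] for s in starts]


positions = variant_positions(V1_PATTERN)
sub_libraries = tile(positions, window=4, step=4)

for randomized in sub_libraries:
    held_fixed = [p for p in positions if p not in randomized]
    print("randomize", randomized, "| hold fixed", held_fixed,
          "| sequences", 20 ** len(randomized))
```

A smaller `step` yields overlapping sub-libraries, so that each variant position is sampled in more than one sequence context.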
- In some embodiments, in an affinity maturation library, most or all of the variant amino acid positions in the variable regions of the parent GB1 compound are held as fixed positions, and contiguous mutations are introduced at positions adjacent to these variable regions. Such mutations may be introduced at positions in the parent GB1 compound that were previously considered fixed positions in the original GB1 scaffold. Such mutations may be used to optimize the GB1 compound variants for any desirable property, such as protein folding, protease stability, thermostability, compatibility with a pharmaceutical formulation, etc.
- Fusion polypeptides including GB1 peptidic compounds can be displayed on the surface of a cell or virus in a variety of formats and multivalent forms. In one embodiment, a bivalent moiety, for example, a hinge and dimerization sequence from a Fab template, an anti-MBP (maltose binding protein) Fab scaffold is used for displaying GB1 peptidic compound variants on the surface of a phage particle. Optionally, other sequences encoding polypeptide tags useful for purification or detection such as a FLAG tag, can be fused at the 3′ end of the nucleic acid sequence encoding the GB1 peptidic compound.
- Also provided is a library of polynucleotides that encodes a library of GB1 peptidic compounds as described above. In some embodiments, each polynucleotide of the library encodes a distinct GB1 peptidic compound that includes three or more, such as four or more or five or more mutations at non-core positions in a region outside of the β1-β2 region.
- In some embodiments, each polynucleotide of the library encodes a GB1 peptidic compound that includes 30 or more, 40 or more, or 50 or more amino acids. In some embodiments, each polynucleotide of the library encodes a GB1 peptidic compound where the compound includes three or more variant amino acids at non-core positions, and where each variant amino acid is encoded by a random codon. In certain embodiments, the random codon is selected from the group consisting of NNK (where N=A, G, C and T, and K=G and T) and KHT (where K=G and T, and H=A, C and T).
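- The NNK and KHT codons can be expanded with a short sketch that enumerates every concrete codon and translates it with the standard genetic code. Under the standard code, KHT yields exactly A, D, F, S, V and Y, and NNK yields all 20 amino acids plus the amber stop codon TAG. The helper names `expand` and `encoded` are hypothetical.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degenerate base symbols used in the text, plus the four plain bases.
DEGENERATE = {"N": "ACGT", "K": "GT", "H": "ACT",
              "A": "A", "C": "C", "G": "G", "T": "T"}


def expand(codon: str) -> list[str]:
    """All concrete codons represented by a degenerate codon such as NNK."""
    return ["".join(p) for p in product(*(DEGENERATE[s] for s in codon))]


def encoded(codon: str) -> list[str]:
    """Sorted one-letter residues ('*' = stop) encoded by a degenerate codon."""
    return sorted({CODON_TABLE[c] for c in expand(codon)})


for scheme in ("NNK", "KHT"):
    print(scheme, len(expand(scheme)), "codons ->", "".join(encoded(scheme)))
```

The restricted KHT set matches the six-residue alphabet recited for the variant positions in formulas (III) to (VIII), while NNK samples the full amino acid alphabet at the cost of an occasional amber codon.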
- In certain embodiments, the subject library of polynucleotides is a library of replicable expression vectors that includes a nucleic acid sequence encoding a gene fusion, where the gene fusion encodes a fusion protein including the GB1 peptidic compound fused to all or a portion of a viral coat protein. Also included is a library of diverse replicable expression vectors comprising a plurality of gene fusions encoding a plurality of different fusion proteins including a plurality of the antibody variable domains generated with diverse sequences as described above. The vectors can include a variety of components and can be constructed to allow for movement of the GB1 domain between different vectors and/or to provide for display of the fusion proteins in different formats. Examples of vectors include phage vectors and ribosome display vectors. The phage vector has a phage origin of replication allowing phage replication and phage particle formation. In certain embodiments, the phage is a filamentous bacteriophage, such as an M13, f1, fd, Pf3 phage or a derivative thereof, or a lambdoid phage, such as lambda, 21, phi80, phi81, 82, 424, 434, etc., or a derivative thereof.
- Any convenient display methods may be used to display GB1 peptidic compounds encoded by the subject library of polynucleotides, such as cell-based display techniques and cell-free display techniques. In certain embodiments, cell-based display techniques include phage display, bacterial display, yeast display and mammalian cell display. In certain embodiments, cell-free display techniques include mRNA display and ribosome display.
- In certain embodiments, the library of polynucleotides is a library that encodes 50 or more distinct compounds, such as 100 or more, 300 or more, 1×10³ or more, 1×10⁴ or more, 1×10⁵ or more, 1×10⁶ or more, 1×10⁷ or more, 1×10⁸ or more, 1×10⁹ or more, 1×10¹⁰ or more, 1×10¹¹ or more, or 1×10¹² or more, distinct compounds, where each polynucleotide of the library encodes a GB1 peptidic compound that comprises three or more, such as four or more or five or more different non-core mutations at positions in a region outside of the β1-β2 region. In certain embodiments, the library of polynucleotides is a library of replicable expression vectors.
- In some embodiments, each polynucleotide of the library encodes a GB1 peptidic compound comprising ten or more variant amino acids at non-core positions, wherein each variant amino acid is encoded by a random codon. In certain embodiments, the random codon is selected from the group consisting of NNK and KHT.
- The subject libraries may be prepared using any convenient methods, such as, methods that find use in the preparation of libraries of peptidic compounds, for example, phage display methods.
- In some embodiments, the subject library is a phage display library. A utility of phage display is that large libraries of randomized protein variants can be rapidly and efficiently sorted for those sequences that bind to a target protein. Display of polypeptide libraries on phage may be used for screening for polypeptides with specific binding properties. Polyvalent phage display methods may be used for displaying polypeptides through fusions to either gene III or gene VIII of filamentous phage. Wells and Lowman (1992) Curr. Opin. Struct. Biol B:355-362 and references cited therein. In monovalent phage display, a polypeptide library is fused to a gene III or a portion thereof and expressed at low levels in the presence of wild type gene III protein so that phage particles display one copy or none of the fusion proteins. Avidity effects are reduced relative to polyvalent phage so that sorting is on the basis of intrinsic ligand affinity, and phagemid vectors are used, which simplify DNA manipulations. Lowman and Wells (1991) Methods: A companion to Methods in Enzymology 3:205-216. In phage display, the phenotype of the phage particle, including the displayed polypeptide, corresponds to the genotype inside the phage particle, the DNA enclosed by the phage coat proteins.
- In some embodiments, each GB1 peptidic compound of a subject library is fused to at least a portion of a viral coat protein. Examples of viral coat proteins include infectivity protein PIII, major coat protein PVIII, p3, Soc, Hoc, gpD (of bacteriophage lambda), minor bacteriophage coat protein 6 (pVI) (filamentous phage; J. Immunol. Methods, 1999, 231(1-2):39-51), variants of the M13 bacteriophage major coat protein (P8) (Protein Sci 2000 April; 9(4):647-54). The fusion protein can be displayed on the surface of a phage and suitable phage systems include M13KO7 helper phage, M13R408, M13-VCS, and Phi X 174, pJuFo phage system (J. Virol. 2001 August; 75(15):7107-13), hyperphage (Nat. Biotechnol. 2001 January; 19(1):75-8). In certain embodiments, the helper phage is M13KO7, and the coat protein is the M13 Phage gene III coat protein. In certain embodiments, the host is E. coli or protease deficient strains of E. coli. Vectors, such as the fth1 vector (Nucleic Acids Res. 2001 May 15; 29(10):E50-0) can be useful for the expression of the fusion protein.
- Any convenient methods for displaying fusion polypeptides including GB1 peptidic compounds on the surface of bacteriophage may be used, for example, methods as described in patent publication numbers WO 92/01047; WO 92/20791; WO 93/06213; WO 93/11236 and WO 93/19172.
- The expression vector also can have a secretory signal sequence fused to the DNA encoding each GB1 peptidic compound. This sequence may be located immediately 5′ to the gene encoding the fusion protein, and will thus be transcribed at the amino terminus of the fusion protein. However, in certain cases, the signal sequence has been demonstrated to be located at positions other than 5′ to the gene encoding the protein to be secreted. This sequence targets the protein to which it is attached across the inner membrane of the bacterial cell. The DNA encoding the signal sequence may be obtained as a restriction endonuclease fragment from any gene encoding a protein that has a signal sequence. Suitable prokaryotic signal sequences may be obtained from genes encoding, for example, LamB or OmpF (Wong et al., Gene, 68:1931 (1983)), MalE, PhoA and other genes. A prokaryotic signal sequence for practicing this invention is the E. coli heat-stable enterotoxin II (STII) signal sequence as described by Chang et al., Gene 55:189 (1987), and malE.
- The vector may also include a promoter to drive expression of the fusion protein. Promoters most commonly used in prokaryotic vectors include the lac Z promoter system, the alkaline phosphatase pho A promoter, the bacteriophage λ PL promoter (a temperature sensitive promoter), the tac promoter (a hybrid trp-lac promoter that is regulated by the lac repressor), the tryptophan promoter, and the bacteriophage T7 promoter. While these are the most commonly used promoters, other suitable microbial promoters may be used as well.
- The vector can also include other nucleic acid sequences, for example, sequences encoding gD tags, c-Myc epitopes, FLAG tags, poly-histidine tags, fluorescent proteins (e.g., GFP), or beta-galactosidase protein, which can be useful for detection or purification of the fusion protein expressed on the surface of the phage or cell. Nucleic acid sequences encoding, for example, a gD tag, also provide for positive or negative selection of cells or virus expressing the fusion protein. In some embodiments, the gD tag is fused to a GB1 peptidic compound which is not fused to the viral coat protein. Nucleic acid sequences encoding, for example, a polyhistidine tag, are useful for identifying fusion proteins including GB1 peptidic compounds that bind to a specific target using immunohistochemistry. Tags useful for detection of target binding can be fused to either a GB1 peptidic compound not fused to a viral coat protein or a GB1 peptidic compound fused to a viral coat protein.
- Another useful component of the vectors used to practice this invention is a phenotypic selection gene. The phenotypic selection genes are those encoding proteins that confer antibiotic resistance upon the host cell. By way of illustration, the ampicillin resistance gene (ampr) and the tetracycline resistance gene (tetr) are readily employed for this purpose.
- The vector can also include nucleic acid sequences containing unique restriction sites and suppressible stop codons. The unique restriction sites are useful for moving GB1 peptidic compound domains between different vectors and expression systems. The suppressible stop codons are useful to control the level of expression of the fusion protein and to facilitate purification of GB1 peptidic compounds. For example, an amber stop codon can be read as Gln in a supE host to enable phage display, while in a non-supE host it is read as a stop codon to produce soluble GB1 peptidic compounds without fusion to phage coat proteins. These synthetic sequences can be fused to GB1 peptidic compounds in the vector.
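The host-dependent reading of an amber codon described above can be illustrated with a minimal Python sketch. The toy open reading frame, the partial codon table, the function name `translate` and the `sup_e` flag are hypothetical and serve only as illustration. The sketch also assumes complete suppression, whereas suppression in a real supE host is only partial.

```python
# Partial codon table covering only the codons of the toy sequence below
# (hypothetical example, not a sequence from this disclosure).
TOY_TABLE = {"ATG": "M", "AAA": "K", "ACC": "T", "GGT": "G", "TCT": "S", "TGG": "W"}
STOP_CODONS = {"TAG": "amber", "TAA": "ochre", "TGA": "opal"}


def translate(dna, sup_e=False):
    """Translate dna codon by codon; read amber (TAG) as Gln if sup_e is True."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon == "TAG" and sup_e:
            protein.append("Q")  # amber codon suppressed and read as glutamine
            continue
        if codon in STOP_CODONS:
            break  # translation terminates at an unsuppressed stop codon
        protein.append(TOY_TABLE[codon])
    return "".join(protein)


# Toy layout: displayed domain (M K T), amber stop, coat protein part (G S W), ochre stop
orf = "ATGAAAACC" + "TAG" + "GGTTCTTGG" + "TAA"

fusion = translate(orf, sup_e=True)    # read-through product for phage display
soluble = translate(orf, sup_e=False)  # product terminated at the amber codon
print(fusion, soluble)
```

In this toy case the suppressor setting yields the domain joined to the coat protein segment, while the non-suppressor setting yields the domain alone.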
- In some cases, vector systems that allow the nucleic acid encoding a GB1 peptidic compound of interest to be easily removed from the vector system and placed into another vector system, may be used. For example, appropriate restriction sites can be engineered in a vector system to facilitate the removal of the nucleic acid sequence encoding the GB1 peptidic compounds. The restriction sequences are usually chosen to be unique in the vectors to facilitate efficient excision and ligation into new vectors. GB1 peptidic compound domains can then be expressed from vectors without extraneous fusion sequences, such as viral coat proteins or other sequence tags.
- Between the nucleic acid encoding GB1 peptidic compounds (gene 1) and the viral coat protein (gene 2), DNA encoding a termination codon may be inserted, such termination codons including UAG (amber), UAA (ochre) and UGA (opal) (Microbiology, Davis et al., Harper & Row, New York, 1980, pp. 237, 245-47 and 374). The termination codon expressed in a wild type host cell results in the synthesis of the gene 1 protein product without the gene 2 protein attached. However, growth in a suppressor host cell results in the synthesis of detectable quantities of fused protein. Such suppressor host cells are well known and described, such as E. coli suppressor strain (Bullock et al., BioTechniques 5:376-379 (1987)). Any acceptable method may be used to place such a termination codon into the mRNA encoding the fusion polypeptide.
- The suppressible codon may be inserted between the first gene encoding the GB1 peptidic compounds, and a second gene encoding at least a portion of a phage coat protein. Alternatively, the suppressible termination codon may be inserted adjacent to the fusion site by replacing the last amino acid triplet in the antibody variable domain or the first amino acid in the phage coat protein. When the plasmid containing the suppressible codon is grown in a suppressor host cell, it results in the detectable production of a fusion polypeptide containing the polypeptide and the coat protein. When the plasmid is grown in a non-suppressor host cell, the GB1 peptidic compound domain is synthesized substantially without fusion to the phage coat protein due to termination at the inserted suppressible triplet UAG, UAA, or UGA. In the non-suppressor cell the GB1 peptidic compound domain is synthesized and secreted from the host cell due to the absence of the fused phage coat protein, which would otherwise anchor it to the host membrane.
- Also provided are methods of screening libraries of the compounds, e.g., as described above, for binding to a target protein. In addition, the libraries may be selected for improved binding affinity to a certain target protein, e.g., as described above, for the preparation and screening of affinity maturation libraries. The target proteins may include any type of protein of interest in research or therapeutic applications. Aspects of these screening methods may include determining whether a compound of the subject libraries specifically binds to a target protein of interest. Screening methods may include screening for inhibition of a biological activity. Such methods may include: (i) contacting a sample containing a target protein with a library of the invention; and (ii) determining whether a compound of the library specifically binds to the target protein.
- The determining step may be carried out by any one or more of a variety of protocols for characterizing the specific binding or the inhibition of binding.
- For example, screening may be a cell-based assay, an enzyme assay, an ELISA assay or other related biological assay for assessing specific binding or the inhibition of binding, and the determining or assessment steps suitable for application in such assays are well known and involve routine protocols.
- Screening may also include in silico methods, in which one or more physical and/or chemical attributes of compounds of the library of interest are expressed in a computer-readable format and evaluated by any one or more of a variety of molecular modeling and/or analysis programs and algorithms suitable for this purpose. In some embodiments, the in silico method includes inputting one or more parameters related to the D-target protein, such as but not limited to, the three-dimensional coordinates of a known X-ray crystal structure of the D-target protein. In some embodiments, the in silico method includes inputting one or more parameters related to the compounds of the L-peptidic library, such as but not limited to, the three-dimensional coordinates of a known X-ray crystal structure of a parent scaffold domain of the library. In some instances, the in silico method includes generating one or more parameters for each compound in a peptidic library in a computer readable format, and evaluating the capabilities of the compounds to specifically bind to the target protein. The in silico methods include, but are not limited to, molecular modelling studies, biomolecular docking experiments, and virtual representations of molecular structures and/or processes, such as molecular interactions. The in silico methods may be performed as a pre-screen (e.g., prior to preparing a L-peptidic library and performing in vitro screening), or as a validation of binding compounds identified after in vitro screening.
- Thus the screening methods of the invention can be carried out in vitro or in vivo. For example, when the compound is in a cell, the cell may be in vitro or in vivo, and the determining of whether the compound is capable of specifically binding to a target protein in the cell includes: (i) contacting the cell with a library of the invention; and (ii) assessing whether a compound of the library specifically binds to the target protein.
- As such, determining whether a GB1 peptidic compound of a subject library is capable of specifically binding a target protein may be carried out by any number of methods, as well as combinations thereof.
- In some embodiments, the subject method includes:
- (a) contacting a target protein with a library including 50 or more distinct GB1 peptidic compounds, where each compound includes a β1-β2 region and three or more, such as four or more or five or more mutations at non-core positions in a region outside of the β1-β2 region; and
- (b) identifying a compound of the library that specifically binds to the target protein.
- In some embodiments, in the subject method, the target protein is a D-protein. In some embodiments, in the subject method, the target protein is an L-protein.
- Screening for the ability of a fusion polypeptide including a GB1 peptidic compound of a subject library to bind a target molecule can also be performed in solution phase. For example, a target protein can be attached with a detectable moiety, such as biotin. Phage that bind to the target molecule in solution can be separated from unbound phage by a molecule that binds to the detectable moiety, such as streptavidin-coated beads where biotin is the detectable moiety. Affinity of binders (GB1 peptidic compound fusions that bind to target protein) can be determined based on concentration of the target protein used, using any convenient formulas and criteria.
- In some embodiments, the target protein may be attached to a suitable matrix such as agarose beads, acrylamide beads, glass beads, cellulose, various acrylic copolymers, hydroxyalkyl methacrylate gels, polyacrylic and polymethacrylic copolymers, nylon, neutral and ionic carriers, and the like. Attachment of the target protein to the matrix may be accomplished by any convenient methods, e.g., methods as described in Methods in Enzymology, 44 (1976). After attachment of the target protein to the matrix, the immobilized target is contacted with the library expressing the GB1 peptidic compound containing fusion polypeptides under conditions suitable for binding of at least a portion of the phage particles with the immobilized target. In some instances, the conditions, including pH, ionic strength, temperature and the like will mimic physiological conditions. Bound particles (“binders”) to the immobilized target are separated from those particles that do not bind to the target by washing. Wash conditions can be adjusted to result in removal of all but the higher affinity binders. Binders may be dissociated from the immobilized target by a variety of methods. These methods include competitive dissociation using the wild-type ligand, altering pH and/or ionic strength, and methods known in the art. Selection of binders may involve elution from an affinity matrix with a ligand. Elution with increasing concentrations of ligand should elute displayed binding GB1 peptidic compounds of increasing affinity.
- The binders can be isolated and then reamplified or expressed in a host cell and subjected to another round of selection for binding of target molecules. Any number of rounds of selection or sorting can be utilized. One of the selection or sorting procedures can involve isolating binders that bind to an antibody to a polypeptide tag such as antibodies to the gD protein, FLAG or polyhistidine tags. Another selection or sorting procedure can involve multiple rounds of sorting for stability, such as binding to a target protein that specifically binds to folded GB1 peptidic compound containing polypeptide and does not bind to unfolded polypeptide followed by selecting or sorting the stable binders for binding to a target protein.
- In some cases, suitable host cells are infected with the binders and helper phage, and the host cells are cultured under conditions suitable for amplification of the phagemid particles. The phagemid particles are then collected and the selection process is repeated one or more times until binders having the desired affinity for the target molecule are selected. In certain embodiments, two or more rounds of selection are conducted.
- After binders are identified by binding to the target protein, the nucleic acid can be extracted. Extracted DNA can then be used directly to transform E. coli host cells or alternatively, the encoding sequences can be amplified, for example using PCR with suitable primers, and then inserted into a vector for expression.
- Any convenient strategy may be used to select for high affinity binders to a target protein. In certain embodiments, the process of screening is carried out by automated systems to allow for high-throughput screening of library candidates.
- In certain embodiments, compounds of the subject peptidic library specifically bind to a target protein with high affinity, e.g., as determined by an SPR binding assay or an ELISA assay. The compounds of the subject peptidic library may exhibit an affinity for a target protein of 1 μM or less, such as 300 nM or less, 100 nM or less, 30 nM or less, 10 nM or less, 5 nM or less, 2 nM or less, 1 nM or less, 300 pM or less, or even less. The compounds of the subject peptidic libraries may exhibit a specificity for a target protein, e.g., as determined by comparing the affinity of the compound for the target protein with that for a reference protein (e.g., an albumin protein), that is 5:1 or more, such as 10:1 or more, 30:1 or more, 100:1 or more, 300:1 or more, 1000:1 or more, or even more.
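The affinity and specificity thresholds above amount to simple comparisons of binding constants. The following sketch assumes that affinity is reported as a dissociation constant (K_D, in molar units) and that specificity is expressed as the ratio of the K_D for the reference protein to the K_D for the target protein. The function names and the example values are hypothetical and are not measured data.

```python
def specificity_ratio(kd_target_m, kd_reference_m):
    """Fold preference for the target: K_D(reference) / K_D(target)."""
    return kd_reference_m / kd_target_m


def meets_thresholds(kd_target_m, kd_reference_m, max_kd_m=1e-6, min_ratio=5.0):
    """True if K_D for the target is at most max_kd_m and specificity is at least min_ratio."""
    ratio = specificity_ratio(kd_target_m, kd_reference_m)
    return kd_target_m <= max_kd_m and ratio >= min_ratio


# Hypothetical binder: 30 nM for the target, 30 uM for an albumin reference
kd_target = 30e-9
kd_albumin = 30e-6

print(round(specificity_ratio(kd_target, kd_albumin)))
print(meets_thresholds(kd_target, kd_albumin))
```

The default thresholds correspond to the least stringent values listed above (1 μM and 5:1); tighter values can be passed as arguments.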
- Once the subject libraries are prepared, they can be selected and/or screened for binding to one or more target molecules. In addition, the libraries may be selected for improved binding affinity to a certain target molecule. The target molecules may be any type of protein-binding or antigenic molecule, such as proteins, nucleic acids, carbohydrates or small molecules. In certain embodiments, the target molecule is a therapeutic target molecule or a diagnostic target molecule, or a fragment thereof, or a mimic thereof.
- In certain embodiments, the target molecule is a hormone, a growth factor, a receptor, an enzyme, a cytokine, an osteoinductive factor, a colony stimulating factor or an immunoglobulin.
- In certain embodiments, the target molecule may be one or more of the following: growth hormone, bovine growth hormone, insulin like growth factors, human growth hormone including n-methionyl human growth hormone, parathyroid hormone, thyroxine, insulin, proinsulin, amylin, relaxin, prorelaxin, glycoprotein hormones such as follicle stimulating hormone (FSH), luteinizing hormone (LH), hematopoietic growth factor, Her-2, fibroblast growth factor, prolactin, placental lactogen, tumor necrosis factors, mullerian inhibiting substance, mouse gonadotropin-associated polypeptide, inhibin, activin, vascular endothelial growth factors, integrin, nerve growth factors such as NGF-beta, insulin-like growth factor-I and II, erythropoietin, osteoinductive factors, interferons, colony stimulating factors, interleukins (e.g., an IL-4 or an IL-8 protein), bone morphogenetic proteins, LIF, SCF, FLT-3 ligand, kit-ligand, SH3 domain, apoptosis protein, hepatocyte growth factor, hepatocyte growth factor receptor, neutravidin, maltose binding protein, angiostatin, aFGF, bFGF, TGF-alpha, TGF-beta, HGF, TNF-alpha, angiogenin, IL-8, thrombospondin, the 16-kilodalton N-terminal fragment of prolactin and endostatin.
- In certain embodiments, the target molecule may be a therapeutic target protein for which structural information is known, such as, but not limited to: Raf kinase (a target for the treatment of melanoma), Rho kinase (a target in the prevention of pathogenesis of cardiovascular disease), nuclear factor kappaB (NF-κB, a target for the treatment of multiple myeloma), vascular endothelial growth factor (VEGF) receptor kinase (a target for action of anti-angiogenetic drugs), Janus kinase 3 (JAK-3, a target for the treatment of rheumatoid arthritis), cyclin dependent kinase (CDK) 2 (CDK2, a target for prevention of stroke), FMS-like tyrosine kinase (FLT) 3 (FLT-3; a target for the treatment of acute myelogenous leukemia (AML)), epidermal growth factor receptor (EGFR) kinase (a target for the treatment of cancer), protein kinase A (PKA, a therapeutic target in the prevention of cardiovascular disease), p21-activated kinase (a target for the treatment of breast cancer), mitogen-activated protein kinase (MAPK, a target for the treatment of cancer and arthritis), c-Jun NH2-terminal kinase (JNK, a target for treatment of diabetes), AMP-activated kinase (AMPK, a target for prevention and treatment of insulin resistance), lck kinase (a target for immuno-suppression), phosphodiesterase PDE4 (a target in treatment of inflammatory diseases such as rheumatoid arthritis and asthma), Abl kinase (a target in treatment of chronic myeloid leukemia (CML)), phosphodiesterase PDE5 (a target in treatment of erectile dysfunction), a disintegrin and metalloproteinase 33 (ADAM33, a target for the treatment of asthma), human immunodeficiency virus (HIV)-1 protease and HIV integrase (targets for the treatment of HIV infection), respiratory syncytial virus (RSV) integrase (a target for the treatment of infection with RSV), X-linked inhibitor of apoptosis (XIAP, a target for the treatment of neurodegenerative disease and ischemic injury), thrombin (a therapeutic target in the treatment and prevention of thromboembolic disorders), tissue type plasminogen activator (a target in prevention of neuronal death after injury of central nervous system), matrix metalloproteinases (targets of anti-cancer agents preventing angiogenesis), beta secretase (a target for the treatment of Alzheimer's disease), src kinase (a target for the treatment of cancer), fyn kinase, lyn kinase, zeta-chain associated protein 70 (ZAP-70) protein tyrosine kinase, extracellular signal-regulated kinase 1 (ERK-1), p38 MAPK, CDK4, CDK5, glycogen synthase kinase 3 (GSK-3), KIT tyrosine kinase, FLT-1, FLT-4, kinase insert domain-containing receptor (KDR) kinase, and cancer osaka thyroid (COT) kinase.
- In certain embodiments, the target molecule is a target protein that is selected from the group consisting of a VEGF protein, a RANKL protein, an NGF protein, a TNF-alpha protein, an SH2 domain containing protein, an SH3 domain containing protein, an IgE protein, a BLyS protein (Oren et al., “Structural basis of BLyS receptor recognition”, Nature Structural Biology 9, 288-292, 2002), a PCSK9 protein (Ni et al., “A proprotein convertase subtilisin-like/kexin type 9 (PCSK9) C-terminal domain antibody antigen-binding fragment inhibits PCSK9 internalization and restores low density lipoprotein uptake”, J. Biol. Chem. 2010 Apr. 23; 285(17):12882-91), a DLL4 protein (Garber, “Targeting Vessel Abnormalization in Cancer”, JNCI Journal of the National Cancer Institute 2007 99(17):1284-1285), an Ang2 (Angiopoietin-2) protein, a Clostridium difficile Toxin A or B protein (e.g., Ho et al., “Crystal structure of receptor-binding C-terminal repeats from Clostridium difficile toxin A”, (2005) Proc. Natl. Acad. Sci. USA 102: 18373-18378), a CTLA4 protein (Cytotoxic T-Lymphocyte Antigen 4), and fragments thereof. In certain embodiments, the target protein is a VEGF protein. In certain embodiments, the target protein is an SH2 domain containing protein (e.g., a 3BP2 protein) or an SH3 domain containing protein (e.g., an ABL or a Src protein).
- The libraries of the invention, e.g., as described above, find use in a variety of applications. Applications of interest include, but are not limited to, screening applications and research applications.
- The screening methods, e.g., as described above, find use in a variety of applications, including selection and/or screening of the subject libraries in a wide range of research and therapeutic applications, such as therapeutic lead identification and affinity maturation, identification of diagnostic reagents, development of high throughput screening assays, and development of drug delivery systems for the delivery of toxins or other therapeutic moieties. The subject screening methods may be exploited in multiple settings.
- In some cases, the subject libraries may find use as research tools to analyze the roles of proteins of interest in modulating various biological processes, e.g., angiogenesis, inflammation, cellular growth, metabolism, regulation of transcription and regulation of phosphorylation. For example, antibody libraries have been useful tools in many such areas of biological research and have led to the development of effective therapeutic agents, see Sidhu and Fellouse, “Synthetic therapeutic antibodies,” Nature Chemical Biology, 2006, 2(12), 682-688.
- The subject libraries may be exploited as research tools in the development of clinical diagnostics, e.g., in vitro diagnostics (e.g., for targeting various biomarkers), or in vivo tumor imaging agents. The screening of libraries of binding molecules (e.g., aptamers and antibodies) has found use in the development of such clinical diagnostics, see for example, Jayasena, “Aptamers: An Emerging Class of Molecules That Rival Antibodies in Diagnostics,” Clinical Chemistry. 1999; 45:1628-1650.
- The following examples are offered by way of illustration and not by way of limitation.
- The wild-type sequence of the Protein G B1 domain (Gronenborn et al., Science 253, 657-61, 1991) was prepared (Genscript USA Inc.) with an N-terminal FLAG tag and a C-terminal 10×His tag, each spaced from the domain by a Glycine-Glycine-Serine linker. The sequence is shown below: DYKDDDDK-GGS-TYKLILNGKTLKGETTTEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTE-GGS-HHHHHHHHHH-amber stop (SEQ ID NO: 44)
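The layout of the tagged construct can be checked with a short sketch that assembles it from the parts listed above. The variable names are illustrative only, and the amber stop is left out of the length count because it is not an amino acid.

```python
FLAG = "DYKDDDDK"   # N-terminal FLAG tag
LINKER = "GGS"      # Gly-Gly-Ser spacer
GB1 = "TYKLILNGKTLKGETTTEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTE"  # SEQ ID NO: 1
HIS10 = "H" * 10    # C-terminal 10xHis tag

# Tagged construct of SEQ ID NO: 44, without the trailing amber stop
construct = FLAG + LINKER + GB1 + LINKER + HIS10

# 1-based position of the first GB1 residue within the construct
gb1_start = construct.index(GB1) + 1

print(len(GB1), len(construct), gb1_start)
```

Scaffold numbering (positions 1 to 55) refers to the GB1 domain itself, so a scaffold position maps to the construct by adding `gb1_start - 1`.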
- This sequence was synthesized with NcoI and XbaI restriction sites at the 5′ and 3′ ends, respectively, and cloned into a display vector as an N-terminal fusion to truncated protein 3 of M13 filamentous phage. The features of the vector include a ptac promoter and StII secretion leader sequence (MKKNIAFLLASMFVFSIATNAYA; SEQ ID NO: 45). This display version allows the display of GB1 in amber suppressor bacterial strains and is useful for expression of the protein in non-suppressor strains.
- The presence of the His-tag and amber-stop at the C-terminus of the protein allows the purification of proteins/mutants without additional mutagenesis. In addition, to optimize for display of GB1 peptidic compounds, two additional constructs were tested for display-levels of GB1: (i) without His-tag and amber-stop, and (ii) with a hinge and dimerization sequence derived from a Fab-template (DKTHTCGRP; SEQ ID NO: 46) for dimeric display.
- The following oligonucleotides were prepared (Integrated DNA Technologies Inc.), for site-directed mutagenesis:
-
i) 5′-GTT ACC GAA GGC GGT TCT TCT AGA AGT GGT TCC GGT-3′ (SEQ ID NO: 47)
V T E G G S S R S G S G (SEQ ID NO: 48)
- For removal of 10×His and amber-stop
-
ii) 5′-TT ACC GAA GGC GGT TCT GAC AAA ACT CAC ACA TGC GGC CGG CCC AGT GGT TCC GGT GAT T-3′ (SEQ ID NO: 49)
V T E G G S D K T H T C G R P S G S G D F (SEQ ID NO: 50)
- For insertion of Fab-dimerization sequence to replace His-tag and amber stop
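The peptide sequences written beneath the two mutagenic oligonucleotides can be reproduced by translating the complete codons with the standard genetic code. The sketch below is an illustration only: the helper `translate` and its `frame` argument are hypothetical names. Oligonucleotide ii) begins and ends with partial codons, so it is read from its third nucleotide and the flanking V and F residues are not reproduced.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate the complete codons of dna, starting at offset frame."""
    seq = dna.replace(" ", "").upper()[frame:]
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


OLIGO_I = "GTT ACC GAA GGC GGT TCT TCT AGA AGT GGT TCC GGT"  # SEQ ID NO: 47
OLIGO_II = ("TT ACC GAA GGC GGT TCT GAC AAA ACT CAC ACA TGC GGC CGG CCC "
            "AGT GGT TCC GGT GAT T")  # SEQ ID NO: 49

peptide_i = translate(OLIGO_I)
peptide_ii = translate(OLIGO_II, frame=2)  # skip the leading partial codon "TT"

print(peptide_i)
print(peptide_ii)
print("DKTHTCGRP" in peptide_ii)  # Fab-derived hinge sequence (SEQ ID NO: 46)
```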
- Site-directed mutagenesis was performed by methods described by Kunkel et al. (Methods Enzymol., 1987, 154, 367-82) and the sequence was confirmed by DNA sequencing. For comparing display levels, phage for each construct was harvested from a 25 mL overnight culture using methods described previously (Fellouse & Sidhu, “Making antibodies in bacteria. Making and using antibodies” Howard & Kaser, Eds., CRC Press, Boca Raton, Fla., 2007). The phage concentrations were estimated using a spectrophotometer (OD268 = 1 for 5×10¹² phage/ml) and normalized to the lowest concentration. Three-fold serial dilutions of phage for each construct were prepared and added to NUNC maxisorb plates previously coated with anti-FLAG antibody (5 μg/ml) and blocked with BSA (0.2% BSA in PBS). The plates were washed and assayed with anti-M13-HRP to detect binding. The HRP signal was plotted as a function of phage concentration.
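The normalization and dilution arithmetic in this protocol can be sketched as follows. The conversion factor (OD268 of 1 corresponding to 5×10¹² phage/ml) is taken from the text. The construct labels, the OD268 readings and the number of dilution steps are hypothetical placeholders, not experimental values.

```python
PHAGE_PER_ML_PER_OD268 = 5e12  # conversion factor stated in the text


def phage_concentration(od268, dilution_factor=1.0):
    """Estimate phage/ml from an OD268 reading of a sample diluted by dilution_factor."""
    return od268 * dilution_factor * PHAGE_PER_ML_PER_OD268


def serial_dilutions(start, fold=3.0, steps=8):
    """Return concentrations for a fold-wise serial dilution, beginning at start."""
    return [start / fold ** i for i in range(steps)]


# Hypothetical OD268 readings for the three display constructs
readings = {"His-amber": 0.42, "no-His": 0.30, "Fab-hinge": 0.36}
conc = {name: phage_concentration(od) for name, od in readings.items()}

# Normalize all constructs to the lowest concentration, then dilute three-fold
lowest = min(conc.values())
series = serial_dilutions(lowest, fold=3.0, steps=8)

for c in series:
    print(f"{c:.2e} phage/ml")
```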
- The mutational and insertion tolerance of the GB1 loops was tested by randomizing the loops and beta-turns and selecting for stably folded proteins. The loop lengths were varied from 4 to 6 residues and randomized with NNK codons. The beta-turn and loop residues of GB1 are indicated below:
-
(SEQ ID NO: 1) TYKLILNGKTLKGETTTEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTE
B1: positions 8-11; L1: positions 18-21; L2: positions 37-40; B2: positions 46-49
- Regions B1 and L2 are contiguous and regions L1 and B2 are contiguous. These loop/turn regions were randomized together to produce libraries for screening. Site-directed mutagenesis (Kunkel 1987) was used to introduce triple stop codons in the loop pairs. Because the wild-type protein is more stable, it would otherwise have a selective advantage over the rest of the library. The following oligonucleotides were used to make the stop-templates (Integrated DNA Technologies, Inc.):
-
B1-L2 stop template
SEQ ID NO: 51 5′-TAC AAA CTG ATT CTG AAC TAA TAA TAA AAA GGT GAA ACC ACG AC-3′ (For B1)
SEQ ID NO: 52 5′-G TAC GCC AAC GAT AAT TAA TAA TAA GAA TGG ACC TAC GAT G-3′ (For L2)
L1-B2 stop template
SEQ ID NO: 53 5′-GGT GAA ACC ACG ACC TAA TAA TAA GCA GCA ACG GCA GAA AAA-3′ (For L1)
SEQ ID NO: 54 5′-GT GAA TGG ACC TAC GAT TAA TAA TAA ACC TTC ACG GTT ACC G-3′ (For B2)
- These stop templates were mutated to construct the loop libraries using methods described in previous protocols (Kunkel 1987). The following oligonucleotides were used for randomization (Integrated DNA Technologies, Inc.):
-
Library B1-L2
SEQ ID NO: 55 5′-TAC AAA CTG ATT CTG AAC NNK NNK NNK NNK AAA GGT GAA ACC ACG AC-3′
SEQ ID NO: 56 5′-TAC AAA CTG ATT CTG AAC NNK NNK NNK NNK NNK AAA GGT GAA ACC ACG AC-3′
SEQ ID NO: 57 5′-TAC AAA CTG ATT CTG AAC NNK NNK NNK NNK NNK NNK AAA GGT GAA ACC ACG AC-3′
SEQ ID NO: 58 5′-G TAC GCC AAC GAT AAT NNK NNK NNK NNK GAA TGG ACC TAC GAT G-3′
SEQ ID NO: 59 5′-G TAC GCC AAC GAT AAT NNK NNK NNK NNK NNK GAA TGG ACC TAC GAT G-3′
SEQ ID NO: 60 5′-G TAC GCC AAC GAT AAT NNK NNK NNK NNK NNK NNK GAA TGG ACC TAC GAT G-3′
Library L1-B2
SEQ ID NO: 61 5′-GGT GAA ACC ACG ACC NNK NNK NNK NNK GCA GCA ACG GCA GAA AAA-3′
SEQ ID NO: 62 5′-GGT GAA ACC ACG ACC NNK NNK NNK NNK NNK GCA GCA ACG GCA GAA AAA-3′
SEQ ID NO: 63 5′-GGT GAA ACC ACG ACC NNK NNK NNK NNK NNK NNK GCA GCA ACG GCA GAA AAA-3′
SEQ ID NO: 64 5′-GT GAA TGG ACC TAC GAT NNK NNK NNK NNK ACC TTC ACG GTT ACC G-3′
SEQ ID NO: 65 5′-GT GAA TGG ACC TAC GAT NNK NNK NNK NNK NNK ACC TTC ACG GTT ACC G-3′
SEQ ID NO: 66 5′-GT GAA TGG ACC TAC GAT NNK NNK NNK NNK NNK NNK ACC TTC ACG GTT ACC G-3′
- The number of transformants was 1×10⁹ for Library B1-L2 and 1×10¹⁰ for Library L1-B2. The selections were performed using the methods described below, except that the library was added directly to selection wells coated with anti-FLAG antibody (5 μg/ml diluted in PBT) and there was no preincubation step. Selections on anti-FLAG were performed to identify folded variants (misfolded proteins are cleaved, thereby losing the N-terminal FLAG tag). Three rounds of selection (8 washes/round) were performed, as good enrichment was observed in pool ELISA at Rounds 2 and 3.
- The results of anti-FLAG selections of the loop libraries showed that all loops tolerated mutations, including insertion mutations, while maintaining the structure of the scaffold. The following exemplary loop sequences were identified following anti-FLAG selection:
-
TABLE 1. Anti-FLAG selection of loop libraries B1-L2 and L1-B2. Each loop sequence is listed in residue order across the indicated scaffold positions and insertion positions a and b.

Loop 1               SEQ     Loop 3                SEQ     Loop 2                SEQ     Loop 4                SEQ
(8, 9, 10, 11, a, b) ID NO:  (37, 38, 39, 40, a, b) ID NO: (18, 19, 20, 21, a, b) ID NO: (46, 47, 48, 49, a, b) ID NO:
WPCGV                67      QVGS                  104     GRRT                  135     LIPNCY                166
EVGGV                68      GVWSQG                105     FECGWG                136     SSALKR                167
SSAWR                69      WGCR                  106     DRGS                  137     ELGG                  168
CRGT                 70      STLGG                 107     TCTP                  138     CARRHC                169
WGEE                 71      FVLAHS                108     VEGG                  139     CWPSG                 170
GSKTG                72      RHAM                  109     SLDER                 140     GASINC                171
ASTG                 73      TKFC                  110     GGAE                  141     GCGR                  172
GGRWR                74      FCGSRG                111     AFEAE                 142     YKCTDD                173
RGGE                 75      MFTE                  112     PESIMR                143     CRGPR                 174
SDHS                 76      GVGG                  113     GEVT                  144     SSVG                  175
SDGM                 77      LRGL                  114     SSVDG                 145     ACLGG                 176
NAHR                 78      RRIQCG                115     VGGA                  146     QNCEM                 177
CGEPE                79      QNLV                  116     GWCAPR                147     KERGAG                178
THGA                 80      YTDALS                117     GECWG                 148     PDEMV                 179
TGLVR                81      KAVSVR                118     HHGCRA                149     NSDQQ                 180
GACVR                82      HGRTAG                119     CDDR                  150     GAGG                  181
GQQH                 83      RGVV                  120     DWGR                  151     QGCGE                 182
GTSRE                84      VWLG                  121     TRGN                  152     CPSR                  183
CATTW                85      GEDA                  122     DSSA                  153     SDGC                  184
GVAG                 86      SVWEC                 123     LSCQ                  154     AGSSP                 185
CARYG                87      SKYVLG                124     CVETR                 155     APQVG                 186
LDFLC                88      APLRMQ                125     VVGE                  156     GCSA                  187
CNTR                 89      YGWKH                 126     RPTSDM                157     GCRGES                188
LPSR                 90      GCGSRL                127     WEDTCV                158     PRPDA                 189
RDIY                 91      DAMCKG                128     SCLG                  159     SGNLGG                190
GWGGAW               92      RGKY                  129     KEVKQ                 160     RGMA                  191
LCVPIN               93      EGGG                  130     DSSV                  161     EGGG                  192
WEKED                94      DSSCG                 131     CTLK                  162     RRDDE                 193
WGSQ                 95      GIGVA                 132     PSGH                  163     LPYP                  194
GDHAFS               96      MCSSG                 133     WSQC                  164     GRAG                  195
WGGGAC               97      CPTR                  134     QCNN                  165     YRLGR                 196
GCVK                 98      SIIL                  267
EGHSA                99      QRYD                  268
GYGGR                100     KEYYNM                269
CCGL                 101     GGHS                  270
KDGG                 102     EFFS                  271
TSNGV                103     GVVLK                 272

- The solvent accessible surface area (SASA) for each residue in the Protein Data Bank (PDB) structure 3GB1 was estimated using the GETarea tool (Fraczkiewicz & Braun, “Exact and efficient analytical calculation of the accessible surface areas and their gradients for macromolecules,” J. Comput. Chem. 1998, 19, 319-333). This tool also calculates the ratio of SASA in the structure compared to SASA in a random coil. A ratio of 0.4 was used to select solvent accessible residues (shown in bold): TYKLILNGKTLKGETTTEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTE (SEQ ID NO: 1).
- Various contiguous stretches of solvent-accessible residues were selected for randomization (shown in dark in FIGS. 5 to 10), taking into account the oligonucleotide length and homology requirements for Kunkel mutagenesis. The parent sequence is also shown in FIG. 3 with the numbering scheme and loop/beta-turn regions defined.
- In addition, positions in the loops were selected for mutations that include insertion of 0, 1 or 2 additional amino acid residues in addition to substitution. Library 1: +0-2 insertions at position 38; Library 2: +0-2 insertions at position 19; Library 3: +2 insertions at position 1, +0-2 insertions at positions 19 and 47; Library 4: +0-2 insertions at positions 9 and 38, +1 insertion at position 55; Library 5: +0-2 insertions at position 9, +1 insertion at position 55; Library 6: +1 insertion at position 1, +0-2 insertions at position 47.
- The following oligonucleotides were prepared (Integrated DNA Technologies) to make the libraries using the Kunkel mutagenesis method:
-
-
(SEQ ID NO: 197) 5′-ACGACCGAAGCAGTG KHT KHT KHT KHT GCA KHT KHT GTT TTC KHT KHT TAC GCC KHT KHT AAT KHT KHT KHT KHT KHT TGGACCTACGATGAT-3′
(SEQ ID NO: 198) 5′-ACGACCGAAGCAGTG KHT KHT KHT KHT GCA KHT KHT GTT TTC KHT KHT TAC GCC KHT KHT AAT KHT KHT KHT KHT KHT KHT TGGACCTACGATGAT-3′
(SEQ ID NO: 199) 5′-ACGACCGAAGCAGTG KHT KHT KHT KHT GCA KHT KHT GTT TTC KHT KHT TAC GCC KHT KHT AAT KHT KHT KHT KHT KHT KHT KHT TGGACCTACGATGAT-3′
These oligonucleotides include the variable regions where each variant amino acid position is encoded by a KHT codon. SEQ ID NOs: 197-199 include insertion mutations of +0, 1 or 2 additional variant amino acids, respectively, at the position equivalent to position 38 of the scaffold.
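For reference, the amino acids encoded by the degenerate codons used in these examples (NNK for the loop libraries, KHT for the libraries here) can be enumerated from the standard genetic code. The sketch below uses the IUPAC nucleotide codes K = G/T, H = A/C/T and N = any base. The helper names `expand` and `encoded` are hypothetical, and stop codons are represented by `*`.

```python
from itertools import product

# IUPAC nucleotide codes needed for the degenerate codons discussed here
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "K": "GT", "H": "ACT", "N": "ACGT"}

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def expand(degenerate_codon):
    """List every concrete codon matched by a degenerate codon."""
    return ["".join(c) for c in product(*(IUPAC[b] for b in degenerate_codon))]


def encoded(degenerate_codon):
    """Sorted set of residues (and '*' for stop) encoded by a degenerate codon."""
    return sorted({CODON_TABLE[c] for c in expand(degenerate_codon)})


kht = encoded("KHT")
nnk = encoded("NNK")

print(len(expand("KHT")), "".join(kht))
print(len(expand("NNK")), "".join(nnk))
```

Under these definitions KHT covers a small set of residues with no stop codon, whereas NNK covers all twenty amino acids but also includes the amber codon TAG.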
-
(SEQ ID NO: 200) 5′-GGTGAAACCACGACC KHT KHT KHT KHT KHT KHT KHT GCA KHT KHT KHT TTC KHT KHT KHT GCC KHT KHT AATGGCGTGGATGGT-3′
(SEQ ID NO: 201) 5′-GGTGAAACCACGACC KHT KHT KHT KHT KHT KHT KHT KHT GCA KHT KHT KHT TTC KHT KHT KHT GCC KHT KHT AATGGCGTGGATGGT-3′
(SEQ ID NO: 202) 5′-GGTGAAACCACGACC KHT KHT KHT KHT KHT KHT KHT KHT KHT GCA KHT KHT KHT TTC KHT KHT KHT GCC KHT KHT AATGGCGTGGATGGT-3′
These oligonucleotides include the variable regions where each variant amino acid position is encoded by a KHT codon. SEQ ID NOs: 200-202 include insertion mutations of +0, 1 or 2 additional variant amino acids, respectively, at the position equivalent to position 19 of the scaffold.
(SEQ ID NO: 203) 5′-GATGATAAAGGCGGTAGC KHT KHT KHT TACAAACTGATTCTGAAC-3′
(SEQ ID NO: 204) 5′-AAAGGTGAAACCACGACC KHT KHT KHT KHT KHT KHT KHT GCAGAAAAAGTTTTCAAA-3′
(SEQ ID NO: 205) 5′-AAAGGTGAAACCACGACC KHT KHT KHT KHT KHT KHT KHT KHT GCAGAAAAAGTTTTCAAA-3′
(SEQ ID NO: 206) 5′-AAAGGTGAAACCACGACC KHT KHT KHT KHT KHT KHT KHT KHT KHT GCAGAAAAAGTTTTCAAA-3′
(SEQ ID NO: 207) 5′-GATGGTGAATGGACCTAC KHT KHT KHT KHT KHT ACCTTCACGGTTACCGAA-3′
(SEQ ID NO: 208) 5′-GATGGTGAATGGACCTAC KHT KHT KHT KHT KHT KHT ACCTTCACGGTTACCGAA-3′
(SEQ ID NO: 209) 5′-GATGGTGAATGGACCTAC KHT KHT KHT KHT KHT KHT KHT ACCTTCACGGTTACCGAA-3′
These oligonucleotides include the variable regions where each variant amino acid position is encoded by a KHT codon. SEQ ID NO: 203 includes an insertion mutation of +2 variant amino acids at the position equivalent to position 1 of the scaffold. SEQ ID NOs: 204-206 include mutations of +0, 1 or 2 additional variant amino acids, respectively, at the position equivalent to position 19 of the scaffold. SEQ ID NOs: 207-209 include mutations of +0, 1 or 2 additional variant amino acids, respectively, at the position equivalent to position 47 of the scaffold.
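To see how the fixed homology arms and the variable KHT positions map onto the encoded peptide, an oligonucleotide can be translated in frame with each KHT codon written as X. The sketch below is illustrative only: it assumes the standard genetic code and that each oligonucleotide as listed starts in the coding frame, and the function name `translate_oligo` is hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_oligo(oligo, variable_codon="KHT"):
    """Translate an in-frame oligonucleotide, writing each variable codon as 'X'."""
    seq = oligo.replace(" ", "")
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    peptide = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        peptide.append("X" if codon == variable_codon else CODON_TABLE[codon])
    return "".join(peptide)


# SEQ ID NO: 203 and SEQ ID NO: 207 as listed above (end labels omitted).
seq_203 = "GATGATAAAGGCGGTAGC KHT KHT KHT TACAAACTGATTCTGAAC"
seq_207 = "GATGGTGAATGGACCTAC KHT KHT KHT KHT KHT ACCTTCACGGTTACCGAA"

print(translate_oligo(seq_203))  # DDKGGSXXXYKLILN
print(translate_oligo(seq_207))  # DGEWTYXXXXXTFTVTE
```

In the second output, the Y, five X positions and the following T reproduce the pattern YXXXXXT recited for the +0 insertion variant at position 47.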
(SEQ ID NO: 210) 5′-ACGTACAAACTGATTCTG KHT KHT KHT KHT KHT KHT GGTGAAACCACGACCGAA-3′
(SEQ ID NO: 211) 5′-ACGTACAAACTGATTCTG KHT KHT KHT KHT KHT KHT KHT GGTGAAACCACGACCGAA-3′
(SEQ ID NO: 212) 5′-ACGTACAAACTGATTCTG KHT KHT KHT KHT KHT KHT KHT KHT GGTGAAACCACGACCGAA-3′
(SEQ ID NO: 213) 5′-AAACAGTACGCCAACGAT KHT KHT KHT KHT KHT KHT TGGACCTACGATGATGCG-3′
(SEQ ID NO: 214) 5′-AAACAGTACGCCAACGAT KHT KHT KHT KHT KHT KHT KHT TGGACCTACGATGATGCG-3′
(SEQ ID NO: 215) 5′-AAACAGTACGCCAACGAT KHT KHT KHT KHT KHT KHT KHT KHT TGGACCTACGATGATGCG-3′
(SEQ ID NO: 216) 5′-ACGAAAACCTTCACGGTT KHT KHT KHT GGCGGTTCTGACAAAACT-3′
These oligonucleotides include the variable regions where each variant amino acid position is encoded by a KHT codon. SEQ ID NOs: 210-212 include mutations of +0, 1 or 2 additional variant amino acids, respectively, at the position equivalent to position 9 of the scaffold. SEQ ID NOs: 213-215 include mutations of +0, 1 or 2 additional variant amino acids, respectively, at the position equivalent to position 38 of the scaffold. SEQ ID NO: 216 includes an insertion mutation of +1 variant amino acid at the position equivalent to position 55 of the scaffold.
(SEQ ID NO: 217) 5′-AAAGGCGGTAGCACGTAC KHT CTG KHT CTG KHT KHT KHT KHT KHT KHT KHT KHT ACC KHT ACCGAAGCAGTGGATGCA-3′
(SEQ ID NO: 218) 5′-AAAGGCGGTAGCACGTAC KHT CTG KHT CTG KHT KHT KHT KHT KHT KHT KHT KHT KHT ACC KHT ACCGAAGCAGTGGATGCA-3′
(SEQ ID NO: 219) 5′-AAAGGCGGTAGCACGTAC KHT CTG KHT CTG KHT KHT KHT KHT KHT KHT KHT KHT KHT KHT ACC KHT ACCGAAGCAGTGGATGCA-3′
(SEQ ID NO: 220) 5′-GATGCGACGAAAACCTTC KHT GTT KHT KHT KHT GGCGGTTCTGACAAAACT-3′
These oligonucleotides include the variable regions where each variant amino acid position is encoded by a KHT codon. SEQ ID NOs: 217-219 include mutations of +0, 1 or 2 additional variant amino acids, respectively, at the position equivalent to position 9 of the scaffold. SEQ ID NO: 220 includes an insertion mutation of +1 variant amino acid at the position equivalent to position 55 of the scaffold.
(SEQ ID NO: 221) 5′-GATGATAAAGGCGGTAGC KHT KHT TAC KHT CTG KHT CTG KHT GGCAAAACCCTGAAAGGT-3′
(SEQ ID NO: 222) 5′-GATAATGGCGTGGATGGT KHT TGG KHT TAC KHT KHT KHT KHT KHT KHT TTC KHT GTT KHT GAAGGCGGTTCTGACAAA-3′
(SEQ ID NO: 223) 5′-GATAATGGCGTGGATGGT KHT TGG KHT TAC KHT KHT KHT KHT KHT KHT KHT TTC KHT GTT KHT GAAGGCGGTTCTGACAAA-3′
(SEQ ID NO: 224) 5′-GATAATGGCGTGGATGGT KHT TGG KHT TAC KHT KHT KHT KHT KHT KHT KHT KHT TTC KHT GTT KHT GAAGGCGGTTCTGACAAA-3′
These oligonucleotides include the variable regions where each variant amino acid position is encoded by a KHT codon. SEQ ID NO: 221 includes an insertion mutation of +1 variant amino acid at the position equivalent to position 1 of the scaffold. SEQ ID NOs: 222-224 include mutations of +0, 1 or 2 additional variant amino acids, respectively, at the position equivalent to position 47 of the scaffold.
- The libraries were prepared using the same method described above for the GB1 template with Fab dimerization sequence (Fellouse & Sidhu, 2007). Oligonucleotides with 0/1/2 insertions have the same homology regions and compete for binding to the template. Therefore, they were pooled together (equimolar ratio) and treated as a single oligonucleotide for mutagenesis. The constructed libraries were pooled together for a total diversity of 3.5×10¹⁰ transformants. Selections were performed against L-VEGF and D-VEGF using a method as described below, with the exception that 10 selection wells were used for Round 1.
- Selections were also performed against 3BP2-SH2, ABL-SH3 and v-Src-SH3 proteins using similar methods to those described below.
- Individual clones were analyzed by direct-binding ELISA as described below and by single-point competitive ELISA (Fellouse & Sidhu, 2007).
- 4.1 Library Selections Against VEGF Protein and Negative Selection with BSA
- The selection procedure is essentially the same as described in previous protocols (Fellouse & Sidhu, 2007) with some minor changes. Although the method below is described for L-VEGF, the method can be adapted to screen for binding to any target. The media and buffer recipes are the same as in the described protocol.
- 1. Coat NUNC Maxisorp plate wells with 100 μl of L-VEGF (5 μg/ml in PBS) for 2 h at room temperature. Coat 5 wells for selection and 1 well for phage pool ELISA.
2. Remove the coating solution and block for 1 h with 200 μl of PBS, 0.2% BSA. At the same time, block an uncoated well as a negative control for pool ELISA. Also block 7 wells for pre-incubation of library on a separate plate.
3. Remove the block solution from the pre-incubation plate and wash four times with PT buffer.
4. Add 100 μl of library phage solution (precipitated and resuspended in PBT buffer) to each blocked well. Incubate at room temperature for 1 h with gentle shaking.
5. Remove the block solution from selection plate and wash four times with PT buffer.
6. Transfer library phage solution from the pre-incubation plate to the selection plate (5 selection wells + 2 controls for pool ELISA).
7. Remove the phage solution and wash 8-10 times with PT buffer (increased based on pool ELISA signal from previous round).
8. To elute bound phage from selection wells, add 100 μl of 100 mM HCl. Incubate 5 min at room temperature. Transfer the HCl solution to a 1.5-ml microfuge tube. Adjust to neutral pH with 11 μl of 1.0 M Tris-HCl, pH 11.0.
9. In the meantime add 100 μl of anti-M13 HRP conjugate (1:5000 dilution in PBT buffer) to the control wells and incubate for 30 min.
10. Wash control wells four times with PT buffer. Add 100 μl of freshly prepared TMB substrate. Allow color to develop for 5-10 min.
11. Stop the reaction with 100 μl of 1.0 M H3PO4 and read absorbance at 450 nm in a microtiter plate reader. The enrichment ratio can be calculated as the ratio of signal from coated vs uncoated well.
12. Add 250 μl eluted phage solution to 2.5 ml of actively growing E. coli XL1-Blue (OD600<0.8) in 2YT/tet medium. Incubate for 20 min at 37° C. with shaking at 200 rpm.
13. Add M13KO7 helper phage to a final concentration of 10¹⁰ phage/ml. Incubate for 45 min at 37° C. with shaking at 200 rpm.
14. Transfer the culture from the antigen-coated wells to 25 volumes of 2YT/carb/kan medium and incubate overnight at 37° C. with shaking at 200 rpm.
15. Isolate phage by precipitation with PEG/NaCl solution and resuspend in 1.0 ml of PBT buffer.
16. Repeat the selection cycle for 4 rounds.
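The enrichment ratio mentioned in step 11 is a simple quotient of the two pool ELISA absorbances. A minimal Python sketch follows; the function name `enrichment_ratio` and all absorbance values are hypothetical and are shown only to illustrate the calculation, not as results from the selections described here.

```python
def enrichment_ratio(a450_coated, a450_uncoated):
    """Ratio of pool ELISA signal from the target-coated well to the uncoated control well."""
    if a450_uncoated <= 0:
        raise ValueError("control absorbance must be positive")
    return a450_coated / a450_uncoated


# Hypothetical A450 readings per round: (target-coated well, uncoated control well).
readings = {1: (0.12, 0.10), 2: (0.45, 0.09), 3: (1.30, 0.10), 4: (2.10, 0.10)}

ratios = {rnd: enrichment_ratio(*pair) for rnd, pair in readings.items()}
for rnd, ratio in ratios.items():
    print(f"Round {rnd}: enrichment ratio = {ratio:.1f}")
```

A ratio that rises from round to round suggests that target-binding phage are being enriched; the number of washes in step 7 can then be increased in the following round.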
4.2. Negative Selection with GST Tagged Protein
- A more stringent negative selection procedure is as follows. The selection process is essentially the same as described above except that:
i) For Rounds 1 and 2, the libraries were pre-incubated on GST-coated (10 μg/ml in PBS) and blocked wells.
ii) For Rounds 3 and 4, the libraries were pre-incubated with 0.2 mg/ml GST in solution for 1 hr before transfer to selection wells.
iii) The control wells for pool ELISA were coated with GST (5 μg/ml in PBS).
- Misfolded proteins are degraded in the periplasm and will not be displayed on phage (Missiakas & Raina, “Protein misfolding in the cell envelope of Escherichia coli: new signaling pathways,” Trends in Biochemical Sciences, 1997, 22, 59-63). Stably folded proteins can therefore be selected for display of the N-terminal FLAG tag.
- The selections were performed on the GB1 Loop libraries by a method similar to the one described above except that the library was directly added to selection wells coated with anti-FLAG antibody (5 μg/ml diluted in PBT) and there was no preincubation step. Only three rounds of selection were performed as good enrichment was observed in Pool ELISA at Rounds 2 and 3.
- The following protocol is an adapted version of previous protocols (Fellouse & Sidhu 2007; Tonikian et al., “Identifying specificity profiles for peptide recognition modules from phage-displayed peptide libraries,” Nat. Protoc., 2007, 2, 1368-86):
- 1. Inoculate 450 μl aliquots of 2YT/carb/KO7 medium in 96-well microtubes with single colonies harboring phagemids and grow for 21 hrs at 37° C. with shaking at 200 rpm.
2. Centrifuge at 4,000 rpm for 10 min and transfer phage supernatants to fresh tubes.
3. Coat 3 wells of a 384-well NUNC Maxisorp plate per clone with 2 μg/ml of L-VEGF, Neutravidin and Erbin-GST, respectively, and leave one well uncoated. Incubate for 2 hrs at room temperature and block the plates (all 4 wells).
4. Wash the plate four times with PT buffer.
5. Transfer 30 μl of phage supernatant to each well and incubate for 2 hrs at room temperature with gentle shaking.
6. Wash four times with PT buffer.
7. Add 30 μl of anti-M13-HRP conjugate (diluted 1:5000 in PBT buffer). Incubate 30 min with gentle shaking.
8. Wash four times with PT buffer
9. Add 30 μl of freshly prepared TMB substrate. Allow color to develop for 5-10 min.
10. Stop the reaction with 100 μl of 1.0 M H3PO4 and read absorbance at 450 nm in a microtiter plate reader.
- Although the particular embodiments have been described in some detail by way of illustration and example for purposes of clarity of understanding, it is readily apparent in light of the teachings of this invention that certain changes and modifications may be made thereto without departing from the spirit or scope of the appended claims.
- Accordingly, the preceding merely illustrates the principles of the invention. Various arrangements may be devised which, although not explicitly described or shown herein, embody the principles of the invention and are included within its spirit and scope. Furthermore, all examples and conditional language recited herein are principally intended to aid the reader in understanding the principles of the invention and the concepts contributed by the inventors to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Moreover, all statements herein reciting principles, aspects, and embodiments of the invention as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents and equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure. The scope of the present invention, therefore, is not intended to be limited to the exemplary embodiments shown and described herein. Rather, the scope and spirit of present invention is embodied by the appended claims.
Claims (47)
1. A library comprising 50 or more distinct GB1 peptidic compounds;
wherein each compound of the library comprises a β1-β2 region and has three or more different non-core mutations in a region outside of the β1-β2 region.
2. The library according to claim 1 , wherein the library comprises 1×10⁴ or more distinct compounds.
3-4. (canceled)
5. The library according to claim 1 , wherein each compound of the library comprises six or more different non-core mutations in a region outside of the β1-β2 region.
6. The library according to claim 1 , wherein each compound of the library comprises ten or more different mutations.
7-13. (canceled)
14. The library according to claim 6 , wherein the ten or more different mutations are located at positions selected from the group consisting of positions 21-24, 26, 27, 30, 31, 34, 35, 37-41.
15. The library according to claim 6 , wherein the ten or more different mutations are located at positions selected from the group consisting of positions 18-24, 26-28, 30-32, 34 and 35.
16. The library according to claim 6 , wherein the ten or more different mutations are located at positions selected from the group consisting of positions 1, 18-24 and 45-49.
17. The library according to claim 6 , wherein the ten or more different mutations are located at positions selected from the group consisting of positions 7-12, 36-41, 54 and 55.
18. The library according to claim 6 , wherein the ten or more different mutations are located at positions selected from the group consisting of positions 3, 5, 7-14, 16, 52, 54 and 55.
19. The library according to claim 6 , wherein the ten or more different mutations are located at positions selected from the group consisting of positions 1, 3, 5, 7, 41, 43, 45-50, 52 and 54.
20. The library according to claim 6 , wherein each compound of the library comprises five or more different mutations in the α1 region.
21-23. (canceled)
24. The library according to claim 6 , wherein each compound of the library comprises three or more different mutations in the β3-β4 region.
25-30. (canceled)
31. The library according to claim 6 , wherein each compound of the library comprises two or more different mutations in the region between the α1 and β3 regions.
32. (canceled)
33. The library according to claim 6 , wherein each compound of the library comprises ten or more different mutations in the β1-β2 region.
34-35. (canceled)
36. The library according to claim 33 , wherein P1 is β1-β2 and P2 is β3-β4 such that the compound is described by the formula (II):
β1-β2-α1-β3-β4 (II)
wherein β1, β2, β3 and β4 are independently beta-strand domains; and
β1, β2, α1, β3 and β4 are connected independently by linking sequences of between 1 and 10 residues in length.
37-38. (canceled)
39. The library according to claim 36 , wherein each compound of the library is described by a formula independently selected from the group consisting of:
F1-V1-F2 (III);
F3-V2-F4 (IV);
V3-F5-V4-F6-V5-F7 (V);
F8-V6-F9-V7-F10-V8 (VI);
V9-F11-V10 (VII); and
V11-F12-V12 (VIII)
wherein F1, F2, F3, F4, F5, F6, F7, F8, F9, F10, F11 and F12 are fixed regions and V1, V2, V3, V4, V5, V6, V7, V8, V9, V10, V11 and V12 are variable regions;
wherein each fixed region is common to all compounds of the same formula and each compound of the library has a distinct variable region.
40. The library according to claim 39 , wherein each compound of the library is described by formula (III), wherein:
F1 comprises a sequence having 75% or more amino acid sequence identity to the amino acid sequence set forth in SEQ ID NO: 2;
F2 comprises a sequence having 75% or more amino acid sequence identity to an amino acid sequence set forth in SEQ ID NO: 3; and
V1 comprises a sequence that comprises 10 or more mutations compared to a parent amino acid sequence set forth in SEQ ID NO: 4.
41. The library according to claim 40 , wherein V1 comprises a sequence of the formula:
wherein each X is independently a mutation that comprises substitution with a variant amino acid, wherein the mutation at position 19 of V1 comprises insertion of 0, 1 or 2 additional variant amino acids.
42. (canceled)
43. The library according to claim 39 , wherein each compound of the library is described by formula (IV), wherein:
F3 comprises a sequence having 75% or more amino acid sequence identity to the amino acid sequence set forth in SEQ ID NO: 7;
F4 comprises a sequence having 75% or more amino acid sequence identity to an amino acid sequence set forth in SEQ ID NO: 8; and
V2 comprises a sequence that comprises 10 or more mutations compared to a parent amino acid sequence set forth in SEQ ID NO: 9.
44. The library according to claim 39 , wherein V2 comprises a sequence of the formula:
wherein each X is independently a mutation that comprises substitution with a variant amino acid, wherein the mutation at position 3 of V2 comprises insertion of 0, 1 or 2 additional variant amino acids.
45. (canceled)
46. The library according to claim 39 , wherein each compound of the library is described by formula (V), wherein:
F5 comprises a sequence having 75% or more amino acid sequence identity to the amino acid sequence set forth in SEQ ID NO: 12;
F6 comprises a sequence having 75% or more amino acid sequence identity to an amino acid sequence set forth in SEQ ID NO: 13;
F7 comprises a sequence having 75% or more amino acid sequence identity to an amino acid sequence set forth in SEQ ID NO: 14;
V3 comprises a sequence that comprises one or more mutations compared to a parent amino acid sequence that is TY;
V4 comprises a sequence that comprises 7 or more mutations compared to a parent amino acid sequence set forth in SEQ ID NO: 15; and
V5 comprises a sequence that comprises 5 or more mutations compared to a parent amino acid sequence set forth in SEQ ID NO: 16.
47. The library according to claim 46 , wherein:
V3 comprises a sequence of the formula XY;
V4 comprises a sequence of the formula TXXXXXXXA (SEQ ID NO: 17); and
V5 comprises a sequence of the formula YXXXXXT (SEQ ID NO: 18);
wherein each X is independently a mutation that comprises substitution with a variant amino acid, wherein the mutation at position 1 of V3 comprises insertion of 2 additional variant amino acids and the mutations at positions 3 and 4 of V4 and V5 each independently comprise insertion of 0, 1 or 2 additional variant amino acids.
48. (canceled)
49. The library according to claim 39 , wherein each compound of the library is described by formula (VI), wherein:
F8 comprises a sequence having 75% or more amino acid sequence identity to the amino acid sequence set forth in SEQ ID NO: 21;
F9 comprises a sequence having 75% or more amino acid sequence identity to the amino acid sequence set forth in SEQ ID NO: 22;
F10 comprises a sequence having 75% or more amino acid sequence identity to the amino acid sequence set forth in SEQ ID NO: 23;
V6 comprises a sequence that comprises 6 or more mutations compared to a parent amino acid sequence set forth in SEQ ID NO: 24;
V7 comprises a sequence that comprises 6 or more mutations compared to a parent amino acid sequence set forth in SEQ ID NO: 25; and
V8 comprises a sequence that comprises 2 or more mutations compared to a parent amino acid sequence that is VTE.
50. The library according to claim 49 , wherein:
V6 comprises a sequence of the formula LXXXXXXG (SEQ ID NO: 26);
V7 comprises a sequence of the formula DXXXXXXW (SEQ ID NO: 27); and
V8 comprises a sequence of the formula VXX;
wherein each X is independently a mutation that comprises substitution with a variant amino acid, wherein the mutations at position 4 of V6 and V7 each independently comprise insertion of 0, 1 or 2 additional variant amino acids and the mutation at position 3 of V8 comprises insertion of 1 additional variant amino acid.
51. (canceled)
52. The library according to claim 39 , wherein each compound of the library is described by formula (VII), wherein:
F11 comprises a sequence having 75% or more amino acid sequence identity to an amino acid sequence set forth in SEQ ID NO: 30;
V9 comprises a sequence that comprises at least 11 mutations compared to a parent amino acid sequence set forth in SEQ ID NO: 31; and
V10 comprises a sequence that comprises 3 or more mutations compared to a parent amino acid sequence set forth in SEQ ID NO: 32.
53. The library according to claim 52 , wherein:
V9 comprises a sequence of the formula TYXLXLXXXXXXXXTXT (SEQ ID NO: 33); and
V10 comprises a sequence of the formula FXVXX (SEQ ID NO: 34);
wherein each X is independently a mutation that comprises substitution with a variant amino acid, wherein the mutation at position 9 of V9 comprises insertion of 0, 1 or 2 additional variant amino acids and the mutation at position 5 of V10 comprises insertion of 1 additional variant amino acid.
54. (canceled)
55. The library according to claim 39 , wherein each compound of the library is described by formula (VIII), wherein:
F12 comprises a sequence having 75% or more amino acid sequence identity to an amino acid sequence set forth in SEQ ID NO: 37;
V11 comprises a sequence that comprises 4 or more mutations compared to a parent amino acid sequence set forth in SEQ ID NO: 38;
V12 comprises a sequence that comprises 10 or more mutations compared to a parent amino acid sequence set forth in SEQ ID NO: 39.
56. The library according to claim 55 , wherein:
V11 comprises a sequence of the formula XYXLXLXG (SEQ ID NO: 40); and
V12 comprises a sequence of the formula GXWXYXXXXXXFXVXE (SEQ ID NO: 41);
wherein each X is independently a mutation that comprises substitution with a variant amino acid, wherein the mutation at position 8 of V12 comprises insertion of 0, 1 or 2 additional variant amino acids and the mutation at position 1 of V11 comprises insertion of 1 additional variant amino acid.
57-65. (canceled)
66. A library of polynucleotides that encodes 50 or more distinct compounds, wherein each polynucleotide encodes a GB1 peptidic compound that comprises a β1-β2 region and has three or more different non-core mutations at positions in a region outside of the β1-β2 region.
67-72. (canceled)
73. The library according to claim 66 , wherein each polynucleotide encodes a GB1 peptidic compound comprising ten or more variant amino acids at non-core positions, wherein each variant amino acid is encoded by a random codon.
74. (canceled)
75. A method comprising:
contacting a target protein with a library comprising:
50 or more distinct GB1 peptidic compounds, wherein each compound of the library comprises a β1-β2 region and has three or more different non-core mutations in a region outside of the β1-β2 region; and
identifying a compound of the library that specifically binds to the target protein.
76-79. (canceled)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US13/294,072 US20120129715A1 (en) | 2010-11-12 | 2011-11-10 | Gb1 peptidic libraries and methods of screening the same |
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US41331810P | 2010-11-12 | 2010-11-12 | |
| US41331610P | 2010-11-12 | 2010-11-12 | |
| US41333110P | 2010-11-12 | 2010-11-12 | |
| US13/294,072 US20120129715A1 (en) | 2010-11-12 | 2011-11-10 | Gb1 peptidic libraries and methods of screening the same |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| US20120129715A1 (en) | 2012-05-24 |
Family
ID=46064897
Family Applications (3)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US13/294,072 Abandoned US20120129715A1 (en) | 2010-11-12 | 2011-11-10 | Gb1 peptidic libraries and methods of screening the same |
| US13/294,078 Active 2034-10-18 US9285372B2 (en) | 2010-11-12 | 2011-11-10 | Methods and compositions for identifying D-peptidic compounds that specifically bind target proteins |
| US15/012,603 Abandoned US20160223560A1 (en) | 2010-11-12 | 2016-02-01 | Methods and Compositions for Identifying D-Peptidic Compounds that Specifically Bind Target Proteins |
Country Status (4)
| Country | Link |
|---|---|
| US (3) | US20120129715A1 (en) |
| EP (1) | EP2638063A4 (en) |
| CA (1) | CA2817579A1 (en) |
| WO (1) | WO2012078313A2 (en) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114989298A (en) * | 2019-03-22 | 2022-09-02 | 反射制药有限公司 | D-peptide compounds against VEGF |
| US20230234993A1 (en) * | 2020-12-17 | 2023-07-27 | National Marine Biodiversity Institute Of Korea | Composition for preventing or treating neurological diseases related to copper metabolism, comprising multi-copper oxidase peptide |
Families Citing this family (11)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9309291B2 (en) | 2011-12-02 | 2016-04-12 | Reflexion Pharmaceuticals, Inc. | Broad spectrum influenza A neutralizing vaccines and D-peptidic compounds, and methods for making and using the same |
| RU2592686C2 (en) | 2011-04-22 | 2016-07-27 | ВАЙЕТ ЭлЭлСи | Compositions related to clostridium difficile mutant toxin, and methods of application thereof |
| US9255154B2 (en) | 2012-05-08 | 2016-02-09 | Alderbio Holdings, Llc | Anti-PCSK9 antibodies and use thereof |
| BR122016023101B1 (en) | 2012-10-21 | 2022-03-22 | Pfizer Inc | Polypeptide, immunogenic composition comprising it, as well as recombinant cell derived from Clostridium difficile |
| US10093921B2 (en) | 2013-03-14 | 2018-10-09 | The Governing Council Of The University Of Toronto | Scaffolded peptidic libraries and methods of making and screening the same |
| CN106146627B (en) * | 2015-03-31 | 2019-11-12 | 上海业力生物科技有限公司 | Fc Specific binding proteins, IgG affinity chromatography medium and the preparation method and application thereof |
| CN108351914B (en) | 2015-10-30 | 2022-04-29 | 扬森疫苗与预防公司 | Structure-based design of D-protein ligands |
| US10489661B1 (en) | 2016-03-08 | 2019-11-26 | Ocuvera LLC | Medical environment monitoring system |
| US10600204B1 (en) | 2016-12-28 | 2020-03-24 | Ocuvera | Medical environment bedsore detection and prevention system |
| WO2020198075A2 (en) * | 2019-03-22 | 2020-10-01 | Reflexion Pharmaceuticals, Inc. | Multivalent d-peptidic compounds for target proteins |
| WO2022029512A1 (en) * | 2020-08-06 | 2022-02-10 | Tsinghua University | Chemical synthesis of large and mirror-image proteins and uses thereof |
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CA2137378C (en) | 1992-06-05 | 2008-08-05 | Stephen Brian Henry Kent | D-enzyme compositions and methods of their use |
| US5780221A (en) | 1995-05-03 | 1998-07-14 | Whitehead Institute For Biomedical Research | Identification of enantiomeric ligands |
| US6184344B1 (en) | 1995-05-04 | 2001-02-06 | The Scripps Research Institute | Synthesis of proteins by native chemical ligation |
| US6663862B1 (en) * | 1999-06-04 | 2003-12-16 | Duke University | Reagents for detection and purification of antibody fragments |
| WO2010014830A2 (en) | 2008-07-30 | 2010-02-04 | Cosmix Therapeutics Llc | Peptide therapeutics that bind vegf and methods of use thereof |
- 2011-11-10: EP application EP11846540.0A (EP2638063A4), not active, Withdrawn
- 2011-11-10: US application US13/294,072 (US20120129715A1), not active, Abandoned
- 2011-11-10: WO application PCT/US2011/060276 (WO2012078313A2), not active, Ceased
- 2011-11-10: US application US13/294,078 (US9285372B2), Active
- 2011-11-10: CA application CA2817579A (CA2817579A1), not active, Abandoned
- 2016-02-01: US application US15/012,603 (US20160223560A1), not active, Abandoned
Non-Patent Citations (1)
| Title |
|---|
| Nord et al. (1997) Nature Biotechnology volume 15 pages 772 to 777 * |
Also Published As
| Publication number | Publication date |
|---|---|
| WO2012078313A3 (en) | 2012-08-23 |
| US20120178643A1 (en) | 2012-07-12 |
| WO2012078313A2 (en) | 2012-06-14 |
| EP2638063A2 (en) | 2013-09-18 |
| US20160223560A1 (en) | 2016-08-04 |
| CA2817579A1 (en) | 2012-04-14 |
| US9285372B2 (en) | 2016-03-15 |
| EP2638063A4 (en) | 2014-04-23 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AS | Assignment | Owner name: THE GOVERNING COUNCIL OF THE UNIVERSITY OF TORONTO. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: UPPALAPATI, MARUTI; SIDHU, SACHDEV; REEL/FRAME: 028229/0549. Effective date: 20101221 |
| | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |