HK1189387B - Improved N-terminal capping modules for designed ankyrin repeat proteins

Info

Publication number: HK1189387B
Application number: HK14102499.6A
Authority: HK
Inventors: Hans Kaspar Binz
Original assignee: Molecular Partners Ag
Priority date: 2010-11-26
Filing date: 2011-11-25
Publication date: 2018-07-13

Description

Improved N-terminal capping modules for designed ankyrin repeat proteins

Technical Field

The present invention relates to improved N-terminal capping modules for designed ankyrin repeat proteins (DARPins) which confer improved thermostability to the DARPins, nucleic acids encoding such proteins, pharmaceutical compositions comprising such proteins, and the use of such proteins in the treatment of disease.

Background

In addition to antibodies, novel binding proteins or binding domains can also be used to specifically bind a target molecule (e.g., Binz, H.K., Amstutz, P. and Plückthun, A., Nat. Biotechnol. 23, 1257-1268, 2005). One such novel class of binding proteins or binding domains is based on designed repeat proteins or designed repeat domains (WO 2002/020565; Binz, H.K., Amstutz, P., Kohl, A., Stumpp, M.T., Briand, C., Forrer, P., Grütter, M.G., and Plückthun, A., Nat. Biotechnol. 22, 575-582, 2004). These designed repeat domains exploit the modular nature of repeat proteins and have N-terminal and C-terminal capping modules that prevent aggregation of the designed repeat domains by shielding the hydrophobic core of the domain (Forrer, P., Stumpp, M.T., Binz, H.K., and Plückthun, A., FEBS Letters 539, 2-6, 2003). These capping modules are based on the capping repeat sequences of the native guanine-adenine-binding protein (GA-binding protein). It has been shown that the thermal and thermodynamic stability of these designed ankyrin repeat domains can be further increased by modifying the C-terminal capping repeat sequence derived from the GA-binding protein (Interlandi, G., Wetzel, S.K., Settanni, G., Plückthun, A. and Caflisch, A., J. Mol. Biol. 375, 837-854, 2008; Kramer, M.A., Wetzel, S.K., Plückthun, A., Mittl, P.R.E., and Grütter, M.G., J. Mol. Biol. 404, 381-391, 2010). The authors introduced a total of 8 mutations into this capping module and extended its C-terminal helix by adding 3 distinct amino acids. Nevertheless, introducing these modifications into the C-terminal capping module leads to an undesired tendency of designed repeat domains carrying this mutated C-terminal capping module to dimerize. Thus, there is a need to produce repeat proteins that are further optimized by modifying the C-terminal or N-terminal capping modules or the C-terminal or N-terminal capping repeat sequences of designed ankyrin repeat domains.

In summary, there is a need for target-specific ankyrin repeat proteins with improved stability for the treatment of cancer and other pathological conditions.

The technical problem underlying the present invention is to identify novel ankyrin repeat proteins with improved stability for improved treatment of cancer and other pathological conditions. A solution to this technical problem is achieved by providing the embodiments characterized in the claims.

Disclosure of Invention

The present invention relates to binding proteins comprising at least one ankyrin repeat domain, wherein the ankyrin repeat domain comprises an N-terminal capping module having the following amino acid sequence:

GSDLGKKLLE AARAGQDDEV RILLKAGADV NA (SEQ ID NO:14) or

GSDLGKKLLE AARAGQDDEV RELLKAGADV NA (SEQ ID NO:15), wherein

amino acid residue L at position 24 of SEQ ID NO:14 or SEQ ID NO:15 is optionally replaced with V, I or A;

up to 9 amino acids of SEQ ID NO:14 or SEQ ID NO:15 at positions other than position 24 are optionally substituted with any amino acids; and

wherein G at position 1 and/or S at position 2 of SEQ ID NO:14 or SEQ ID NO:15 are optionally deleted.

In particular, the invention relates to such binding proteins, wherein the N-terminal capping module comprises the following sequence:

GSX₁LX₂KKLLE AARAGQDDEV X₃X₄LX₅X₆X₇GADV NA (SEQ ID NO:5), wherein

G at position 1 and/or S at position 2 of SEQ ID NO:5 are optionally deleted;

X₁ represents an amino acid residue G, A or D;

X₂ represents an amino acid residue G or D;

X₃ represents an amino acid residue R or E;

X₄ represents an amino acid residue I, E or V;

X₅ represents an amino acid residue L, V, I or A;

X₆ represents an amino acid residue A, K or E; and

X₇ represents an amino acid residue selected from: A, H, Y, K and R.
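The consensus of SEQ ID NO:5 can be written as a pattern for checking candidate sequences. The following Python sketch is purely illustrative and is not part of the disclosed method; the names SEQ5_PATTERN and matches_seq_id_5 are hypothetical, and the optional deletion of G at position 1 and/or S at position 2 is modelled by making each of these two residues optional.

```python
import re

# Consensus of SEQ ID NO:5 with the residues allowed at X1 to X7.
# G (position 1) and S (position 2) are each optional because either may be deleted.
SEQ5_PATTERN = re.compile(
    r"G?S?"              # positions 1-2, optionally deleted
    r"[GAD]"             # X1 (position 3)
    r"L"                 # position 4
    r"[GD]"              # X2 (position 5)
    r"KKLLEAARAGQDDEV"   # positions 6-20
    r"[RE]"              # X3 (position 21)
    r"[IEV]"             # X4 (position 22)
    r"L"                 # position 23
    r"[LVIA]"            # X5 (position 24)
    r"[AKE]"             # X6 (position 25)
    r"[AHYKR]"           # X7 (position 26)
    r"GADVNA"            # positions 27-32
)


def matches_seq_id_5(sequence: str) -> bool:
    """Return True if the sequence fits the SEQ ID NO:5 consensus."""
    cleaned = sequence.replace(" ", "").upper()
    return SEQ5_PATTERN.fullmatch(cleaned) is not None
```

Under these assumptions, SEQ ID NO:14 and SEQ ID NO:15 fit the pattern, whereas a sequence with M at position 24 does not.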

In another embodiment, the invention relates to a binding protein comprising at least one ankyrin repeat domain, wherein the ankyrin repeat domain comprises an N-terminal capping module having an amino acid sequence with at least 70% amino acid sequence identity to:

GSDLGKKLLE AARAGQDDEV RILLKAGADV NA (SEQ ID NO:14) or

GSDLGKKLLE AARAGQDDEV RELLKAGADV NA (SEQ ID NO:15) and with the proviso that the amino acid residue at position 24 of the amino acid sequence of the N-terminal capping module is L, V, I or A;

and wherein the amino acids at position 1 and/or position 2 of the N-terminal capping module are optionally deleted.
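The identity criterion of this embodiment can be illustrated with a short calculation. The Python sketch below is a simplification under stated assumptions: it compares full-length 32-residue sequences position by position without gaps, whereas in practice identity would normally be determined with an alignment program, and it does not handle modules in which position 1 and/or 2 is deleted. The function names and the default threshold argument are hypothetical.

```python
SEQ_ID_14 = "GSDLGKKLLEAARAGQDDEVRILLKAGADVNA"
SEQ_ID_15 = "GSDLGKKLLEAARAGQDDEVRELLKAGADVNA"


def percent_identity(candidate: str, reference: str) -> float:
    """Ungapped percent identity between two sequences of equal length."""
    if len(candidate) != len(reference):
        raise ValueError("this sketch only compares sequences of equal length")
    matches = sum(a == b for a, b in zip(candidate, reference))
    return 100.0 * matches / len(reference)


def satisfies_identity_embodiment(candidate: str, threshold: float = 70.0) -> bool:
    """Check >= threshold % identity to SEQ ID NO:14 or 15 and L, V, I or A at position 24."""
    cleaned = candidate.replace(" ", "").upper()
    if len(cleaned) != 32:
        return False  # deleted positions 1/2 are outside the scope of this sketch
    if cleaned[23] not in "LVIA":  # position 24 is index 23 (0-based)
        return False
    return any(
        percent_identity(cleaned, reference) >= threshold
        for reference in (SEQ_ID_14, SEQ_ID_15)
    )
```

For instance, the N-terminal capping module of SEQ ID NO:1 differs from SEQ ID NO:14 at three positions (24, 25 and 26) and is therefore highly identical, but it fails the proviso because position 24 is M.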

Such binding proteins exhibit improved thermostability compared to the same binding protein differing only in the N-terminal capping module, e.g., compared to binding proteins having a prior art N-terminal capping module, such as an N-terminal capping module whose amino acid sequence has the amino acid M (methionine) at position 24, e.g., SEQ ID NO:14 or SEQ ID NO:15 in which L at position 24 is replaced by M.

The invention further relates to a binding protein comprising at least one ankyrin repeat domain, wherein said ankyrin repeat domain comprises a C-terminal capping module having the following amino acid sequence:

X₁DKX₂GKTX₃X₄DX₅X₆X₇DX₈GX₉EDX₁₀AEX₁₁LQKAA (SEQ ID NO:6).

The invention further relates to nucleic acid molecules encoding the binding proteins of the invention, as well as pharmaceutical compositions comprising the binding proteins or nucleic acid molecules described above.

The invention further relates to methods of treating pathological conditions using the binding proteins of the invention.

Drawings

FIG. 1 Thermal stability of DARPin #17 and DARPin #18

Thermal denaturation traces of DARPin #17 and DARPin #18 are shown. The thermal denaturation was followed via the increase in the fluorescence intensity of the dye SYPRO Orange present in PBS at pH 7.4. The Tm values of DARPin #17 and DARPin #18 were estimated to be 64.5 °C and 71.0 °C, respectively. F, relative fluorescence units (RFU), excitation at 515-535 nm, detection at 560-580 nm; T, temperature in °C; see below for DARPin definition.

FIG. 2 Thermal stability of DARPin #19 and DARPin #20

Thermal denaturation traces of DARPin #19 and DARPin #20 are shown. The thermal denaturation was followed via the CD signal at 222 nm in PBS at pH 7.4. The Tm values of DARPin #19 and DARPin #20 were estimated to be 72.3 °C and 74.8 °C, respectively. FU, fraction unfolded; T, temperature in °C; see below for DARPin definition.

FIG. 3 Thermal stability of DARPin #21 and DARPin #23

Thermal denaturation traces of DARPin #21 and DARPin #23 are shown. The thermal denaturation was followed via the CD signal at 222 nm in PBS at pH 7.4. The Tm values of DARPin #21 and DARPin #23 were estimated to be 56.5 °C and 63.5 °C, respectively. FU, fraction unfolded; T, temperature in °C; see below for DARPin definition.

FIG. 4 Thermal stability of DARPin #24 and DARPin #26

Thermal denaturation traces of DARPin #24 and DARPin #26 are shown. The thermal denaturation was followed via the CD signal at 222 nm in PBS at pH 7.4. The Tm values of DARPin #24 and DARPin #26 were estimated to be 83 °C and 89 °C, respectively. RE, relative CD signal at 222 nm normalized to the signal measured at 20 °C; T, temperature in °C; see below for DARPin definition.

Detailed Description

The term "protein" refers to a polypeptide, wherein at least part of the polypeptide has or is capable of attaining a defined three-dimensional arrangement within and/or between its polypeptide chains by forming a secondary, tertiary or quaternary structure. If the protein comprises two or more polypeptides, each polypeptide chain may be non-covalently linked or covalently linked, for example by a disulfide bond between the two polypeptides. Protein portions, each having or capable of obtaining a defined three-dimensional arrangement by forming secondary or tertiary structures, are referred to as "protein domains". Such protein domains are well known to those skilled in the art.

The term "recombinant" as used in recombinant proteins, recombinant protein domains, recombinant binding proteins, and the like, means that the polypeptide is prepared by using recombinant DNA techniques well known to those skilled in the relevant art. For example, a recombinant DNA molecule encoding a polypeptide (e.g., prepared by gene synthesis) can be cloned into a bacterial expression plasmid (e.g., pQE30, Qiagen), a yeast expression plasmid, or a mammalian expression plasmid. When, for example, the thus constructed recombinant bacterial expression plasmid is inserted into an appropriate bacterium (e.g. Escherichia coli), the bacterium can produce the polypeptide encoded by such recombinant DNA. The correspondingly produced polypeptide is referred to as a recombinant polypeptide.

Within the scope of the present invention, the term "polypeptide" refers to a molecule consisting of one or more chains of multiple (i.e. two or more) amino acids linked via peptide bonds. Preferably, the polypeptide consists of more than eight amino acids linked via peptide bonds.

The term "polypeptide tag" refers to an amino acid sequence attached to a polypeptide/protein, wherein the amino acid sequence is useful for purifying, detecting or targeting the polypeptide/protein, or wherein the amino acid sequence improves the physicochemical behavior of the polypeptide/protein, or wherein the amino acid sequence has effector function. The individual polypeptide tags, portions and/or domains of the binding protein may be linked to each other directly or via a polypeptide linker. These polypeptide tags are well known in the art and are fully available to those skilled in the art. Examples of polypeptide tags are small polypeptide sequences allowing the detection of the polypeptide/protein, e.g. His (e.g. His-tag of SEQ ID NO:16), myc, FLAG or Strep-tags or moieties such as enzymes (e.g. enzymes like alkaline phosphatase), or moieties that can be used for targeting (such as immunoglobulins or fragments thereof) and/or as effector molecules.

The term "polypeptide linker" refers to an amino acid sequence that is capable of linking, for example, two protein domains, a polypeptide tag and a protein domain, a protein domain and a non-polypeptide moiety such as polyethylene glycol, or two sequence tags. Such additional domains, tags, non-polypeptide moieties and linkers are known to those of skill in the relevant art. A list of examples is provided in the description of patent application WO 2002/020565. Specific examples of such linkers are glycine-serine-linkers and proline-threonine-linkers of various lengths; preferably the linker has a length of 2 to 24 amino acids; more preferably the linker has a length of 2 to 16 amino acids.

The term "binding protein" refers to a protein comprising one or more binding domains, one or more biologically active compounds, and one or more polymer moieties, as explained further below. Preferably, the binding protein comprises up to four binding domains. More preferably, the binding protein comprises at most two binding domains. Most preferably, the binding protein comprises only one binding domain. Moreover, any such binding protein may comprise additional protein domains that are not binding domains, multimerization moieties, polypeptide tags, polypeptide linkers, and/or a single Cys residue. Examples of multimerization moieties are immunoglobulin heavy chain constant regions (which pair to provide a functional immunoglobulin Fc domain), and leucine zippers or polypeptides that contain free sulfhydryl groups that form intermolecular disulfide bonds between two such polypeptides. A single Cys residue may be used to conjugate other moieties to the polypeptide, for example, by using maleimide chemistry, which is well known to those skilled in the art. Preferably, the binding protein is a recombinant binding protein. Also preferably, the binding domains of the binding protein have different target specificities.

The term "binding domain" means an ankyrin repeat domain having a predetermined property as defined below. Such binding domains can be obtained by rational, or most often combinatorial, protein engineering techniques known in the art (Binz et al., 2005, supra). For example, a binding domain having a predetermined property may be obtained by a method comprising the steps of: (a) providing a diverse collection of repeat domains; and (b) screening said diverse collection and/or selecting from said diverse collection to obtain at least one repeat domain having said predetermined property. The diverse collection of repeat domains can be provided by several methods depending on the screening and/or selection system used, and can include the use of methods well known to those skilled in the art, such as phage display or ribosome display. Preferably, the binding domain is a recombinant binding domain.

The term "predetermined property" refers to a property such as binding to a target, blocking a target, activating a target-mediated reaction, enzymatic activity, and further related properties. Depending on the type of property desired, one of ordinary skill will be able to identify the format and the necessary steps for screening and/or selecting binding domains with the desired property. Preferably, the predetermined property is binding to a target.

A preferred binding protein comprises at least one repeat domain.

The terms "having binding specificity for", "specifically binds to" or "target-specific" and the like mean that the dissociation constant for binding of a binding protein or binding domain to a target in PBS is lower than the dissociation constant for binding to an unrelated protein such as E. coli maltose binding protein (MBP). Preferably, the dissociation constant for the target in PBS is at least 10-fold lower than the corresponding dissociation constant for MBP, more preferably at least 10²-fold, even more preferably at least 10³-fold, or most preferably at least 10⁴-fold lower.
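The fold criterion above amounts to a simple ratio of dissociation constants. The following Python sketch illustrates the arithmetic only; the function names and tier labels are hypothetical, and both Kd values are assumed to have been measured under the same conditions and expressed in the same unit.

```python
def specificity_fold(kd_target: float, kd_mbp: float) -> float:
    """Ratio Kd(MBP) / Kd(target); values above 1 mean the target is bound more tightly."""
    if kd_target <= 0 or kd_mbp <= 0:
        raise ValueError("dissociation constants must be positive")
    return kd_mbp / kd_target


def preference_tier(fold: float) -> str:
    """Map a fold value onto the preference levels named in the definition."""
    if fold >= 1e4:
        return "most preferred (at least 10^4-fold)"
    if fold >= 1e3:
        return "even more preferred (at least 10^3-fold)"
    if fold >= 1e2:
        return "more preferred (at least 10^2-fold)"
    if fold >= 10:
        return "preferred (at least 10-fold)"
    if fold > 1:
        return "specific"
    return "not specific"
```

As a hypothetical example, a binding domain with a Kd of 1 for its target and a Kd of 10 for MBP (same unit) has a fold value of 10 and falls into the first preferred level.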

Methods for determining the dissociation constant of protein-protein interactions, such as Surface Plasmon Resonance (SPR) -based techniques (e.g. SPR equilibrium analysis) or Isothermal Titration Calorimetry (ITC), are well known to those skilled in the art. The Kd values measured for a particular protein-protein interaction may vary if measured under different conditions (e.g., salt concentration, pH). Thus, the measurement of the Kd value is preferably performed with a standardized protein solution and a standardized buffer (such as PBS).

The term "target" refers to an individual molecule such as a nucleic acid molecule, polypeptide or protein, carbohydrate, or any other naturally occurring molecule, including any portion of such individual molecule, or a complex of two or more such molecules. The target may be a whole cell or tissue sample, or may be any non-native molecule or moiety. Preferred targets are naturally occurring or non-naturally occurring polypeptides or polypeptides containing chemical modifications, for example by natural or non-natural phosphorylation, acetylation or methylation modifications.

The following definitions of repeat proteins are based on patent application WO 2002/020565. Patent application WO 2002/020565 further contains a general description of repeat protein features, techniques and applications.

The term "repeat protein" refers to a protein comprising one or more repeat domains. Preferably each of said repeat proteins comprises up to four repeat domains. More preferably each of said repeat proteins comprises at most two repeat domains. Most preferably, each repeat protein comprises only one repeat domain. Furthermore, the repeat protein may comprise additional non-repeat protein domains, polypeptide tags and/or polypeptide linkers.

The term "repeat domain" refers to a protein domain comprising two or more consecutive repeat units (modules) as structural units, wherein the structural units have the same fold and stack tightly to create, for example, a superhelical structure with a contiguous hydrophobic core. Preferably, the repeat domain additionally comprises an N-terminal and/or C-terminal capping unit (or module). Even more preferably, the N-terminal and/or C-terminal capping unit (or module) is a capping repeat sequence.

The terms "designed repeat protein" and "designed repeat domain" refer to the repeat protein or repeat domain, respectively, obtained by the inventive procedure explained in patent application WO 2002/020565. Designed repeat proteins and designed repeat domains are synthetic and not natural. They are respectively artificial proteins or domains obtained by expression of correspondingly designed nucleic acids. Preferably, expression is carried out in eukaryotic or prokaryotic cells, such as bacterial cells, or by using a cell-free in vitro expression system. Thus, the designed ankyrin repeat protein (i.e. DARPin) corresponds to a binding protein of the invention comprising at least one ankyrin repeat domain.

The term "structural unit" refers to a partially ordered portion of a polypeptide formed by three-dimensional interactions between two or more secondary structure segments that are adjacent to each other along the polypeptide chain. Such structural units exhibit a structural motif. The term "structural motif" refers to the three-dimensional arrangement of secondary structure elements present in at least one structural unit. Structural motifs are well known to those skilled in the art. Structural units alone are not able to acquire a defined three-dimensional arrangement; however, their consecutive arrangement, for example as repeat modules of a repeat domain, leads to mutual stabilization of neighboring units, resulting in a superhelical structure.

The term "repeat unit" refers to an amino acid sequence comprising a repeat sequence motif of one or more naturally occurring repeat proteins, wherein said "repeat unit" is present in multiple copies and exhibits a defined folding topology common to all said motifs, which determines the fold of the protein. Such repeat units correspond to the "repeating structural units (repeats)" of repeat proteins described by Forrer et al. (2003, supra), or the "consecutive homologous structural units (repeats)" of repeat proteins described by Binz et al. (2004, supra). Such repeat units comprise framework residues and interacting residues. Examples of such repeat units are armadillo repeat units, leucine-rich repeat units, ankyrin repeat units, tetratricopeptide repeat units, HEAT repeat units, and leucine-rich variant repeat units. Naturally occurring proteins containing two or more such repeat units are referred to as "naturally occurring repeat proteins". The amino acid sequences of the individual repeat units of a repeat protein may have a significant number of mutations, substitutions, additions and/or deletions when compared to one another, while still substantially retaining the general pattern or motif of the repeat unit.

The term "ankyrin repeat" refers to a repeat unit that is an ankyrin repeat sequence such as described by Forrer et al (2003, supra). Ankyrin repeats are well known to those skilled in the art.

The term "framework residue" refers to an amino acid residue of a repeat unit, or the corresponding amino acid residue of a repeat module, which participates in the folding topology, i.e. which participates in the folding of said repeat unit (or module) or in its interaction with neighboring units (or modules). Such participation may be through interaction with other residues in the repeat unit (or module), or through an influence on the conformation of the polypeptide backbone as found in α-helices or β-sheets, or on amino acid segments forming linear polypeptides or loops.

The term "target interacting residue" refers to an amino acid residue of a repeat unit, or the corresponding amino acid residue of a repeat module, which participates in the interaction with a target substance. Such participation may be through direct interaction with the target substance, or through an influence on other directly interacting residues, for example by stabilizing the polypeptide conformation of the repeat unit (or module) to allow or enhance the interaction of the directly interacting residues with the target. Such framework and target interacting residues can be identified by analyzing structural data obtained by physicochemical methods, such as X-ray crystallography, NMR and/or CD spectroscopy, or by comparison with known and related structural information familiar to those skilled in the art of structural biology and/or bioinformatics.

Preferably, the repeat units used to deduce the repeat sequence motif are homologous repeat units, wherein the repeat units comprise the same structural motif, and wherein more than 70% of the framework residues of the repeat units are homologous to each other. Preferably, more than 80% of the framework residues of the repeat units are homologous. Most preferably, more than 90% of the framework residues of the repeat units are homologous. Computer programs (such as FASTA, BLAST or Gap) for determining the percent homology between polypeptides are known to those skilled in the art. Further preferably, the repeat units used to deduce the repeat sequence motif are homologous repeat units obtained from repeat domains selected against a target and having the same target specificity.

The term "repeat sequence motif" refers to an amino acid sequence that is deduced from one or more repeat units or repeat modules. Preferably, the repeat units or repeat modules are from repeat domains having binding specificity for the same target. Such repeat sequence motifs include framework residue positions and target interacting residue positions. The framework residue positions correspond to the positions of the framework residues of the repeat unit (or module). Likewise, the target interacting residue positions correspond to the positions of the target interacting residues of the repeat unit (or module). Repeat sequence motifs include fixed positions and randomized positions. The term "fixed position" refers to an amino acid position in such a repeat sequence motif, wherein the position is set to a particular amino acid. Most often, such fixed positions correspond to the positions of framework residues and/or the positions of target interacting residues specific for a certain target. The term "randomized position" denotes an amino acid position in a repeat sequence motif where two or more amino acids are allowed, for example, where any of the twenty common naturally occurring amino acids are allowed, or where most of the twenty naturally occurring amino acids are allowed, such as amino acids other than cysteine, or amino acids other than glycine, cysteine and proline. Most commonly, such randomized positions correspond to the positions of target interacting residues. However, some positions of framework residues may also be randomized.

The term "folding topology" refers to the tertiary structure of the repeating unit or repeating module. The folding topology depends on the amino acid segments forming at least part of the alpha-helix or beta-sheet, or the amino acid segments forming a linear polypeptide or loop, or any combination of alpha-helix, beta-sheet and/or linear polypeptides/loops.

The term "contiguous" refers to an arrangement in which repeat units or repeat modules are arranged in series. In designed repeat proteins, there are at least 2, usually about 2 to 6, in particular at least about 6, often 20 or more repeat units (or modules). In most cases, the repeat units (or modules) of a repeat domain will exhibit a high degree of sequence identity (identical amino acid residues at corresponding positions) or sequence similarity (amino acid residues that differ but have similar physicochemical properties), and some amino acid residues may be strongly conserved key residues. However, a high degree of sequence variability due to amino acid insertions and/or deletions and/or substitutions between different repeat units (or modules) of a repeat domain is possible as long as the common folding topology of the repeat units (or modules) is maintained.

Methods for directly determining the folding topology of repeat proteins by physicochemical means such as X-ray crystallography, NMR or CD spectroscopy are well known to the person skilled in the art. Methods for identifying and determining repeat units or repeat sequence motifs, or for identifying related protein families comprising such repeat units or motifs, such as homology searches (BLAST etc.), are well established in the field of bioinformatics and are well known to those skilled in the art. The step of refining an initial repeat sequence motif may comprise an iterative process.

The term "repeat module" refers to the repeated amino acid sequence of a designed repeat domain that is originally derived from the repeat units of naturally occurring repeat proteins. Each repeat module contained in a repeat domain is derived from one or more repeat units of a family or subfamily of naturally occurring repeat proteins (e.g., the family of armadillo repeat proteins or ankyrin repeat proteins).

A "repeat module" can comprise positions having amino acid residues present in all copies of the corresponding repeat module ("fixed positions") and positions having different or "randomized" amino acid residues ("randomized positions").

The term "capping module" refers to a polypeptide fused to the N-terminal or C-terminal repeat module of a repeat domain, wherein the capping module forms tight tertiary interactions (i.e., tertiary structure interactions) with the repeat module, thereby providing a cap that shields the hydrophobic core of the repeat module from the solvent on the side not in contact with the consecutive repeat module. The N-terminal and/or C-terminal capping module may be, or may be derived from, a capping unit or other structural unit present in a naturally occurring repeat protein adjacent to a repeat unit. The term "capping unit" refers to a naturally occurring, folded polypeptide, wherein the polypeptide defines a specific structural unit fused to a repeat unit at the N-terminus or C-terminus, and wherein the polypeptide forms tight tertiary structure interactions with the repeat unit, thereby providing a cap that shields the hydrophobic core of the repeat unit from the solvent on one side. Preferably, the capping module or capping unit is a capping repeat sequence. The term "capping repeat sequence" denotes a capping module or capping unit that has a similar or identical fold to the adjacent repeat unit (or module), and/or sequence similarity to the adjacent repeat unit (or module). Capping modules and capping repeat sequences are described in WO 2002/020565 and in Interlandi et al., 2008 (supra). For example, WO 2002/020565 describes an N-terminal capping module (i.e. a capping repeat sequence) having the following amino acid sequence:

GSDLGKKLLEAARAGQDDEVRILMANGADVNA (SEQ ID NO:1), and

a C-terminal capping module (i.e., a capping repeat sequence) having the following amino acid sequence:

QDKFGKTAFDISIDNGNEDLAEILQKLN (SEQ ID NO:2).

Interlandi et al., 2008 (supra) describe C-terminal capping modules having the following amino acid sequences: QDKFGKTPFDLAIREGHEDIAEVLQKAA (SEQ ID NO:3) and QDKFGKTPFDLAIDNGNEDIAEVLQKAA (SEQ ID NO:4).

For example, the N-terminal capping module of SEQ ID NO. 17 is encoded by amino acids from positions 1-32 and the C-terminal capping module of SEQ ID NO. 17 is encoded by amino acids from positions 99-126.

The N-terminal capping modules of the invention have the amino acid sequence

GSDLGKKLLE AARAGQDDEV RILLKAGADV NA (SEQ ID NO:14) or

GSDLGKKLLE AARAGQDDEV RELLKAGADV NA (SEQ ID NO:15).

It has been found that position 24 in the N-terminal capping module should not be methionine (M). In the sequences SEQ ID NO:14 and SEQ ID NO:15, position 24 is leucine (L). The amino acid at this position may likewise be V, I or A. Preferred is L at position 24, or its replacement by A. Most preferred is L at position 24.

The principle of replacing methionine at position 24 can be applied to a variety of other N-terminal capping modules. Consequently, the subject matter of the invention also includes all N-terminal capping modules that differ from the amino acid sequences SEQ ID NO:14 or SEQ ID NO:15 by up to 9 amino acid substitutions at positions other than position 24. More preferably, such N-terminal capping modules differ by 8 amino acid substitutions, more preferably 7 amino acids, more preferably 6 amino acids, more preferably 5 amino acids, even more preferably 4 amino acids, more preferably 3 amino acids, more preferably 2 amino acids, and most preferably 1 amino acid.

The amino acid substitution may be by any of the 20 most common naturally occurring amino acids, preferably by an amino acid selected from the group consisting of: A, D, E, F, H, I, K, L, M, N, Q, R, S, T, V, W and Y; more preferably by an amino acid selected from the group consisting of: A, D, E, H, I, K, L, Q, R, S, T, V and Y. Also preferably, the amino acid is replaced by a homologous amino acid; i.e., the amino acid is replaced with an amino acid having a side chain with similar biophysical properties. For example, the negatively charged amino acid D may be replaced by the negatively charged amino acid E, or a hydrophobic amino acid such as L may be replaced by A, I or V. The replacement of an amino acid by a homologous amino acid is well known to those skilled in the art.

The amino acid G at position 1 and/or the amino acid S at position 2 of SEQ ID NO:14 or SEQ ID NO:15 can be removed from the N-terminal capping module without any significant effect on the properties. These 2 amino acids serve as linkers connecting the ankyrin repeat domain to other amino acids and proteins. The invention also includes N-terminal capping modules in which G at position 1 and/or S at position 2 are removed. It will be appreciated that the definition of "position 24" herein should be adapted accordingly, to be position 23 (if 1 amino acid is deleted) or 22 (if 2 amino acids are deleted), respectively.
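The position bookkeeping described above, together with the limit of 9 substitutions outside position 24, can be illustrated as follows. This Python sketch is an illustration under stated assumptions and not a definitive implementation: all function names are hypothetical, the number of deleted linker residues is inferred from the sequence length alone, and substitutions are counted only over positions 3 to 32 so that the G/S linker residues are ignored.

```python
SEQ_ID_14 = "GSDLGKKLLEAARAGQDDEVRILLKAGADVNA"
SEQ_ID_15 = "GSDLGKKLLEAARAGQDDEVRELLKAGADVNA"
FULL_LENGTH = 32


def deleted_count(module: str) -> int:
    """Number of missing N-terminal linker residues (0, 1 or 2), inferred from length."""
    missing = FULL_LENGTH - len(module)
    if missing not in (0, 1, 2):
        raise ValueError("expected a module of 30 to 32 residues")
    return missing


def residue_at_position_24(module: str) -> str:
    """Residue corresponding to position 24 of SEQ ID NO:14/15 (position 23 or 22 after deletion)."""
    return module[24 - 1 - deleted_count(module)]


def substitutions_outside_24(module: str, reference: str) -> int:
    """Count differences to the reference over positions 3-32, skipping position 24."""
    deleted_count(module)          # validates the length
    core = module[-30:]            # positions 3-32 of the candidate
    reference_core = reference[2:]  # positions 3-32 of the reference
    index_24 = 24 - 3              # 0-based index of position 24 within the core
    return sum(
        1
        for i, (a, b) in enumerate(zip(core, reference_core))
        if i != index_24 and a != b
    )


def within_substitution_limit(module: str, max_substitutions: int = 9) -> bool:
    """L, V, I or A at position 24 and at most max_substitutions elsewhere."""
    if residue_at_position_24(module) not in "LVIA":
        return False
    fewest = min(substitutions_outside_24(module, ref) for ref in (SEQ_ID_14, SEQ_ID_15))
    return fewest <= max_substitutions
```

With these assumptions, SEQ ID NO:14 lacking G and S still reports L at the residue corresponding to position 24, while SEQ ID NO:1 is rejected because that residue is M.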

Compared to an ankyrin repeat domain having the same amino acid sequence (including the N-terminal capping module) but having M (in place of L, V, I or A) in the N-terminal capping module at the amino acid residue corresponding to position 24 of SEQ ID NO:14 or SEQ ID NO:15, the replacement of methionine at position 24 of the N-terminal capping module confers a higher thermostability, i.e. a higher Tm value in PBS, on the ankyrin repeat domain. Examples of such pairs of ankyrin repeat domains and binding proteins (M at position 24 versus L, V, I or A at position 24) and their Tm values are described in the examples and shown in the figures. Preferred are N-terminal capping modules wherein the replacement of M at position 24 by another amino acid results in an increase in the Tm of the ankyrin repeat domain carrying such an N-terminal capping module of at least 1 °C, preferably at least 2 °C, more preferably at least 3 °C, or most preferably at least 4 °C.

Using a fluorescence-based thermal stability assay (Niesen, F.H., Nature Protocols 2(9), 2212-2221, 2007), it is possible to analyze the thermostability of proteins, in particular of ankyrin repeat domains. In this assay, the temperature at which the protein unfolds is measured by the increase in fluorescence of a dye having affinity for the hydrophobic portions of the protein that become exposed as the protein unfolds. The temperature at the midpoint of the fluorescence transition thus obtained (from lower to higher fluorescence intensity) then corresponds to the midpoint denaturation temperature (Tm) of the analyzed protein. Alternatively, the thermal stability of the protein can be analyzed by CD spectroscopy; i.e. its thermal denaturation is measured by following its circular dichroism (CD) signal at 222 nm using techniques well known to those skilled in the art.
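The midpoint reading described above can be illustrated with a small calculation. The Python sketch below is a minimal illustration and not the evaluation procedure of the cited protocol: the function name estimate_tm is hypothetical, the signal is assumed to rise with temperature (as for the dye fluorescence, or for a CD trace already converted to fraction unfolded), the lowest and highest signal values are taken as the two baselines, and the midpoint is located by linear interpolation between neighboring data points rather than by curve fitting.

```python
from typing import Sequence


def estimate_tm(temperatures: Sequence[float], signal: Sequence[float]) -> float:
    """Temperature at which a rising unfolding signal first reaches half of its total change."""
    if len(temperatures) != len(signal) or len(signal) < 2:
        raise ValueError("need two equally long series with at least two points")
    low, high = min(signal), max(signal)
    if high == low:
        raise ValueError("signal shows no transition")
    half = (low + high) / 2.0
    for i in range(1, len(signal)):
        s0, s1 = signal[i - 1], signal[i]
        if s0 < half <= s1:  # first upward crossing of the half-maximal level
            t0, t1 = temperatures[i - 1], temperatures[i]
            # Linear interpolation between the two neighboring data points
            return t0 + (half - s0) * (t1 - t0) / (s1 - s0)
    raise ValueError("no upward crossing of the midpoint found")
```

For real traces, fitting a two-state unfolding model to the whole curve would usually be more robust than this two-point interpolation.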

In one embodiment, when at most 9 amino acids of SEQ ID NO. 14 or SEQ ID NO. 15 at positions other than position 24 are optionally substituted with other amino acids, preferably the amino acid residue A at position 26 of SEQ ID NO. 14 or SEQ ID NO. 15 is substituted with H, Y, K or R. More preferably, however, the amino acid residue A at position 26 is not substituted.

In another embodiment, when at most 9 amino acids of SEQ ID NO. 14 or SEQ ID NO. 15 at positions other than position 24 are optionally substituted with other amino acids, preferably, the amino acid residue R at position 21 of SEQ ID NO. 14 or SEQ ID NO. 15 is substituted with E. More preferably, however, the amino acid residue R at position 21 is not substituted.

In another embodiment, when at most 9 amino acids of SEQ ID NO. 14 or SEQ ID NO. 15 at positions other than position 24 are optionally substituted with other amino acids, preferably amino acid residue I at position 22 of SEQ ID NO. 14 or amino acid residue E at position 22 of SEQ ID NO. 15 is substituted with V. More preferably, however, the amino acid residue I or E at position 22, respectively, is not substituted; see, for example, the compound pairs shown in figure 2.

In another embodiment, when up to 9 amino acids of SEQ ID NO. 14 or SEQ ID NO. 15 at positions other than position 24 are optionally substituted with other amino acids, preferably the amino acid residue K at position 25 of SEQ ID NO. 14 or SEQ ID NO. 15 is substituted with A or E. More preferably, however, the amino acid residue at position 25, K, is not substituted, or is substituted with A, as shown in the compound pair shown in FIG. 1.

In another embodiment, when at most 9 of the amino acids of SEQ ID NO. 14 or SEQ ID NO. 15 at positions other than position 24 are optionally substituted with other amino acids, preferably the amino acid residues RILLKA at positions 21-26 of SEQ ID NO. 14 or the amino acid residues RELLKA at positions 21-26 of SEQ ID NO. 15 are not substituted.

Another preferred N-terminal capping module comprises the following sequence motif

GSX₁LX₂KKLLE AARAGQDDEV X₃X₄LX₅X₆X₇GADV NA (SEQ ID NO:5), wherein

G at position 1 and/or S at position 2 are optionally deleted;

X₁ represents amino acid residue G, A or D; preferably, A or D;

X₂ represents amino acid residue G or D;

X₃ represents amino acid residue R or E;

X₄ represents amino acid residue I, E or V; preferably, I or E;

X₅ represents amino acid residue L, V, I or A; preferably, L or A;

X₆ represents amino acid residue A, K or E; preferably, A or K; and

X₇ represents an amino acid residue selected from: A, H, Y, K and R; preferably, A or H.
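As an illustration only, the sequence motif of SEQ ID NO:5 can be written as a regular expression and used to check candidate capping modules. This is a sketch under the assumption that a plain string match is sufficient; the names `N_CAP_MOTIF` and `matches_n_cap_motif` are hypothetical and not part of the disclosure.

```python
import re

# SEQ ID NO:5 as a regular expression: optional G and S, then
# X1 [GAD], L, X2 [GD], the fixed stretch KKLLEAARAGQDDEV,
# X3 [RE], X4 [IEV], L, X5 [LVIA], X6 [AKE], X7 [AHYKR], then GADVNA.
N_CAP_MOTIF = re.compile(
    r"G?S?[GAD]L[GD]KKLLEAARAGQDDEV[RE][IEV]L[LVIA][AKE][AHYKR]GADVNA"
)


def matches_n_cap_motif(sequence):
    """Return True if the whole sequence (spaces ignored) fits the SEQ ID NO:5 motif."""
    return N_CAP_MOTIF.fullmatch(sequence.replace(" ", "")) is not None


# Sequences as written in the description.
SEQ_ID_14 = "GSDLGKKLLE AARAGQDDEV RILLKAGADV NA"
SEQ_ID_15 = "GSDLGKKLLE AARAGQDDEV RELLKAGADV NA"
# Hypothetical comparison sequence carrying M at position 24, which the motif excludes.
WITH_M_AT_24 = "GSDLGKKLLE AARAGQDDEV RILMKAGADV NA"
```

Both SEQ ID NO:14 and SEQ ID NO:15 fit the motif, with or without the leading G and S, whereas a module with M at the position of X₅ does not.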

In another embodiment, the N-terminal capping module comprises the following sequence:

X₁LX₂KKLLEAARAGQDDEVRILX₃AX₄GADVNA (SEQ ID NO:13)

wherein X₁ represents amino acid residue G, A or D;

wherein X₂ represents amino acid residue G or D;

wherein X₃ represents amino acid residue L, V, I or A; preferably, L; and

wherein X₄ represents amino acid residue A, H, Y, K, R or N; preferably, A or N.

Most preferred are binding proteins comprising at least one ankyrin repeat domain, wherein the ankyrin repeat domain comprises an N-terminal capping module having the amino acid sequence:

GSDLGKKLLE AARAGQDDEV RILLKAGADV NA (SEQ ID NO:14) or

GSDLGKKLLE AARAGQDDEV RELLKAGADV NA (SEQ ID NO:15), wherein

G at position 1 and/or S at position 2 of SEQ ID NO:14 and SEQ ID NO:15 are optionally deleted.

Preferably, the amino acid sequence of the corresponding N-terminal capping module has at least 75% amino acid sequence identity, more preferably 80% amino acid sequence identity, even more preferably 85% amino acid sequence identity, more preferably 90% amino acid sequence identity and most preferably 95% amino acid sequence identity to SEQ ID No. 14 or SEQ ID No. 15, always under the following conditions: the amino acid residue at position 24 in the amino acid sequence of the N-terminal capping module is L, V, I or A, more preferably L or A, most preferably L.
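The identity criterion combined with the condition on position 24 can be sketched as follows. This sketch assumes an ungapped, position-by-position comparison of two sequences of equal length; a real comparison would normally rely on a sequence alignment. The function names are hypothetical.

```python
SEQ_ID_14 = "GSDLGKKLLEAARAGQDDEVRILLKAGADVNA"
SEQ_ID_15 = "GSDLGKKLLEAARAGQDDEVRELLKAGADVNA"


def percent_identity(candidate, reference):
    """Percentage of identical residues in an ungapped comparison of equal-length sequences."""
    if len(candidate) != len(reference):
        raise ValueError("this simplified comparison requires equal lengths")
    matches = sum(a == b for a, b in zip(candidate, reference))
    return 100.0 * matches / len(reference)


def meets_criterion(candidate, reference, min_identity=75.0):
    """Identity threshold plus the condition that position 24 is L, V, I or A."""
    return (
        percent_identity(candidate, reference) >= min_identity
        and candidate[24 - 1] in "LVIA"
    )


# Hypothetical candidate: SEQ ID NO:14 with M at position 24.
candidate_with_m = SEQ_ID_14[:23] + "M" + SEQ_ID_14[24:]
```

The candidate with M at position 24 is highly identical to SEQ ID NO:14 but fails the criterion because of the condition on position 24.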

In specific embodiments, the N-terminal capping module has a specified percentage of amino acid sequence identity to SEQ ID No. 14 or SEQ ID No. 15 and has amino acid residue A, H, Y, K or R at position 26 and/or amino acid residue R or E at position 21, always under the following condition: the amino acid residue at position 24 in the amino acid sequence of the N-terminal capping module is L, V, I or A.

Further preferred is any such N-terminal capping module comprising an N-terminal capping repeat sequence, wherein one or more amino acid residues in the capping repeat sequence are replaced with a specific amino acid residue that is present at the corresponding position when the corresponding capping unit or repeat unit is aligned.

The binding proteins of the invention comprise at least one ankyrin repeat domain, wherein said ankyrin repeat domain comprises an N-terminal capping module as defined herein, which ankyrin repeat domain may additionally comprise one of the preferred C-terminal capping modules described below. A preferred C-terminal capping module comprises the following sequence motif

X₁DKX₂GKTX₃X₄D X₅X₆X₇DX₈GX₉EDX₁₀AEX₁₁LQKAA (SEQ ID NO:6), wherein

X₁ represents amino acid residue Q or K;

X₂ represents amino acid residue A, S or F; preferably, S or F;

X₃ represents amino acid residue A or P;

X₄ represents amino acid residue A or F;

X₅ represents amino acid residue I or L;

X₆ represents amino acid residue S or A;

X₇ represents amino acid residue I or A;

X₈ represents amino acid residue A, E or N; preferably, A or N;

X₉ represents amino acid residue N or H;

X₁₀ represents amino acid residue L or I;

X₁₁ represents amino acid residue I or V; and

wherein if X₄ represents F, X₇ represents I, and X₈ represents N or E, then X₂ does not represent F.
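The proviso at the end of the SEQ ID NO:6 motif is a conditional exclusion that a simple character-class pattern cannot express on its own. A minimal sketch, assuming a plain string comparison and using hypothetical names, captures the four residues involved as named groups and applies the proviso afterwards.

```python
import re

# SEQ ID NO:6 as a regular expression; X2, X4, X7 and X8 are captured
# because the proviso depends on their combination.
C_CAP_MOTIF = re.compile(
    r"[QK]DK(?P<x2>[ASF])GKT[AP](?P<x4>[AF])D"
    r"[IL][SA](?P<x7>[IA])D(?P<x8>[AEN])G[NH]ED[LI]AE[IV]LQKAA"
)


def matches_c_cap_motif(sequence):
    """Return True if the sequence fits SEQ ID NO:6 including its proviso."""
    match = C_CAP_MOTIF.fullmatch(sequence.replace(" ", ""))
    if match is None:
        return False
    # Proviso: if X4 is F, X7 is I and X8 is N or E, then X2 must not be F.
    proviso_triggered = (
        match.group("x4") == "F"
        and match.group("x7") == "I"
        and match.group("x8") in "NE"
    )
    return not (proviso_triggered and match.group("x2") == "F")


# SEQ ID NO:9 as written in the description.
SEQ_ID_9 = "QDKSGKTPADLAADAGHEDIAEVLQKAA"
# Hypothetical test sequences that differ only in X2.
EXCLUDED_EXAMPLE = "QDKFGKTPFDLSIDNGNEDLAEILQKAA"
ALLOWED_EXAMPLE = "QDKSGKTPFDLSIDNGNEDLAEILQKAA"
```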

Another preferred C-terminal capping module comprises the following sequence motif

X₁DKX₂GKTX₃AD X₄X₅X₆DX₇GX₈EDX₉AEX₁₀LQKAA (SEQ ID NO:7), wherein

X₁ represents amino acid residue Q or K;

X₂ represents amino acid residue A, S or F; preferably, S or F;

X₃ represents amino acid residue A or P;

X₄ represents amino acid residue I or L;

X₅ represents amino acid residue S or A;

X₆ represents amino acid residue I or A;

X₇ represents amino acid residue A, E or N; preferably, A or N;

X₈ represents amino acid residue N or H;

X₉ represents amino acid residue L or I; and

X₁₀ represents amino acid residue I or V.

Another preferred C-terminal capping module comprises the following sequence motif

X₁DKX₂GKTX₃AD X₄X₅ADX₆GX₇EDX₈AEX₉LQKAA (SEQ ID NO:8), wherein

X₁ represents amino acid residue Q or K;

X₂ represents amino acid residue A, S or F; preferably, S or F;

X₃ represents amino acid residue A or P;

X₄ represents amino acid residue I or L;

X₅ represents amino acid residue S or A;

X₆ represents amino acid residue A, E or N; preferably, A or N;

X₇ represents amino acid residue N or H;

X₈ represents amino acid residue L or I; and

X₉ represents amino acid residue I or V.

Preferably, such a C-terminal capping module comprising a sequence motif of SEQ ID NO 6, 7 or 8 has amino acid residue A, I or K at a position corresponding to position 3 of said sequence motif; preferably, I or K.

Also preferably, such a C-terminal capping module comprising a sequence motif of SEQ ID NO 6, 7 or 8 has an amino acid residue R or D at a position corresponding to position 14 of said sequence motif.

A preferred C-terminal capping module is one having the following amino acid sequence: QDKSGKTPADLAADAGHEDIAEVLQKAA (SEQ ID NO: 9).

The invention further relates to a binding protein comprising at least one ankyrin repeat domain, wherein said ankyrin repeat domain comprises a C-terminal capping module having the following amino acid sequence: SEQ ID NO 6, 7 or 8 as defined above.

The ankyrin repeat domain of the invention may be genetically constructed as follows: by means of gene synthesis, an N-terminal capping module (i.e. the N-terminal capping repeat of SEQ ID NO:14) is assembled, followed by one or more repeat modules (i.e. the repeat modules comprising amino acid residues 33-98 of SEQ ID NO:17) and a C-terminal capping module (i.e. the C-terminal capping repeat of SEQ ID NO:9). The genetically assembled repeat domain gene can then be expressed in E. coli as described above.

Further preferred are binding proteins, repeat domains, N-terminal capping modules or C-terminal capping modules having an amino acid sequence that does not contain amino acid C, M or N.

Further preferred are binding proteins, repeat domains, N-terminal capping modules or C-terminal capping modules having an amino acid sequence that does not contain the amino acid N followed by G.
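The two sequence preferences above (no C, M or N, and no N directly followed by G) amount to simple checks on the amino acid sequence. The following sketch uses hypothetical function names and only illustrates the checks; it makes no statement about why these residues are avoided.

```python
SEQ_ID_9 = "QDKSGKTPADLAADAGHEDIAEVLQKAA"
SEQ_ID_14 = "GSDLGKKLLEAARAGQDDEVRILLKAGADVNA"


def contains_c_m_or_n(sequence):
    """Return True if the sequence contains cysteine, methionine or asparagine."""
    return any(residue in "CMN" for residue in sequence)


def contains_n_followed_by_g(sequence):
    """Return True if an N residue is directly followed by a G residue."""
    return "NG" in sequence
```

Applied to the sequences given in this description, SEQ ID NO:9 passes both checks, while SEQ ID NO:14 contains an N (in GADVNA) but no N followed by G.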

Further preferred are non-naturally occurring capping modules, repeating modules, binding proteins or binding domains.

The term "non-naturally occurring" means synthetic or not from natural sources; more specifically, the term means man-made. The term "non-naturally occurring binding protein" or "non-naturally occurring binding domain" means that the binding protein or the binding domain is synthetic (i.e., prepared from amino acids by chemical synthesis) or recombinant and is not from nature. A "non-naturally occurring binding protein" or "non-naturally occurring binding domain" is an artificial protein or domain, respectively, obtained by expression of a correspondingly designed nucleic acid. Preferably, expression is carried out in eukaryotic or bacterial cells, or by using a cell-free in vitro expression system. Furthermore, the term means that the sequence of said binding protein or said binding domain is not present as a non-artificial sequence entry in a sequence database, such as GenBank, EMBL-Bank or Swiss-Prot. These databases and other similar sequence databases are well known to those skilled in the art.

The term "PBS" means an aqueous phosphate buffered solution containing 137 mM NaCl, 10 mM phosphate, and 2.7 mM KCl, and having a pH of 7.4.

In a specific embodiment, the invention relates to a binding protein comprising an ankyrin repeat domain comprising an N-terminal capping module according to the invention and comprising a biologically active compound.

The term "biologically active compound" means a compound that, when administered to a mammal having a disease, alters the disease. The bioactive compound may have antagonistic or agonistic properties and may be a proteinaceous bioactive compound or a non-proteinaceous bioactive compound.

Such proteinaceous bioactive compounds can be covalently linked, for example, to the ankyrin repeat domain of the present invention by preparing gene fusion polypeptides using standard DNA cloning techniques, followed by standard expression and purification.

Such non-proteinaceous bioactive compounds can be covalently linked, for example, to the ankyrin repeat domain of the present invention by chemical means, e.g., via maleimide linker coupling to cysteine thiol, wherein cysteine is coupled via a peptide linker to the N-or C-terminus of the binding domain described herein.

Examples of proteinaceous bioactive compounds are binding domains with a distinct target specificity (e.g. neutralizing a growth factor by binding to it), cytokines (e.g. interleukins), growth factors (e.g. human growth hormone), antibodies and fragments thereof, hormones (e.g. GLP-1) and any possible proteinaceous drugs.

Examples of non-proteinaceous biologically active compounds are toxins (e.g. DM1 from ImmunoGen), small molecules targeting GPCRs, antibiotics and any possible non-proteinaceous drugs.

Another preferred embodiment is a recombinant binding protein comprising a binding domain, wherein the binding domain is an ankyrin repeat domain or a designed ankyrin repeat domain. Such ankyrin repeat domains may comprise 1, 2, 3 or more internal repeat modules that participate in binding to the target. Such ankyrin repeat domains comprise an N-terminal capping module, 2-4 internal repeat modules and a C-terminal capping module as defined herein. Preferably, the binding domain is an ankyrin repeat domain or designed ankyrin repeat domain.

Preferably, the binding protein as defined above, wherein said ankyrin repeat domain or said designed ankyrin repeat domain comprises a repeat module having the following ankyrin repeat sequence motif:

X₁DX₂X₃GX₄TPLHLAAX₅X₆GHLEIVEVLLKX₇GADVNA (SEQ ID NO:10)

wherein X₁, X₂, X₃, X₄, X₅, X₆ and X₇ independently of one another represent an amino acid residue selected from: A, D, E, F, H, I, K, L, M, N, Q, R, S, T, V, W and Y; preferably,

X₁ represents an amino acid residue selected from: A, D, M, F, S, I, T, N, Y and K; more preferably K and A; and

X₇ represents an amino acid residue selected from: S, A, Y, H and N; more preferably Y or H.

In other embodiments, any of the binding proteins or domains described herein can be covalently bound to one or more additional moieties including, for example, moieties that bind different targets to produce bispecific binders, biologically active compounds, tagging moieties (e.g., fluorescent tags such as fluorescein, or radiotracers), moieties that facilitate protein purification (e.g., small peptide tags such as His- or Strep-tags), moieties that provide effector function for improved therapeutic effect (e.g., antibody Fc moieties that provide antibody-dependent cell-mediated cytotoxicity, toxic protein moieties such as Pseudomonas aeruginosa exotoxin A (ETA), or small-molecule toxic agents such as a maytansinoid or a DNA alkylating agent) or moieties that provide improved pharmacokinetics. Improved pharmacokinetics can be assessed according to the perceived therapeutic need. It is often desirable to increase bioavailability and/or to increase the time between administrations, which may be achieved by increasing the time the protein remains available in serum after administration. In some cases, it is desirable to improve the continuity of the protein serum concentration over time (e.g., to reduce the difference between the serum concentration shortly after administration and the concentration shortly before the next administration).

In another embodiment, the invention relates to nucleic acid molecules encoding specific binding proteins, specific ankyrin repeat domains and specific N-terminal capping modules. Furthermore, it relates to vectors comprising said nucleic acid molecules.

Furthermore, it relates to a pharmaceutical composition comprising one or more of the above-described binding proteins comprising ankyrin repeat domains, or a nucleic acid molecule encoding a particular binding protein, and optionally a pharmaceutically acceptable carrier and/or diluent. Pharmaceutically acceptable carriers and/or diluents are known to those skilled in the art and will be explained in more detail below. Still further, it relates to diagnostic compositions comprising one or more of the above-described binding proteins, in particular binding proteins comprising ankyrin repeat domains.

Pharmaceutical compositions comprise a binding protein as described above and a pharmaceutically acceptable carrier, excipient or stabilizer, e.g. as described in Remington's Pharmaceutical Sciences, 16th edition, Osol, A. Ed. [1980]. Suitable carriers, excipients or stabilizers known to the skilled worker are saline, Ringer's solution, dextrose solution, Hank's solution, fixed oils, ethyl oleate, 5% dextrose in saline, substances which improve isotonicity and chemical stability, buffers and preservatives. Other suitable carriers include any carrier that does not itself cause the production of antibodies harmful to the individual receiving the composition, such as proteins, polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids, and amino acid copolymers. The pharmaceutical composition may also be a combined preparation comprising an additional active agent, such as an anti-cancer agent or an anti-angiogenic agent.

Formulations to be used for in vivo administration must be sterile or aseptic. This is easily accomplished by filtration with sterile filtration membranes.

The pharmaceutical composition may be administered by any suitable method within the knowledge of the skilled person. The preferred route of administration is parenteral. In parenteral administration, the medicaments of the invention are formulated in unit dose injectable forms, such as solutions, suspensions or emulsions, in combination with pharmaceutically acceptable excipients as defined above. The dosage and mode of administration will depend on the individual to be treated and the particular disease. In general, the pharmaceutical composition is administered such that the binding protein of the invention is administered at a dose of 1 μg/kg to 20 mg/kg, more preferably 10 μg/kg to 5 mg/kg, most preferably 0.1 to 2 mg/kg. Preferably it is administered as a bolus dose. Continuous infusion, including continuous subcutaneous delivery via osmotic mini-pumps, may also be used. If so, the pharmaceutical composition may be infused at a dose of 5 to 20 μg/kg/min, more preferably 7 to 15 μg/kg/min.

Furthermore, any of the above pharmaceutical compositions is contemplated for use in the treatment of a disorder. The invention additionally provides methods of treatment. The method comprises the following steps: administering to a patient in need thereof a therapeutically effective amount of a binding protein of the invention.

Further, a method of treating a pathological condition in a mammal (including a human) is contemplated, the method comprising: administering to a patient in need thereof an effective amount of the above-described pharmaceutical composition.

The binding proteins according to the invention can be obtained and/or further evolved by several methods, such as display on the surface of bacteriophages (WO1990/002809, WO 2007/006665) or bacterial cells (WO 1993/010214), ribosome display (WO1998/048008), plasmid display (WO 1993/008278), or by using covalent RNA-repeat protein hybrid constructs (WO 2000/032823), or intracellular expression and selection/screening, for example by protein complementation assays (WO 1998/341120). Such methods are known to those skilled in the art.

The ankyrin repeat protein library used for selecting/screening binding proteins according to the invention can be obtained according to protocols known to the person skilled in the art (WO 2002/020565; Binz, H.K., et al., J. Mol. Biol. 332, 489-). The repeat domains of the invention can be assembled in a modular fashion from the repeat modules of the invention and appropriate capping modules or capping repeats (Forrer, P., et al., FEBS Letters 539, 2-6, 2003) using standard recombinant DNA techniques (e.g. WO 2002/020565; Binz et al., 2003, supra; and Binz et al., 2004, supra).

The invention is not limited to the specific embodiments described in the examples. Other sources may be used and processed in accordance with the summary described below.

Examples

All starting materials and reagents disclosed below are known to those skilled in the art and are either commercially available or can be prepared using well known techniques.

Materials

Chemicals were purchased from Fluka (Switzerland). Oligonucleotides were from Microsynth (Switzerland). Unless otherwise indicated, DNA polymerases, restriction enzymes and buffers were from New England Biolabs (USA) or Fermentas (Lithuania). The cloning and protein production strains were E. coli XL1-Blue (Stratagene, USA) or BL21 (Novagen, USA).

Molecular biology

Unless otherwise indicated, the methods were performed according to the protocol (Sambrook J., Fritsch E.F., and Maniatis T., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory 1989, New York).

DARPins used in the examples

DARPin #17 (SEQ ID NO:17 with His-tag fused to its N-terminus (SEQ ID NO: 16));

DARPin #18 (SEQ ID NO:18 with His-tag fused to its N-terminus (SEQ ID NO: 16));

DARPin #19 (SEQ ID NO:19 with His-tag fused to its N-terminus (SEQ ID NO: 16));

DARPin #20 (SEQ ID NO:20 with His-tag fused to its N-terminus (SEQ ID NO: 16));

DARPin #21 (SEQ ID NO:21 with His-tag fused to its N-terminus (SEQ ID NO: 16));

DARPin #22 (SEQ ID NO:22 with His-tag fused to its N-terminus (SEQ ID NO: 16));

DARPin #23 (SEQ ID NO:23 with His-tag fused to its N-terminus (SEQ ID NO: 16));

DARPin #24 (SEQ ID NO:24 with His-tag fused to its N-terminus (SEQ ID NO: 16));

DARPin #25 (SEQ ID NO:25 with His-tag fused to its N-terminus (SEQ ID NO: 16));

DARPin #26 (SEQ ID NO:26 with His-tag fused to its N-terminus (SEQ ID NO: 16));

DARPin #27 (SEQ ID NO:27 with His-tag fused to its N-terminus (SEQ ID NO: 16));

DARPin #28 (SEQ ID NO:28 with His-tag fused to its N-terminus (SEQ ID NO: 16));

DARPin #29 (SEQ ID NO:29 with His-tag fused to its N-terminus (SEQ ID NO: 16));

DARPin #30 (SEQ ID NO:30 with His-tag fused to its N-terminus (SEQ ID NO: 16));

DARPin #31 (SEQ ID NO:31 with His-tag fused to its N-terminus (SEQ ID NO: 16));

DARPin #32 (SEQ ID NO:32 with His-tag fused to its N-terminus (SEQ ID NO: 16)).

Designed ankyrin repeat protein library

The N2C and N3C designed ankyrin repeat protein libraries have been described (WO 2002/020565; Binz et al. 2003, supra; Binz et al. 2004, supra). The numbers in N2C and N3C denote the number of randomized repeat modules present between the N- and C-terminal capping modules. The nomenclature used to define the positions within the repeat units and modules is based on Binz et al. 2004, supra, modified by shifting the borders of the ankyrin repeat modules and ankyrin repeat units by one amino acid position. For example, position 1 of an ankyrin repeat module of Binz et al. 2004 (supra) corresponds to position 2 of an ankyrin repeat module of the present disclosure, and consequently position 33 of an ankyrin repeat module of Binz et al. 2004 (supra) corresponds to position 1 of the following ankyrin repeat module of the present disclosure.
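The shift in nomenclature can be written as a small mapping. This is an illustrative sketch only: it assumes a repeat module length of 33 residues, as implied by the example above, and the function name and the (module index, position) representation are hypothetical.

```python
MODULE_LENGTH = 33  # residues per ankyrin repeat module, as implied by the example above


def binz2004_to_present(module_index, position):
    """Convert (module index, position) in the Binz et al. 2004 numbering to the
    numbering of the present disclosure, which shifts the module border by one residue."""
    if not 1 <= position <= MODULE_LENGTH:
        raise ValueError("position must lie between 1 and 33")
    if position == MODULE_LENGTH:
        # The last residue of a module in the older numbering becomes
        # the first residue of the following module.
        return module_index + 1, 1
    return module_index, position + 1
```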

All DNA sequences were confirmed by sequencing and the calculated molecular weights of all the proteins were confirmed by mass spectrometry.

Example 1: Construction, expression and purification of DARPins

DARPins with defined amino acid sequences can be prepared as follows: the corresponding reverse-translated nucleic acid sequence is produced by gene synthesis and subcloned into an expression system (e.g. an Escherichia coli expression system), and the protein is expressed and purified. Such methods are known to those skilled in the art.

Replacement of capping modules/capping repeats

The N-or C-terminal capping repeat sequence of the ankyrin repeat domain can be replaced by an N-or C-terminal capping repeat sequence of the invention by combining techniques known to those skilled in the art (such as amino acid sequence alignment, mutagenesis and gene synthesis).

For example, the N-terminal capping repeat of SEQ ID NO:17 can be replaced with the N-terminal capping repeat of SEQ ID NO:14 as follows: (i) determining the N-terminal capping repeat of SEQ ID NO:17 (i.e., sequence positions 1-32) by sequence alignment with SEQ ID NO:14, (ii) replacing the sequence of the determined N-terminal capping repeat of SEQ ID NO:17 with the sequence of SEQ ID NO:14 to give SEQ ID NO:18, (iii) preparing a gene encoding the repeat domain with the replaced N-terminal capping repeat (i.e., SEQ ID NO:18), (iv) expressing the modified repeat domain in the cytoplasm of E. coli, and (v) purifying the modified repeat domain by standard means.
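Steps (i) and (ii) of this procedure can be sketched at the level of the amino acid sequence. The sequence of SEQ ID NO:17 is not reproduced in this section, so the sketch uses a clearly hypothetical stand-in domain: an N-terminal cap equal to SEQ ID NO:14 but with M at position 24, 66 placeholder residues ("X") for positions 33-98, and the C-terminal cap of SEQ ID NO:9. The function name `replace_n_cap` is likewise an assumption made for illustration.

```python
SEQ_ID_14 = "GSDLGKKLLEAARAGQDDEVRILLKAGADVNA"
SEQ_ID_9 = "QDKSGKTPADLAADAGHEDIAEVLQKAA"
N_CAP_LENGTH = 32  # sequence positions 1-32, as stated in step (i)


def replace_n_cap(domain, new_cap, cap_length=N_CAP_LENGTH):
    """Swap the first cap_length residues of a repeat domain for new_cap."""
    if len(new_cap) != cap_length:
        raise ValueError("replacement cap must have the same length as the old cap")
    if len(domain) <= cap_length:
        raise ValueError("domain is too short to contain a capping repeat")
    return new_cap + domain[cap_length:]


# HYPOTHETICAL stand-in for a repeat domain; not the actual SEQ ID NO:17.
old_cap = "GSDLGKKLLEAARAGQDDEVRILMKAGADVNA"   # M at position 24 (assumed)
placeholder_repeats = "X" * 66                  # stands in for positions 33-98
toy_domain = old_cap + placeholder_repeats + SEQ_ID_9

modified_domain = replace_n_cap(toy_domain, SEQ_ID_14)
```

The modified domain keeps its length and everything after position 32; only the capping repeat changes, which here turns M at position 24 into L. Steps (iii) to (v) are laboratory steps and are not modeled.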

As another example, the C-terminal capping repeat sequence of SEQ ID NO:17 can be replaced with the C-terminal capping repeat sequence of SEQ ID NO:9 as follows: (i) determining the sequence of the C-terminal capping repeat of SEQ ID NO:17 (i.e., sequence positions 99-126) by sequence alignment with SEQ ID NO:9, (ii) replacing the sequence of the determined C-terminal capping repeat of SEQ ID NO:17 with the sequence of SEQ ID NO:9, (iii) preparing a gene encoding a repeat domain encoding the replaced C-terminal capping module, (iv) expressing the modified repeat domain in the cytoplasm of E.coli, and (v) purifying the modified repeat domain by standard means.

High-level and soluble expression of DARPins

DARPins were expressed in Escherichia coli BL21 or XL1-Blue cells and purified via their His-tag using standard protocols. 25 ml of a stationary overnight culture (LB, 1% glucose, 100 mg/l ampicillin; 37 ℃) were used to inoculate 1 l of culture (same medium). At an absorbance of about 1 at 600 nm, the culture was induced with 0.5 mM IPTG and incubated at 37 ℃ for 4 hours. The culture was centrifuged and the resulting pellet was resuspended in 40 ml TBS500 (50 mM Tris-HCl, 500 mM NaCl, pH 8) and sonicated. The lysate was centrifuged again, and glycerol (10% (v/v) final concentration) and imidazole (20 mM final concentration) were added to the resulting supernatant. The protein was purified using a Ni-nitrilotriacetic acid column (2.5 ml column volume) according to the manufacturer's instructions (QIAgen, Germany). Alternatively, DARPins or ankyrin repeat domains without a 6xHis-tag were purified by anion exchange chromatography followed by size exclusion chromatography using standard resins and protocols known to those skilled in the art. Up to 200 mg of highly soluble DARPin can be purified from 1 liter of E. coli culture, with a purity of > 95% as estimated by SDS-15% PAGE. The DARPins thus purified were used for further characterization.

Example 2: Higher thermal stability of DARPins with improved N-terminal capping modules

Using a fluorescence-based thermal stability assay (Niesen, F.H., Nature Protocols 2(9), 2212-2221, 2007), the thermal stability of the purified DARPins (according to Example 1) was analyzed. The temperature at which the protein (i.e. such a DARPin) unfolds is measured by the increase in fluorescence of a dye (e.g., SYPRO Orange; Invitrogen, cat # S6650) having affinity for the hydrophobic parts of the protein, which become exposed as the protein unfolds. The temperature at the midpoint of the fluorescence transition thus obtained (from lower to higher fluorescence intensity) then corresponds to the midpoint denaturation temperature (Tm) of the analyzed protein. Alternatively, such purified DARPins were analyzed for thermal stability by CD spectroscopy; i.e. their thermal denaturation was measured by following their circular dichroism (CD) signal at 222 nm by techniques well known to those skilled in the art.

Fluorescence-based thermal stability assay

Thermal denaturation of DARPins was measured using a real-time PCR instrument, i.e. a C1000 thermal cycler (Bio-Rad) combined with a CFX96 optical system (Bio-Rad), using SYPRO Orange as a fluorescent dye. DARPins were prepared at a concentration of 50 μM in PBS at pH 7.4 containing 1x SYPRO Orange (diluted from a 5,000x SYPRO Orange stock solution, Invitrogen) or in MES buffer at pH 5.8 (250 mM 2-(N-morpholino)ethanesulfonic acid pH 5.5, 150 mM NaCl, mixed 1:4 (v/v) with PBS at pH 7.4 and adjusted to pH 5.8), and 50 μl of such protein solution or of buffer alone was added to a white 96-well PCR plate (Bio-Rad). The plates were sealed with Microseal 'B' Adhesive Seals (Bio-Rad) and heated in the real-time PCR instrument from 20 ℃ to 95 ℃ in 0.5 ℃ increments, including a 25 second hold step after each temperature increment, and the thermal denaturation of the DARPins was followed by measuring the relative fluorescence units of the samples at each temperature increment. The relative fluorescence units in the wells of the plate were measured using channel 2 of the real-time PCR instrument (i.e., excitation at 515-535 nm and detection at 560-580 nm), and the corresponding values obtained with buffer only were subtracted. From the midpoint of the resulting thermal denaturation transition, the Tm value of the analyzed DARPin can be determined.
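The evaluation described above (buffer subtraction followed by reading off the transition midpoint) can be sketched as follows. This is not the instrument software's method; it is a simplified illustration that assumes a single rising transition and takes the midpoint as the temperature at which the buffer-corrected signal crosses halfway between its minimum and its maximum, using linear interpolation between readings. The function name and the synthetic data are assumptions made for illustration.

```python
import math


def estimate_tm(temperatures, sample_rfu, buffer_rfu):
    """Estimate Tm as the half-height crossing of the buffer-corrected melting curve."""
    corrected = [s - b for s, b in zip(sample_rfu, buffer_rfu)]
    i_low = corrected.index(min(corrected))
    i_high = corrected.index(max(corrected))
    if i_low >= i_high:
        raise ValueError("no rising transition found")
    half = (corrected[i_low] + corrected[i_high]) / 2.0
    # Walk through the rising part and interpolate where the signal crosses half height.
    for i in range(i_low, i_high):
        y0, y1 = corrected[i], corrected[i + 1]
        if y0 <= half <= y1 and y1 > y0:
            fraction = (half - y0) / (y1 - y0)
            return temperatures[i] + fraction * (temperatures[i + 1] - temperatures[i])
    raise ValueError("midpoint not bracketed by the data")


# SYNTHETIC data for illustration only: 20 to 95 degrees C in 0.5 degree steps,
# a sigmoidal transition centered at 66 degrees C and a constant buffer signal.
temperatures = [20.0 + 0.5 * i for i in range(151)]
buffer_rfu = [50.0] * len(temperatures)
sample_rfu = [50.0 + 1000.0 / (1.0 + math.exp(-(t - 66.0) / 2.0)) for t in temperatures]
tm_estimate = estimate_tm(temperatures, sample_rfu, buffer_rfu)
```

Real SYPRO Orange traces often decline again after the peak; restricting the search to the span between the minimum and the maximum, as done here, is one simple way to ignore that region.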

Thermal stability determination based on CD spectroscopy

The CD signal of DARPin was recorded at 222 nm in a Jasco J-715 instrument (Jasco, Japan) while the protein (in PBS pH 7.4) at a concentration of 0.02 mM was slowly heated from 20 ℃ to 95 ℃ using a temperature increment of 1 ℃ or 2 ℃ per minute. This is an effective way to track the denaturation of darpins, since darpins are mainly composed of alpha helices that show strong changes in their CD signal at 222 nm after unfolding. The midpoint of the transition of the thus measured CD signal trace observed for DARPin corresponds to its Tm value.

The results of thermal denaturation of DARPins in PBS pH 7.4, as followed by the increase in fluorescence intensity of SYPRO Orange or by CD spectroscopy, are shown in the figures and Table 1.

The thermal stability of DARPin #17 was compared to that of DARPin #18 using a fluorescence-based thermal stability assay (Table 1, Figure 1). These 2 DARPins have the same amino acid sequence except for a single amino acid in the N-terminal capping module of their repeat domains. The repeat domain of DARPin #18 contains a modified N-terminal capping module as described herein, but DARPin #17 does not; that is, the N-terminal capping module of DARPin #18 contains a leucine (L) residue at position 24, while DARPin #17 contains a methionine (M) at that position. Surprisingly, this single amino acid change results in an increase in the Tm of about 6.5 ℃.

The thermal stability of DARPin #19 was compared to that of DARPin #20 using fluorescence-based and CD-based thermal stability assays (Table 1, Figure 2). These 2 DARPins have the same amino acid sequence except for a single amino acid in the N-terminal capping module of their repeat domains. The repeat domain of DARPin #20 contains a modified N-terminal capping module as described herein, but DARPin #19 does not; that is, the N-terminal capping module of DARPin #20 contains an L residue at position 24, while DARPin #19 contains an M at that position. Surprisingly, this single amino acid change results in an increase in the Tm of about 2.5 ℃. Thus, the thermal stability of an already very stable DARPin can be further increased by applying the improved N-terminal capping module of the present invention.

The thermal stability of DARPin #21 was compared to the thermal stability of DARPin #22 and DARPin #23 using CD-based thermal stability assays (Table 1, Figure 3). These 3 DARPins have the same amino acid sequence except for 2 or 3 amino acids in the N-terminal capping module of their repeat domains. The repeat domains of DARPin #22 and DARPin #23 comprise a modified N-terminal capping module as described herein, but DARPin #21 does not; i.e., the N-terminal capping modules of DARPin #22 and DARPin #23 contain an L residue at position 24 (while DARPin #21 contains an M at that position) and an A residue at position 26 (while DARPin #21 contains an N at that position); in addition, the N-terminal capping module of DARPin #23 contains a K residue at position 25 (whereas DARPin #21 and DARPin #22 contain an A at that position). Thus, DARPin #23 comprises an improved N-terminal capping module as described herein comprising the amino acid sequence RILLKA (SEQ ID NO:11) at positions 21-26. Surprisingly, these small changes in the N-terminal capping modules of DARPin #22 and DARPin #23 resulted in Tm values that were increased by about 8.5 ℃ or 7 ℃, respectively, as compared to DARPin #21. Moreover, DARPin #22 and DARPin #23 have nearly identical thermal stability, although their amino acid sequences differ by a single amino acid in the N-terminal capping module of their repeat domains; that is, the N-terminal capping module of DARPin #22 contains an A residue at position 25, while DARPin #23 contains a K at that position. Thus, this change in the single amino acid at position 25 in such an N-terminal capping module appears to be well tolerated, with no effect on thermostability.

The thermal stability of DARPin #24 was compared to that of DARPin #25 and DARPin #26 using fluorescence-based and CD-based thermal stability assays (Table 1, Fig. 4). These three DARPins have the same amino acid sequence except for 3 or 4 amino acids in the N-terminal capping module of their repeat domains. The repeat domains of DARPin #25 and DARPin #26 contain a modified N-terminal capping module as described herein, but that of DARPin #24 does not; that is, the N-terminal capping modules of DARPin #25 and DARPin #26 contain an L residue at position 24 (while DARPin #24 contains an M at that position), a K residue at position 25 (while DARPin #24 contains an A at that position), and an A residue at position 26 (while DARPin #24 contains an N at that position); in addition, the N-terminal capping module of DARPin #26 contains an E residue at position 22 (while DARPin #24 and DARPin #25 contain an I at that position). Thus, DARPin #25 and DARPin #26 comprise improved N-terminal capping modules as described herein comprising the amino acid sequences RILLKA (SEQ ID NO:11) and RELLKA (SEQ ID NO:12) at positions 21-26, respectively. Surprisingly, these small changes in the N-terminal capping modules of DARPin #25 and DARPin #26 result in Tm values that are increased by about 5 ℃ and 6 ℃, respectively, as compared to DARPin #24. Moreover, DARPin #25 and DARPin #26 have nearly identical thermal stability, although their amino acid sequences differ by a single amino acid in the N-terminal capping module of their repeat domains; that is, the N-terminal capping module of DARPin #25 contains an I residue at position 22, while that of DARPin #26 contains an E at that position. Thus, a change of the single amino acid at position 22 of such an N-terminal capping module appears to be well tolerated, without a significant impact on thermal stability.

In summary, the thermal stability of various DARPins can be significantly improved by small changes in the amino acid sequence of their N-terminal capping module as described herein.
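As an illustration only, the residue differences described above can be listed with a short script. This is a minimal sketch and not part of the original disclosure. It assumes that DARPin #21 to #26 correspond to SEQ ID NOs 21 to 26, as the numbering suggests; the one-letter strings are transcribed from the first 32 residues of those sequences in the sequence listing, and the helper name `diff_positions` is hypothetical.

```python
# N-terminal capping modules (residues 1-32), transcribed from the
# sequence listing (assumed mapping: DARPin #N <-> SEQ ID NO:N).
N_CAPS = {
    21: "GSDLGKKLLEAARAGQDDEVRILMANGADVNA",
    22: "GSDLGKKLLEAARAGQDDEVRILLAAGADVNA",
    23: "GSDLGKKLLEAARAGQDDEVRILLKAGADVNA",
    24: "GSDLGKKLLEAARAGQDDEVRILMANGADVNA",
    25: "GSDLGKKLLEAARAGQDDEVRILLKAGADVNA",
    26: "GSDLGKKLLEAARAGQDDEVRELLKAGADVNA",
}


def diff_positions(ref, other):
    """Return (1-based position, reference residue, other residue) per mismatch."""
    if len(ref) != len(other):
        raise ValueError("sequences must have equal length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(ref, other), start=1)
        if a != b
    ]


if __name__ == "__main__":
    pairs = [(21, 22), (21, 23), (22, 23), (24, 25), (24, 26), (25, 26)]
    for ref_id, other_id in pairs:
        changes = diff_positions(N_CAPS[ref_id], N_CAPS[other_id])
        # Write each change in the usual "M24L" style.
        label = ", ".join(f"{a}{pos}{b}" for pos, a, b in changes)
        print(f"DARPin #{ref_id} -> #{other_id}: {label}")
```

Under these assumptions, the listed differences fall only at positions 22, 24, 25 and 26, in line with the description of 2 or 3 changes (DARPin #22, #23) and 3 or 4 changes (DARPin #25, #26).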

¹ Tm value determined in PBS at pH 7.4 with a fluorescence-based assay.

² Tm value determined in PBS at pH 7.4 with a CD-based assay.

³ Tm value is an estimate only, because the post-transition baseline was not reached.

n.d.: not determined.

Example 3: Higher thermal stability of DARPins with improved C-terminal capping modules

The thermal stability of DARPins was analyzed with a fluorescence-based thermal stability assay or by CD spectroscopy, as described in Example 2.

The thermal stability of DARPin #27 (SEQ ID NO:27 with a His-tag (SEQ ID NO:16) fused to its N-terminus) was compared to that of DARPin #28 (SEQ ID NO:28 with a His-tag (SEQ ID NO:16) fused to its N-terminus) using a fluorescence-based thermal stability assay. These two DARPins have the same amino acid sequence except for the C-terminal capping module of their repeat domains. The repeat domain of DARPin #28 contains a modified C-terminal capping module as described herein, but that of DARPin #27 does not. The Tm values of DARPin #27 and DARPin #28 measured in PBS at pH 7.4 were about 63 ℃ and about 73 ℃, respectively. The Tm values of DARPin #27 and DARPin #28 measured in MES buffer at pH 5.8 were about 54.5 ℃ and about 66 ℃, respectively.

The thermal stability of DARPin #29 (SEQ ID NO:29 with a His-tag (SEQ ID NO:16) fused to its N-terminus) was compared to that of DARPin #30 (SEQ ID NO:30 with a His-tag (SEQ ID NO:16) fused to its N-terminus) using a fluorescence-based thermal stability assay. These two DARPins have the same amino acid sequence except for the C-terminal capping module of their repeat domains. The repeat domain of DARPin #30 contains a modified C-terminal capping module as described herein, but that of DARPin #29 does not. The Tm values of DARPin #29 and DARPin #30 measured in MES buffer at pH 5.8 were about 51 ℃ and about 55 ℃, respectively.

The thermal stability of DARPin #31 (SEQ ID NO:31) was compared to that of DARPin #32 (SEQ ID NO:32) using CD spectroscopy. These two DARPins have the same amino acid sequence except for the C-terminal capping module of their repeat domains. The repeat domain of DARPin #32 contains a modified C-terminal capping module as described herein, but that of DARPin #31 does not. The Tm values of DARPin #31 and DARPin #32 measured in PBS at pH 7.4 were about 59.5 ℃ and about 73 ℃, respectively.
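The Tm shifts implied by the values reported in this example can be tabulated as follows. This is a minimal sketch for illustration: the Tm values are the approximate figures stated above, while the data layout and the helper name `delta_tm` are assumptions introduced here.

```python
# (DARPin pair, buffer, Tm unmodified C-cap, Tm modified C-cap), in degrees Celsius.
# Values are the approximate Tm figures reported in Example 3.
MEASUREMENTS = [
    ("#27 vs #28", "PBS pH 7.4", 63.0, 73.0),
    ("#27 vs #28", "MES pH 5.8", 54.5, 66.0),
    ("#29 vs #30", "MES pH 5.8", 51.0, 55.0),
    ("#31 vs #32", "PBS pH 7.4", 59.5, 73.0),
]


def delta_tm(tm_unmodified, tm_modified):
    """Tm gain (in degrees Celsius) attributed to the modified capping module."""
    return tm_modified - tm_unmodified


if __name__ == "__main__":
    for pair, buffer_name, tm_ref, tm_mod in MEASUREMENTS:
        gain = delta_tm(tm_ref, tm_mod)
        print(f"DARPin {pair} in {buffer_name}: delta Tm = {gain:+.1f}")
```

With these inputs the gains are about 10, 11.5, 4 and 13.5 ℃, respectively; because the source values are themselves approximate, the differences should be read as approximate as well.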


Sequence listing

<110>Molecular Partners AG

Binz, Hans Kaspar

<120> Improved N-terminal capping modules for designed ankyrin repeat proteins

<130>P393B

<150>EP10192711.9

<151>2010-11-26

<160>32

<170> PatentIn version 3.5

<210>1

<211>32

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>1

Gly Ser Asp Leu Gly Lys Lys Leu Leu Glu Ala Ala Arg Ala Gly Gln

1 5 10 15

Asp Asp Glu Val Arg Ile Leu Met Ala Asn Gly Ala Asp Val Asn Ala

20 25 30

<210>2

<211>28

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>2

Gln Asp Lys Phe Gly Lys Thr Ala Phe Asp Ile Ser Ile Asp Asn Gly

1 5 10 15

Asn Glu Asp Leu Ala Glu Ile Leu Gln Lys Leu Asn

20 25

<210>3

<211>28

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>3

Gln Asp Lys Phe Gly Lys Thr Pro Phe Asp Leu Ala Ile Arg Glu Gly

1 5 10 15

His Glu Asp Ile Ala Glu Val Leu Gln Lys Ala Ala

20 25

<210>4

<211>28

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>4

Gln Asp Lys Phe Gly Lys Thr Pro Phe Asp Leu Ala Ile Asp Asn Gly

1 5 10 15

Asn Glu Asp Ile Ala Glu Val Leu Gln Lys Ala Ala

20 25

<210>5

<211>32

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<220>

<221>misc_feature

<222>(3)..(3)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(5)..(5)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(21)..(22)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(24)..(26)

<223> Xaa can be any naturally occurring amino acid

<400>5

Gly Ser Xaa Leu Xaa Lys Lys Leu Leu Glu Ala Ala Arg Ala Gly Gln

1 5 10 15

Asp Asp Glu Val Xaa Xaa Leu Xaa Xaa Xaa Gly Ala Asp Val Asn Ala

20 25 30

<210>6

<211>28

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<220>

<221>misc_feature

<222>(1)..(1)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(4)..(4)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(8)..(9)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(11)..(13)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(15)..(15)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(17)..(17)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(20)..(20)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(23)..(23)

<223> Xaa can be any naturally occurring amino acid

<400>6

Xaa Asp Lys Xaa Gly Lys Thr Xaa Xaa Asp Xaa Xaa Xaa Asp Xaa Gly

1 5 10 15

Xaa Glu Asp Xaa Ala Glu Xaa Leu Gln Lys Ala Ala

20 25

<210>7

<211>28

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<220>

<221>misc_feature

<222>(1)..(1)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(4)..(4)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(8)..(8)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(11)..(13)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(15)..(15)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(17)..(17)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(20)..(20)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(23)..(23)

<223> Xaa can be any naturally occurring amino acid

<400>7

Xaa Asp Lys Xaa Gly Lys Thr Xaa Ala Asp Xaa Xaa Xaa Asp Xaa Gly

1 5 10 15

Xaa Glu Asp Xaa Ala Glu Xaa Leu Gln Lys Ala Ala

20 25

<210>8

<211>28

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<220>

<221>misc_feature

<222>(1)..(1)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(4)..(4)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(8)..(8)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(11)..(12)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(15)..(15)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(17)..(17)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(20)..(20)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(23)..(23)

<223> Xaa can be any naturally occurring amino acid

<400>8

Xaa Asp Lys Xaa Gly Lys Thr Xaa Ala Asp Xaa Xaa Ala Asp Xaa Gly

1 5 10 15

Xaa Glu Asp Xaa Ala Glu Xaa Leu Gln Lys Ala Ala

20 25

<210>9

<211>28

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>9

Gln Asp Lys Ser Gly Lys Thr Pro Ala Asp Leu Ala Ala Asp Ala Gly

1 5 10 15

His Glu Asp Ile Ala Glu Val Leu Gln Lys Ala Ala

20 25

<210>10

<211>33

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<220>

<221>misc_feature

<222>(1)..(1)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(3)..(4)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(6)..(6)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(14)..(15)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(27)..(27)

<223> Xaa can be any naturally occurring amino acid

<400>10

Xaa Asp Xaa Xaa Gly Xaa Thr Pro Leu His Leu Ala Ala Xaa Xaa Gly

1 5 10 15

His Leu Glu Ile Val Glu Val Leu Leu Lys Xaa Gly Ala Asp Val Asn

20 25 30

Ala

<210>11

<211>6

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>11

Arg Ile Leu Leu Lys Ala

1 5

<210>12

<211>6

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>12

Arg Glu Leu Leu Lys Ala

1 5

<210>13

<211>30

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<220>

<221>misc_feature

<222>(1)..(1)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(3)..(3)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(22)..(22)

<223> Xaa can be any naturally occurring amino acid

<220>

<221>misc_feature

<222>(24)..(24)

<223> Xaa can be any naturally occurring amino acid

<400>13

Xaa Leu Xaa Lys Lys Leu Leu Glu Ala Ala Arg Ala Gly Gln Asp Asp

1 5 10 15

Glu Val Arg Ile Leu Xaa Ala Xaa Gly Ala Asp Val Asn Ala

20 25 30

<210>14

<211>32

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>14

Gly Ser Asp Leu Gly Lys Lys Leu Leu Glu Ala Ala Arg Ala Gly Gln

1 5 10 15

Asp Asp Glu Val Arg Ile Leu Leu Lys Ala Gly Ala Asp Val Asn Ala

20 25 30

<210>15

<211>32

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>15

Gly Ser Asp Leu Gly Lys Lys Leu Leu Glu Ala Ala Arg Ala Gly Gln

1 5 10 15

Asp Asp Glu Val Arg Glu Leu Leu Lys Ala Gly Ala Asp Val Asn Ala

20 25 30

<210>16

<211>10

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>16

Met Arg Gly Ser His His His His His His

1 5 10

<210>17

<211>126

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>17

Gly Ser Asp Leu Gly Lys Lys Leu Leu Glu Ala Ala Arg Ala Gly Gln

1 5 10 15

Asp Asp Glu Val Arg Ile Leu Met Lys Ala Gly Ala Asp Val Asn Ala

20 25 30

Lys Asp Glu Tyr Gly Leu Thr Pro Leu Tyr Leu Ala Thr Ala His Gly

35 40 45

His Leu Glu Ile Val Glu Val Leu Leu Lys Asn Gly Ala Asp Val Asn

50 55 60

Ala Val Asp Ala Ile Gly Phe Thr Pro Leu His Leu Ala Ala Phe Ile

65 70 75 80

Gly His Leu Glu Ile Ala Glu Val Leu Leu Lys His Gly Ala Asp Val

85 90 95

Asn Ala Gln Asp Lys Phe Gly Lys Thr Ala Phe Asp Ile Ser Ile Gly

100 105 110

Asn Gly Asn Glu Asp Leu Ala Glu Ile Leu Gln Lys Leu Asn

115 120 125

<210>18

<211>126

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>18

Gly Ser Asp Leu Gly Lys Lys Leu Leu Glu Ala Ala Arg Ala Gly Gln

1 5 10 15

Asp Asp Glu Val Arg Ile Leu Leu Lys Ala Gly Ala Asp Val Asn Ala

20 25 30

Lys Asp Glu Tyr Gly Leu Thr Pro Leu Tyr Leu Ala Thr Ala His Gly

35 40 45

His Leu Glu Ile Val Glu Val Leu Leu Lys Asn Gly Ala Asp Val Asn

50 55 60

Ala Val Asp Ala Ile Gly Phe Thr Pro Leu His Leu Ala Ala Phe Ile

65 70 75 80

Gly His Leu Glu Ile Ala Glu Val Leu Leu Lys His Gly Ala Asp Val

85 90 95

Asn Ala Gln Asp Lys Phe Gly Lys Thr Ala Phe Asp Ile Ser Ile Gly

100 105 110

Asn Gly Asn Glu Asp Leu Ala Glu Ile Leu Gln Lys Leu Asn

115 120 125

<210>19

<211>159

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>19

Gly Ser Asp Leu Gly Lys Lys Leu Leu Glu Ala Ala Arg Ala Gly Gln

1 5 10 15

Asp Asp Glu Val Arg Ile Leu Met Lys Ala Gly Ala Asp Val Asn Ala

20 25 30

Phe Asp Trp Met Gly Trp Thr Pro Leu His Leu Ala Ala His Glu Gly

35 40 45

His Leu Glu Ile Val Glu Val Leu Leu Lys Asn Gly Ala Asp Val Asn

50 55 60

Ala Thr Asp Val Ser Gly Tyr Thr Pro Leu His Leu Ala Ala Ala Asp

65 70 75 80

Gly His Leu Glu Ile Val Glu Val Leu Leu Lys His Gly Ala Asp Val

85 90 95

Asn Thr Lys Asp Asn Thr Gly Trp Thr Pro Leu His Leu Ser Ala Asp

100 105 110

Leu Gly His Leu Glu Ile Val Glu Val Leu Leu Lys Asn Gly Ala Asp

115 120 125

Val Asn Ala Gln Asp Lys Phe Gly Lys Thr Ala Phe Asp Ile Ser Ile

130 135 140

Asp Asn Gly Asn Glu Asp Leu Ala Glu Ile Leu Gln Lys Leu Asn

145 150 155

<210>20

<211>159

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>20

Gly Ser Asp Leu Gly Lys Lys Leu Leu Glu Ala Ala Arg Ala Gly Gln

1 5 10 15

Asp Asp Glu Val Arg Ile Leu Leu Lys Ala Gly Ala Asp Val Asn Ala

20 25 30

Phe Asp Trp Met Gly Trp Thr Pro Leu His Leu Ala Ala His Glu Gly

35 40 45

His Leu Glu Ile Val Glu Val Leu Leu Lys Asn Gly Ala Asp Val Asn

50 55 60

Ala Thr Asp Val Ser Gly Tyr Thr Pro Leu His Leu Ala Ala Ala Asp

65 70 75 80

Gly His Leu Glu Ile Val Glu Val Leu Leu Lys His Gly Ala Asp Val

85 90 95

Asn Thr Lys Asp Asn Thr Gly Trp Thr Pro Leu His Leu Ser Ala Asp

100 105 110

Leu Gly His Leu Glu Ile Val Glu Val Leu Leu Lys Asn Gly Ala Asp

115 120 125

Val Asn Ala Gln Asp Lys Phe Gly Lys Thr Ala Phe Asp Ile Ser Ile

130 135 140

Asp Asn Gly Asn Glu Asp Leu Ala Glu Ile Leu Gln Lys Leu Asn

145 150 155

<210>21

<211>93

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>21

Gly Ser Asp Leu Gly Lys Lys Leu Leu Glu Ala Ala Arg Ala Gly Gln

1 5 10 15

Asp Asp Glu Val Arg Ile Leu Met Ala Asn Gly Ala Asp Val Asn Ala

20 25 30

Lys Asp Lys Asp Gly Tyr Thr Pro Leu His Leu Ala Ala Arg Glu Gly

35 40 45

His Leu Glu Ile Val Glu Val Leu Leu Lys Ala Gly Ala Asp Val Asn

50 55 60

Ala Gln Asp Lys Phe Gly Lys Thr Ala Phe Asp Ile Ser Ile Asp Asn

65 70 75 80

Gly Asn Glu Asp Leu Ala Glu Ile Leu Gln Lys Leu Asn

85 90

<210>22

<211>93

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>22

Gly Ser Asp Leu Gly Lys Lys Leu Leu Glu Ala Ala Arg Ala Gly Gln

1 5 10 15

Asp Asp Glu Val Arg Ile Leu Leu Ala Ala Gly Ala Asp Val Asn Ala

20 25 30

Lys Asp Lys Asp Gly Tyr Thr Pro Leu His Leu Ala Ala Arg Glu Gly

35 40 45

His Leu Glu Ile Val Glu Val Leu Leu Lys Ala Gly Ala Asp Val Asn

50 55 60

Ala Gln Asp Lys Phe Gly Lys Thr Ala Phe Asp Ile Ser Ile Asp Asn

65 70 75 80

Gly Asn Glu Asp Leu Ala Glu Ile Leu Gln Lys Leu Asn

85 90

<210>23

<211>93

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>23

Gly Ser Asp Leu Gly Lys Lys Leu Leu Glu Ala Ala Arg Ala Gly Gln

1 5 10 15

Asp Asp Glu Val Arg Ile Leu Leu Lys Ala Gly Ala Asp Val Asn Ala

20 25 30

Lys Asp Lys Asp Gly Tyr Thr Pro Leu His Leu Ala Ala Arg Glu Gly

35 40 45

His Leu Glu Ile Val Glu Val Leu Leu Lys Ala Gly Ala Asp Val Asn

50 55 60

Ala Gln Asp Lys Phe Gly Lys Thr Ala Phe Asp Ile Ser Ile Asp Asn

65 70 75 80

Gly Asn Glu Asp Leu Ala Glu Ile Leu Gln Lys Leu Asn

85 90

<210>24

<211>126

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>24

Gly Ser Asp Leu Gly Lys Lys Leu Leu Glu Ala Ala Arg Ala Gly Gln

1 5 10 15

Asp Asp Glu Val Arg Ile Leu Met Ala Asn Gly Ala Asp Val Asn Ala

20 25 30

Lys Asp Tyr Phe Ser His Thr Pro Leu His Leu Ala Ala Arg Asn Gly

35 40 45

His Leu Lys Ile Val Glu Val Leu Leu Lys Ala Gly Ala Asp Val Asn

50 55 60

Ala Lys Asp Phe Ala Gly Lys Thr Pro Leu His Leu Ala Ala Asn Asp

65 70 75 80

Gly His Leu Glu Ile Val Glu Val Leu Leu Lys His Gly Ala Asp Val

85 90 95

Asn Ala Gln Asp Ile Phe Gly Lys Thr Pro Ala Asp Ile Ala Ala Asp

100 105 110

Ala Gly His Glu Asp Ile Ala Glu Val Leu Gln Lys Leu Asn

115 120 125

<210>25

<211>126

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>25

Gly Ser Asp Leu Gly Lys Lys Leu Leu Glu Ala Ala Arg Ala Gly Gln

1 5 10 15

Asp Asp Glu Val Arg Ile Leu Leu Lys Ala Gly Ala Asp Val Asn Ala

20 25 30

Lys Asp Tyr Phe Ser His Thr Pro Leu His Leu Ala Ala Arg Asn Gly

35 40 45

His Leu Lys Ile Val Glu Val Leu Leu Lys Ala Gly Ala Asp Val Asn

50 55 60

Ala Lys Asp Phe Ala Gly Lys Thr Pro Leu His Leu Ala Ala Asn Asp

65 70 75 80

Gly His Leu Glu Ile Val Glu Val Leu Leu Lys His Gly Ala Asp Val

85 90 95

Asn Ala Gln Asp Ile Phe Gly Lys Thr Pro Ala Asp Ile Ala Ala Asp

100 105 110

Ala Gly His Glu Asp Ile Ala Glu Val Leu Gln Lys Leu Asn

115 120 125

<210>26

<211>126

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>26

Gly Ser Asp Leu Gly Lys Lys Leu Leu Glu Ala Ala Arg Ala Gly Gln

1 5 10 15

Asp Asp Glu Val Arg Glu Leu Leu Lys Ala Gly Ala Asp Val Asn Ala

20 25 30

Lys Asp Tyr Phe Ser His Thr Pro Leu His Leu Ala Ala Arg Asn Gly

35 40 45

His Leu Lys Ile Val Glu Val Leu Leu Lys Ala Gly Ala Asp Val Asn

50 55 60

Ala Lys Asp Phe Ala Gly Lys Thr Pro Leu His Leu Ala Ala Asn Asp

65 70 75 80

Gly His Leu Glu Ile Val Glu Val Leu Leu Lys His Gly Ala Asp Val

85 90 95

Asn Ala Gln Asp Ile Phe Gly Lys Thr Pro Ala Asp Ile Ala Ala Asp

100 105 110

Ala Gly His Glu Asp Ile Ala Glu Val Leu Gln Lys Leu Asn

115 120 125

<210>27

<211>126

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>27

Gly Ser Asp Leu Gly Lys Lys Leu Leu Glu Ala Ala Arg Ala Gly Gln

1 5 10 15

Asp Asp Glu Val Arg Ile Leu Met Ala Asn Gly Ala Asp Val Asn Ala

20 25 30

Ala Asp Tyr Phe Ser His Thr Pro Leu His Leu Ala Ala Arg Asn Gly

35 40 45

His Leu Lys Ile Val Glu Val Leu Leu Lys Tyr Gly Ala Asp Val Asn

50 55 60

Ala Ser Asp Phe Ala Gly Lys Thr Pro Leu His Leu Ala Ala Asn Asp

65 70 75 80

Gly His Leu Glu Ile Val Glu Val Leu Leu Lys His Gly Ala Asp Val

85 90 95

Asn Ala Gln Asp Ile Phe Gly Lys Thr Ala Phe Asp Ile Ser Ile Asp

100 105 110

Asn Gly Asn Glu Asp Leu Ala Glu Ile Leu Gln Lys Leu Asn

115 120 125

<210>28

<211>126

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>28

Gly Ser Asp Leu Gly Lys Lys Leu Leu Glu Ala Ala Arg Ala Gly Gln

1 5 10 15

Asp Asp Glu Val Arg Ile Leu Met Ala Asn Gly Ala Asp Val Asn Ala

20 25 30

Ala Asp Tyr Phe Ser His Thr Pro Leu His Leu Ala Ala Arg Asn Gly

35 40 45

His Leu Lys Ile Val Glu Val Leu Leu Lys Tyr Gly Ala Asp Val Asn

50 55 60

Ala Ser Asp Phe Ala Gly Lys Thr Pro Leu His Leu Ala Ala Asn Asp

65 70 75 80

Gly His Leu Glu Ile Val Glu Val Leu Leu Lys His Gly Ala Asp Val

85 90 95

Asn Ala Gln Asp Ile Phe Gly Lys Thr Pro Ala Asp Ile Ala Ala Asp

100 105 110

Asn Gly His Glu Asp Ile Ala Glu Val Leu Gln Lys Leu Asn

115 120 125

<210>29

<211>126

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>29

Gly Ser Asp Leu Gly Lys Lys Leu Leu Glu Ala Ala Arg Ala Gly Gln

1 5 10 15

Asp Asp Glu Val Arg Ile Leu Leu Ala Ala Gly Ala Asp Val Asn Ala

20 25 30

Ala Asp Glu Arg Gly Thr Thr Pro Leu His Leu Ala Ala Val Tyr Gly

35 40 45

His Leu Glu Ile Val Glu Val Leu Leu Lys Asn Gly Ala Asp Val Asn

50 55 60

Ala Gln Asn Glu Thr Gly Tyr Thr Pro Leu His Leu Ala Asp Ser Ser

65 70 75 80

Gly His Leu Glu Ile Val Glu Val Leu Leu Lys His Ser Ala Asp Val

85 90 95

Asn Ala Gln Asp Lys Phe Gly Lys Thr Ala Phe Asp Ile Ser Ile Asp

100 105 110

Asn Gly Asn Glu Asp Leu Ala Glu Ile Leu Gln Lys Leu Asn

115 120 125

<210>30

<211>126

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>30

Gly Ser Asp Leu Gly Lys Lys Leu Leu Glu Ala Ala Arg Ala Gly Gln

1 5 10 15

Asp Asp Glu Val Arg Ile Leu Leu Ala Ala Gly Ala Asp Val Asn Ala

20 25 30

Ala Asp Glu Arg Gly Thr Thr Pro Leu His Leu Ala Ala Val Tyr Gly

35 40 45

His Leu Glu Ile Val Glu Val Leu Leu Lys Asn Gly Ala Asp Val Asn

50 55 60

Ala Gln Asn Glu Thr Gly Tyr Thr Pro Leu His Leu Ala Asp Ser Ser

65 70 75 80

Gly His Leu Glu Ile Val Glu Val Leu Leu Lys His Ser Ala Asp Val

85 90 95

Asn Ala Gln Asp Lys Ser Gly Lys Thr Pro Ala Asp Ile Ala Ala Asp

100 105 110

Asn Gly His Glu Asp Ile Ala Glu Val Leu Gln Lys Leu Asn

115 120 125

<210>31

<211>103

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>31

Met Arg Gly Ser His His His His His His Gly Ser Asp Leu Gly Lys

1 5 10 15

Lys Leu Leu Glu Ala Ala Arg Ala Gly Gln Asp Asp Glu Val Arg Ile

20 25 30

Leu Met Ala Asn Gly Ala Asp Val Asn Ala Lys Asp Lys Asp Gly Tyr

35 40 45

Thr Pro Leu His Leu Ala Ala Arg Glu Gly His Leu Glu Ile Val Glu

50 55 60

Val Leu Leu Lys Ala Gly Ala Asp Val Asn Ala Gln Asp Lys Phe Gly

65 70 75 80

Lys Thr Ala Phe Asp Ile Ser Ile Asp Asn Gly Asn Glu Asp Leu Ala

85 90 95

Glu Ile Leu Gln Lys Leu Asn

100

<210>32

<211>103

<212>PRT

<213> Artificial sequence

<220>

<223> synthetic constructs

<400>32

Met Arg Gly Ser His His His His His His Gly Ser Asp Leu Gly Lys

1 5 10 15

Lys Leu Leu Glu Ala Ala Arg Ala Gly Gln Asp Asp Glu Val Arg Ile

20 25 30

Leu Met Ala Asn Gly Ala Asp Val Asn Ala Lys Asp Lys Asp Gly Tyr

35 40 45

Thr Pro Leu His Leu Ala Ala Arg Glu Gly His Leu Glu Ile Val Glu

50 55 60

Val Leu Leu Lys Ala Gly Ala Asp Val Asn Ala Gln Asp Lys Ser Gly

65 70 75 80

Lys Thr Pro Ala Asp Leu Ala Ala Asp Asn Gly His Glu Asp Ile Ala

85 90 95

Glu Val Leu Gln Lys Ala Ala

100
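For readers comparing the three-letter sequence listing with the one-letter sequences used in the claims, the conversion can be sketched as follows. This is an illustrative sketch only: the mapping is the standard three-letter to one-letter amino acid code, the SEQ ID NO:14 residues are transcribed from the listing above, and the function name `to_one_letter` is hypothetical.

```python
# Standard three-letter to one-letter amino acid codes; "Xaa" marks a
# variable position and is written here as "X".
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(three_letter):
    """Convert whitespace-separated three-letter residues to a one-letter string."""
    return "".join(THREE_TO_ONE[token] for token in three_letter.split())


# SEQ ID NO:14 as given in the sequence listing (position rulers omitted).
SEQ_ID_14 = (
    "Gly Ser Asp Leu Gly Lys Lys Leu Leu Glu Ala Ala Arg Ala Gly Gln "
    "Asp Asp Glu Val Arg Ile Leu Leu Lys Ala Gly Ala Asp Val Asn Ala"
)

# One-letter form as written in the claims, in blocks of ten residues.
CLAIMED_SEQ_ID_14 = "GSDLGKKLLE AARAGQDDEV RILLKAGADV NA"

if __name__ == "__main__":
    converted = to_one_letter(SEQ_ID_14)
    print(converted)
    print(converted == CLAIMED_SEQ_ID_14.replace(" ", ""))
```

A token that is not in the table raises a `KeyError`, which is a convenient way to spot residues fused by text extraction (for example two codes run together without a space).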

Claims

1. A binding protein comprising at least one ankyrin repeat domain, wherein said ankyrin repeat domain comprises an N-terminal capping module, and wherein the N-terminal capping module consists of the amino acid sequence:

GSDLGKKLLE AARAGQDDEV RILLKAGADV NA (SEQ ID NO:14);

GSDLGKKLLE AARAGQDDEV RELLKAGADV NA (SEQ ID NO:15); or

GSDLGKKLLE AARAGQDDEV RILLKAGADV NA (SEQ ID NO:14), wherein the amino acid residue at position 25 is replaced with A.

2. The binding protein according to claim 1, wherein said N-terminal capping module consists of the amino acid sequence:

GSDLGKKLLE AARAGQDDEV RILLKAGADV NA (SEQ ID NO:14).

3. The binding protein according to claim 1, wherein said N-terminal capping module consists of the amino acid sequence:

GSDLGKKLLE AARAGQDDEV RELLKAGADV NA (SEQ ID NO:15).

4. The binding protein according to claim 1, wherein said N-terminal capping module consists of the amino acid sequence:

GSDLGKKLLE AARAGQDDEV RILLKAGADV NA (SEQ ID NO:14), wherein the amino acid residue at position 25 is replaced with A.

5. A nucleic acid encoding the binding protein of claim 1.

6. A pharmaceutical composition comprising the binding protein of claim 1 and optionally a pharmaceutically acceptable carrier and/or diluent.

7. A pharmaceutical composition comprising the nucleic acid of claim 5 and optionally a pharmaceutically acceptable carrier and/or diluent.

8. A nucleic acid encoding the binding protein of claim 2.

9. A pharmaceutical composition comprising the binding protein of claim 2 and optionally a pharmaceutically acceptable carrier and/or diluent.

10. A pharmaceutical composition comprising the nucleic acid of claim 8 and optionally a pharmaceutically acceptable carrier and/or diluent.

11. A nucleic acid encoding the binding protein of claim 3.

12. A pharmaceutical composition comprising the binding protein of claim 3 and optionally a pharmaceutically acceptable carrier and/or diluent.

13. A pharmaceutical composition comprising the nucleic acid of claim 11 and optionally a pharmaceutically acceptable carrier and/or diluent.

14. A nucleic acid encoding the binding protein of claim 4.

15. A pharmaceutical composition comprising the binding protein of claim 4 and optionally a pharmaceutically acceptable carrier and/or diluent.

16. A pharmaceutical composition comprising the nucleic acid of claim 14 and optionally a pharmaceutically acceptable carrier and/or diluent.
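For illustration, the variant recited in claims 1 and 4 (SEQ ID NO:14 with the residue at position 25 replaced with A) can be written out explicitly. This is a minimal sketch and not part of the claims; the helper name `substitute` is hypothetical, and positions are counted from 1 as in the sequence listing.

```python
def substitute(seq, position, new_residue):
    """Return seq with the residue at the given 1-based position replaced."""
    if not 1 <= position <= len(seq):
        raise ValueError("position outside the sequence")
    return seq[: position - 1] + new_residue + seq[position:]


# SEQ ID NO:14 in one-letter code, as written in the claims (spaces removed).
SEQ_ID_14 = "GSDLGKKLLEAARAGQDDEVRILLKAGADVNA"

if __name__ == "__main__":
    variant = substitute(SEQ_ID_14, 25, "A")
    print(variant)         # full 32-residue N-terminal capping module
    print(variant[20:26])  # residues 21-26 of the variant
```

Under this reading, residues 21-26 of the variant are RILLAA, which matches the first 32 residues of SEQ ID NO:22 in the sequence listing.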