WO2025088193A1

WO2025088193A1 - Methods for efficient recombinant production of polypeptides

Info

Publication number: WO2025088193A1
Application number: PCT/EP2024/080339
Authority: WO
Inventors: Lars Friedrich; Uli Binder; Michaela Gebauer
Original assignee: XL Protein GmbH
Current assignee: XL Protein GmbH
Priority date: 2023-10-27
Filing date: 2024-10-25
Publication date: 2025-05-01
Anticipated expiration: 2026-04-27

Abstract

The present invention relates to inventive means and methods for the production of concatenated polypeptides comprising two or more random coil polypeptides, preferably linked by a cleavable linker. In particular, the present invention further provides for said inventive concatenated polypeptides comprising the two or more random coil polypeptides, preferably linked by a cleavable linker. Further, herein provided are nucleic acid molecules and nucleic acid vectors encoding said concatenated polypeptides. The present invention further provides for means and methods for the cleaving of concatenated polypeptides via a cleavable linker and/or for the release of random coil polypeptides from said concatenated polypeptide.

Description

Methods for efficient recombinant production of polypeptides

Polypeptides that form random coils (i.e., conformationally disordered polypeptides) have been described as useful in, inter alia, medical and diagnostic applications. A random coil conformation and/or a disordered conformation is primarily characterized by the absence of (stable) secondary structures (such as alpha-helices or beta-sheets). For example, WO 2008/155134 discloses proteins comprising an amino acid sequence of at least about 100 amino acid residues and consisting of proline, alanine, and serine (PAS) residues. The amino acid sequence forming the random coil conformation can comprise a plurality of amino acid repeats. WO 2011/144756 discloses polypeptides comprising repetitive amino acid sequences consisting solely of proline and alanine (PA) residues and forming random coils. Both PAS and PA polypeptides are hydrophilic, uncharged biological polymers.

Conformationally disordered polypeptides, i.e., PA- and/or PAS-polypeptides, lack toxicity or immunogenicity and have, inter alia, been deployed as conjugates to medically relevant biomolecules (for example, enzymes) in pharmaceutical compositions in order to extend the circulation time or half-life of such biomolecules; see, inter alia, WO 2018/234455. Accordingly, random coil polypeptides comprising PA-rich and/or PAS-rich sequences have been described and shown in the art to be useful in a plurality of technical fields, including therapeutics, research, as well as diagnostics (see, inter alia, WO 2008/155134). This illustrates the breadth of applications and the significance of PA-rich and/or PAS-rich polypeptides in various industries.

For industrial, technical, and research approaches, as well as in the medical field, it can be of interest to, inter alia, link PA-rich and/or PAS-rich random coil polypeptides to biomolecules like peptides, proteins, carbohydrates, nucleic acids, etc. It is also of interest to link such random coil polypeptides to pharmaceuticals, which may also comprise so-called “small molecules”. Generally, PA-rich and PAS-rich random coil polypeptides may be linked to, for example, biomolecules or pharmaceuticals via chemical coupling. However, in order to provide molecules of interest that are modified by the linkage to PA-rich and/or PAS-rich polypeptides, these polypeptides must be provided in a first step. This can be time-consuming as well as cost-intensive. Accordingly, the technical problem underlying the present invention is the provision of means and methods to overcome insufficiencies in the provision of random coil polypeptides, in particular in the provision of random coil polypeptides to be linked to e.g., biomolecules, pharmaceuticals, and the like.

The technical problem is solved by provision of the embodiments provided herein below and as characterized in the appended claims.

The present invention provides for concatenated polypeptides comprising two or more random coil polypeptides that are preferably linked by a cleavable linker. In particular, concatenated polypeptides are provided that comprise two or more random coil polypeptides and that are characterized by PAS-rich and/or PA-rich protein sequences. Therefore, in one embodiment, the random coil polypeptides comprised in the inventive concatenated polypeptides are characterized by “PAS-rich sequences” and/or “PA-rich sequences”. Accordingly, said random coil polypeptides may be characterized by sequences that are rich in proline, alanine, and serine residues, i.e., random coil polypeptides that comprise at least one proline residue, at least one alanine residue, and at least one serine residue. PA-rich sequences/PA-rich random coil polypeptides are characterized by sequences that are rich in proline and alanine residues, i.e., random coil polypeptides that comprise at least one proline residue and at least one alanine residue. It is preferred that the two or more random coil polypeptides comprised in the herein provided concatenated polypeptides each comprise the same/an identical amino acid sequence. Evidently, this facilitates downstream production processes, as will be detailed herein below.

Accordingly, the inventive concatenated polypeptides as characterized herein are particularly useful in production processes or preparation processes of random coil polypeptides for further use, for example the linkage to biomolecules, pharmaceuticals, etc.

Accordingly, the present invention provides in one embodiment for methods for the recombinant production of a concatenated polypeptide comprising two or more random coil polypeptides linked by a cleavable linker. The inventive method comprises the steps of a) the recombinant production of a concatenated polypeptide comprising two or more random coil polypeptides, wherein said random coil polypeptides are linked by a cleavable linker; and b) the purification of said concatenated polypeptide. As is evident from the illustrative and non-limiting examples, the herein provided means and methods allow for convenient and/or efficient production of said concatenated polypeptides. The concatenated polypeptides comprise in particular two or more random coil polypeptides linked by a cleavable linker. The present inventors provide for multiple exemplary and non-limiting embodiments of such concatenated polypeptides/concatemers (see, inter alia, illustrative and non-limiting Figure 1) and demonstrate efficient, convenient and/or reliable recombinant production as well as purification of concatenated polypeptides as characterized herein. Non-limiting Figure 3 illustrates, for example, the efficient and reliable production and/or purification of several concatemers.

As mentioned above, the present invention also provides for concatenated polypeptides that are obtained by/obtainable by the herein detailed methods. In one embodiment, these concatenated polypeptides comprise two or more random coil polypeptides. Again, these random coil polypeptides may be linked by a cleavable linker, as defined herein below. The invention also provides for nucleic acid molecules and nucleic acid vectors encoding said concatemers/concatenated polypeptides. This embodiment is also explained in detail herein below.

Accordingly, the present invention provides the person skilled in the art with means and methods for the production of the herein disclosed concatemers/concatenated polypeptides, whereby these means and methods, in particular, comprise recombinant means. The term “recombinant” is also further explained and illustrated herein below and is known to the skilled artisan.

As will also be illustrated herein below, the present invention also provides for a method for the production of a random coil polypeptide, wherein said method comprises the steps of the cleavage of the inventive concatenated polypeptide and the purification of said random coil polypeptide. This method leads to a desired release of the random coil polypeptide, preferably the release from the herein described inventive concatemers/concatenated polypeptides.

The efficient production of random coil polypeptides is, inter alia, illustratively shown in the appended non-limiting examples and figures.

Accordingly, the present invention also relates to means and methods for the production of random coil polypeptides, preferably PA and/or PAS polypeptides. Hence, the present invention relates to means and methods for the production of random coil polypeptides comprising the steps of a) the recombinant production of a concatenated polypeptide comprising two or more random coil polypeptides, wherein said random coil polypeptides are linked by a cleavable linker; b) the purification of said concatenated polypeptide; c) the cleavage of said concatenated polypeptide; and d) the purification of said two or more random coil polypeptides. The present invention further provides for random coil polypeptides as obtained by/obtainable by the herein disclosed means and methods or as defined herein below. Further, the present invention provides for means and methods for the production of a conjugate comprising one or more of the herein provided random coil polypeptides, comprising the steps of a) the coupling of said random coil polypeptide to a biomolecule, and b) the purification of the conjugate.

Accordingly, the present invention provides for means and methods for the production of a conjugate comprising a random coil polypeptide comprising the steps of a) the recombinant production of a concatenated polypeptide comprising two or more random coil polypeptides, wherein said random coil polypeptides are linked by a cleavable linker; b) the purification of said concatenated polypeptide; c) the cleavage of said concatenated polypeptide; d) the purification of said two or more random coil polypeptides; e) the coupling of said random coil polypeptide to a biomolecule, and f) the purification of the conjugate.
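Purely by way of illustration, the sequence of steps recited above (production of a concatenated polypeptide, followed by cleavage and release of the random coil polypeptides) may be sketched in Python with strings standing in for polypeptides. This is a minimal sketch under stated assumptions: the linker token `[LINKER]` is a hypothetical placeholder and not an actual cleavable linker sequence, the function names are invented for this example, cleavage is idealized as removing the linker completely, and the purification and coupling steps are not modelled. The unit sequence is PAS#1 (SEQ ID NO: 1) as quoted in this description.

```python
# Illustrative sketch only: strings stand in for polypeptides.
PAS1 = "ASPAAPAPASPAAPAPSAPA"  # SEQ ID NO: 1 (PAS#1) as quoted in this description
LINKER = "[LINKER]"            # hypothetical placeholder, NOT a real linker sequence


def concatenate(units, linker=LINKER):
    """Model the product of step a): units joined by the cleavable linker."""
    for unit in units:
        if linker in unit:
            raise ValueError("linker placeholder must not occur inside a unit")
    return linker.join(units)


def cleave(concatemer, linker=LINKER):
    """Model step c): idealized cleavage that splits at each linker and discards it."""
    return [piece for piece in concatemer.split(linker) if piece]


units = [PAS1] * 3                 # identical units, as preferred in the text
concatemer = concatenate(units)
released = cleave(concatemer)
print(len(released), "units released; all identical:", len(set(released)) == 1)
```

In this simplified model, identical units yield a single released species, which is consistent with the stated preference for random coil polypeptides that share the same amino acid sequence.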

Hence, the present invention also relates to conjugates comprising the herein provided random coil polypeptides. Further, herein provided are also compositions comprising such conjugates. Also, medical and non-medical uses of such conjugates or of such compositions comprising said conjugates are provided herein.

The inventive means, methods, uses, and compounds provided in accordance with the present invention will be described in greater detail in the following. This detailed description relates to and is applicable to all aspects of the present invention, including not only the means and methods for the production of such compounds but also the compounds as such. This further refers to the uses of any of such compounds.

The present inventors have surprisingly found means and methods for the convenient and efficient recombinant production of random coil polypeptides (like and in particular “PA” and/or “PAS” polypeptides). Accordingly, in context of this invention it was found that random coil polypeptides may be linked and/or coupled via a cleavable linker resulting in concatenated polypeptides (i.e., concatenated polypeptides comprising two or more random coil polypeptides linked by a cleavable linker). In the context of the present invention, the term “concatemer” may comprise (a) concatenated polypeptide(s) but also (a) nucleic acid molecule(s) that encode(s) for such (a) concatenated polypeptide(s) (i.e., the term concatemer in context of the present invention also relates to “concatenated nucleic acids”/“nucleic acid molecules encoding concatenated polypeptides”). In the context of the present invention, both concatenated polypeptides and concatenated nucleic acid molecules encoding the same are considered as concatemers. Accordingly, in the context of the present invention, a concatemer comprises a biopolymer that comprises or consists of monomeric units/elements/parts/pieces/repeats. As detailed herein below, said units/elements/parts/pieces/repeats may either be directly or indirectly linked/fused/coupled. In the context of the present invention “indirect” linkage/fusion/coupling refers to linking/fusing/coupling such units/elements/parts/pieces/repeats via a linker.

Further, in the context of the present invention such units/elements/parts/pieces/repeats (in the context of the present invention, in particular, the random coil polypeptides) comprised in a concatemer may be (i) functionally identical or (ii) functionally and structurally identical. With regard to the herein provided random coil polypeptides (i.e., exemplary units/elements/parts/pieces), being functionally identical means that they may or may not share an identical amino acid sequence, however, still share the same/a comparable function. With regard to said random coil polypeptides, said identical function may be the attainment/achievement/realization/obtainment of a random coil conformation and/or of a disordered conformation. As also described in WO 2008/155134 and WO 2011/144756, random coil polypeptides (in particular PA and/or PAS polypeptides) are very well known in the art and have been described as being capable of increasing the hydrodynamic volume when attached to e.g., biomolecules, pharmaceuticals, and the like. Further, such random coil polypeptides are immunologically inert.

Further, with regard to the herein provided random coil polypeptides (i.e., exemplary units/elements/parts/pieces), being functionally identical and structurally identical means that they share an identical/the same amino acid sequence and share an identical/the same function (e.g., a random coil conformation and/or a disordered conformation). In the context of the present invention the term “random coil conformation” also comprises the term “disordered conformation”. The herein employed random coil polypeptides having said random coil conformation or said disordered conformation preferably have the capacity of increasing the hydrodynamic volume of, inter alia, protein drugs, other biomolecules, or pharmaceuticals and are by themselves immunologically inert (see, inter alia, WO 2018/234455).

In order to share an identical/the same function, the herein provided random coil polypeptides do not have to obtain exactly the same/an identical conformation, as long as they obtain/realize/achieve a random coil conformation/a disordered conformation. Further definitions of random coil conformations, as well as means and methods that allow the skilled artisan to predict and/or experimentally determine such conformations, are known in the art, inter alia, from WO 2008/155134 and are also provided herein below. With regard to the herein provided concatenated polypeptides and as is further detailed herein below, it is most preferred that the therein comprised units/elements/parts/pieces/repeats are functionally and structurally identical (i.e., that these units/elements/parts/pieces/repeats share the same/an identical function and the same/an identical amino acid sequence).

Normally, the person skilled in the art is aware of the term “concatemer” primarily in the context of continuous nucleic acid molecules that comprise multiple copies of the same/identical nucleic acid sequences. However, in the context of the present invention the term “concatemer” may refer to both nucleic acid molecules and polypeptides.

The skilled person is aware that PA and/or PAS polypeptides may be genetically encoded by nucleic acids (see, inter alia, WO 2017/109087). Furthermore, the herein recited and employed cleavable linkers may also be genetically encoded by nucleic acids. Hence, the concatenated polypeptide comprising also cleavable linkers may also be genetically encoded by corresponding nucleic acids. The present invention also relates to novel and inventive nucleic acid sequences that encode (i) one or more random coil polypeptides fused to one or more cleavable linkers, and/or (ii) two or more random coil polypeptides linked/fused/coupled/concatenated via one or more cleavable linkers (i.e., a concatenated polypeptide comprising two or more random coil polypeptides linked by a cleavable linker). Accordingly, the herein provided concatenated polypeptide comprising two or more random coil polypeptides linked by a cleavable linker may be encoded by a single nucleic acid molecule (inter alia, a DNA or RNA molecule). Hence, it is conceivable to the skilled person that such (a) nucleic acid molecule(s) can comprise two or more units/elements/parts/pieces/repeats of coding sequences encoding the herein provided random coil polypeptides (linked by one or more cleavable linkers). These units/elements/parts/pieces/repeats may thus each encode a random coil polypeptide and may be linked by a nucleic acid sequence encoding a cleavable linker, as will be detailed herein below. Accordingly, the present invention also relates to nucleic acid molecules that may encode the herein provided two or more random coil polypeptides linked by a cleavable linker. As such, it is evident to the skilled person that such a nucleic acid molecule (comprising multiple linked/fused/coupled/concatenated units/elements/parts/pieces) may be considered a concatenated nucleic acid. In the context of the present invention, any concatenated nucleic acid is a concatemer, whereas, as is detailed herein above, not every concatemer must be a concatenated polypeptide. Accordingly, the present invention also provides for concatenated nucleic acids, wherein these may collectively or separately (depending on the context) be referred to as concatemer(s). In the context of the present invention the terms “concatenated nucleic acid” and “nucleic acid encoding the/a concatenated polypeptide” may be used interchangeably herein below and above. Further, the terms “nucleic acid” and “nucleic acid molecule” may be used interchangeably herein below and above.

As already expounded herein above, in the context of the present invention such units/elements/parts/pieces/repeats (in the context of the present invention, inter alia, nucleic acid sequences encoding the random coil polypeptides) comprised in a concatemer may be functionally identical or structurally and functionally identical. With regard to the herein provided nucleic acid sequences encoding the herein provided random coil polypeptides (i.e., exemplary units/elements/parts/pieces), being functionally identical means that they may or may not share an identical nucleic acid sequence, however, still share the same/a comparable function. With regard to said nucleic acid sequences, said identical function may be the encoding of identical polypeptides (i.e., polypeptides with identical amino acid sequences, such as, inter alia, random coil polypeptides with identical amino acid sequences). Accordingly, in the context of the present invention, the polypeptide encoded by a nucleic acid and/or encoding a certain polypeptide may be the function of said nucleic acid. Further, with regard to the herein provided nucleic acids encoding said random coil polypeptides (i.e., exemplary units/elements/parts/pieces), being functionally identical and structurally identical means that said nucleic acids share an identical/the same nucleic acid sequence and share an identical/the same function (e.g., encoding the same/ identical polypeptides, such as, inter alia, random coil polypeptides). With regard to the herein provided concatenated nucleic acids and as is further detailed herein below, it is most preferred that the therein comprised units/elements/parts/pieces/repeats are functionally identical, without being structurally identical (i.e., that these units/elements/parts/pieces/repeats share the same/an identical function, however, do not share the same/an identical nucleic acid sequence).

The skilled person is aware that the genetic code is degenerate and offers redundancy for most genetically encoded amino acids. This means that the same/identical amino acid may be encoded by more than one amino acid codon (i.e., nucleic acid sequence). Accordingly and as detailed herein below and above, the same polypeptide may be encoded by two or more nucleic acid molecules having different nucleic acid sequences. In the context of the present invention this may even be desired. The herein provided PA and PAS polypeptides consist of proline and alanine, or of proline, alanine, and serine, respectively. Hence, said PA and PAS polypeptides share highly repetitive amino acid sequences. WO 2017/109087 discloses nucleic acid sequences encoding PA and PAS polypeptides without reproducing the highly repetitive nature of said amino acid sequence in the nucleic acids encoding said amino acid sequences. As a result, WO 2017/109087 provides for nucleic acid sequences with advantageously low nucleic acid sequence redundancies (thereby reducing genetic instability of such nucleic acid sequences). Similarly, the herein disclosed nucleic acid sequences encoding said PA and PAS polypeptides preferably have advantageously low nucleic acid sequence redundancies. Non-limiting examples of such nucleic acids encoding PA and/or PAS random coil polypeptides and having low nucleic acid sequence redundancy may be found, inter alia, in SEQ IDs NO: 36 to 47. Solely for the sake of providing an illustrative example, the amino acid sequence of an exemplary PAS polypeptide (i.e., a random coil polypeptide comprising proline residues, alanine residues, and serine residues) as provided in SEQ ID NO: 1 (PAS#1, ASPAAPAPASPAAPAPSAPA) may be, inter alia, encoded by one or more of the nucleic acid sequences as provided in SEQ IDs NO: 128 to 137. Accordingly, the skilled person is aware that a concatenated polypeptide comprising two or more PAS#1 amino acid sequences (i.e., units/elements/parts/pieces; SEQ ID NO: 1) may or may not be encoded by a concatenated nucleic acid molecule comprising nucleic acid units/elements/parts/pieces/repeats comprising identical or different nucleic acid sequences (e.g., SEQ IDs NO: 128 to 137). As mentioned herein above, units/elements/parts/pieces/repeats comprising different nucleic acid sequences are preferred in the context of the present invention. Non-limiting examples of such nucleic acid sequences with low nucleic acid sequence redundancies encoding PA and/or PAS polypeptides may, inter alia, be found in SEQ IDs NO: 36 to 47 and in WO 2017/109087, which is, thus, herewith incorporated by reference in its entirety.
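The degeneracy argument above may be illustrated with a short Python sketch. It uses the standard genetic code for proline, alanine, and serine (written as DNA codons) to derive two different coding sequences for PAS#1 that translate back to the same amino acid sequence, i.e., units that are functionally identical without being structurally identical. The codon-rotation scheme and the function names `encode` and `translate` are assumptions made for this example only; they are not the low-redundancy design of WO 2017/109087 and do not reproduce SEQ IDs NO: 128 to 137.

```python
# Standard genetic code, DNA codons for the three residues found in PAS#1.
CODONS = {
    "P": ["CCT", "CCC", "CCA", "CCG"],
    "A": ["GCT", "GCC", "GCA", "GCG"],
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
}
CODON_TO_AA = {codon: aa for aa, codons in CODONS.items() for codon in codons}

PAS1 = "ASPAAPAPASPAAPAPSAPA"  # SEQ ID NO: 1 (PAS#1) as quoted in this description


def encode(protein, offset=0):
    """Toy encoder: pick a synonymous codon by rotating through the codon list."""
    return "".join(
        CODONS[aa][(i + offset) % len(CODONS[aa])] for i, aa in enumerate(protein)
    )


def translate(dna):
    """Translate a DNA coding sequence consisting only of P/A/S codons."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


unit_a = encode(PAS1, offset=0)
unit_b = encode(PAS1, offset=1)
print(unit_a != unit_b, translate(unit_a) == translate(unit_b) == PAS1)
```

Because each residue has at least four synonymous codons, shifting the offset by one changes every codon while leaving the encoded polypeptide unchanged.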

As is evident, inter alia, from the claims, the present invention relates to the recombinant production of said concatenated polypeptides. Accordingly, the herein provided concatenated nucleic acids (encoding the herein provided concatenated polypeptides comprising the herein provided random coil polypeptides linked by a cleavable linker) may be comprised in a nucleic acid vector, as will be further detailed herein below. Said nucleic acid vector and/or said concatenated nucleic acid molecule may, further, be comprised in a cell (i.e., a host cell) and/or in an organism (i.e., a host). In the context of the present invention, said organism is capable of expressing/can express/expresses the concatenated polypeptide from said nucleic acid vector and/or said concatenated nucleic acid. Accordingly, the present invention provides for the means and methods for the recombinant production of a concatenated polypeptide comprising two or more random coil polypeptides linked by a cleavable linker, wherein said means and methods comprise culturing a host or host cell, and wherein said host or host cell expresses said concatenated polypeptide from a concatenated nucleic acid molecule (i.e., a nucleic acid molecule encoding said concatenated polypeptide) and/or a nucleic acid vector encoding said concatenated polypeptide. As detailed herein above and below, the present invention further provides for additional/complementary/downstream means and methods that, inter alia, allow for the release of the random coil polypeptides linked by said cleavable linker and comprised in said concatenated polypeptide.

The term "random coil" as used herein relates generally to any conformation of a polymeric molecule, including amino acid polymers/amino acid sequences/polypeptides/peptides, in which the individual monomeric elements that form said polymeric structure are essentially randomly oriented towards the adjacent monomeric elements while still being chemically bound to said adjacent monomeric elements. In particular, a polypeptide, amino acid sequence or amino acid polymer adopting/having/forming/capable of forming a "random coil conformation" substantially lacks a defined secondary and tertiary structure. Accordingly, in the context of the present invention, polypeptides capable of forming a random coil are conformationally disordered polypeptides.

The nature of polypeptides forming a random coil conformation and their methods of experimental identification are known to the person skilled in the art and have been described in the scientific literature (Cantor (1980) Biophysical Chemistry, 2nd ed., W. H. Freeman and Company, New York; Creighton (1993) Proteins - Structures and Molecular Properties, 2nd ed., W. H. Freeman and Company, New York; Smith (1996) Fold Des 1:R95-R106). Such polypeptides are particularly capable of forming a random coil conformation when present in an aqueous environment (e.g., an aqueous solution or an aqueous buffer). The presence of a random coil conformation can be determined using methods known in the art, in particular by means of spectroscopic techniques such as circular dichroism (CD) spectroscopy. CD spectroscopy represents a light absorption spectroscopy method in which the difference in absorbance of right- and left-circularly polarized light by a substance is measured. The secondary structure of a protein can be determined by CD spectroscopy using far-ultraviolet spectra with a wavelength between approximately 190 and 250 nm. At these wavelengths, the different secondary structures commonly found in polypeptides can be analyzed, since α-helix, parallel and anti-parallel β-sheet, and random coil conformations each give rise to a characteristic shape and magnitude of the CD spectrum. Accordingly, by using CD spectroscopy the skilled artisan is readily capable of determining whether a polypeptide (or segment thereof) forms/adopts random coil conformation in aqueous solution or at physiological conditions. Other established biophysical methods include nuclear magnetic resonance (NMR) spectroscopy, absorption spectrometry, infrared and Raman spectroscopy, measurement of the hydrodynamic volume via size exclusion chromatography or field flow fractionation, analytical ultracentrifugation or dynamic/static light scattering as well as measurements of the frictional coefficient or intrinsic viscosity (Cantor (1980) loc. cit.; Creighton (1993) loc. cit.; Smith (1996) loc. cit.).

In addition to the experimental methods detailed above, theoretical methods for the prediction of secondary structures in proteins have been described. One example of such a theoretical method is the Chou-Fasman method (Chou and Fasman (1974), Biochemistry 13, 223-245) which is based on an analysis of the relative frequencies of each amino acid in α-helices, β-sheets, and turns based on known protein structures solved, for example, with X-ray crystallography. However, theoretical prediction of protein secondary structure is known to be unreliable. As exemplified herein below, amino acid sequences expected to adopt an α-helical secondary structure according to the Chou-Fasman method were experimentally found to form a random coil. Accordingly, theoretical methods such as the Chou-Fasman algorithm may only have limited predictive value whether a given polypeptide adopts random coil conformation, as also illustrated in the appended examples and figures. Nonetheless, the above-described theoretical prediction is often the first approach in the evaluation of a putative secondary structure of a given polypeptide/amino acid sequence. A theoretical prediction of a random coil structure also often indicates that it might be worthwhile verifying by the above experimental means whether a given polypeptide/amino acid sequence has indeed a random coil conformation.

As detailed herein above and below, the present invention provides, inter alia, for random coil polypeptides and concatenated polypeptides comprising two or more random coil polypeptides. Preferably, said random coil polypeptides comprise an amino acid sequence comprising alanine, proline, and serine, or comprise an amino acid sequence consisting of alanine and proline. In other words, the herein provided random coil polypeptides preferably comprise alanine, proline and optionally serine. However, the skilled person is aware that an amino acid sequence/a polypeptide may also form random coil conformation when other residues than proline, alanine and, optionally, serine are comprised as a minor constituent in said amino acid sequence/polypeptide. The term "minor constituent" as used herein means that maximally 5 mol% or maximally 10 mol% amino acid residues are different from proline, alanine, or serine in the encoded random coil polypeptides of this invention. This means that maximally 10 of 100 amino acids may be different from proline, alanine and, optionally, serine, preferably maximally 8 mol%, i.e. maximally 8 of 100 amino acids may be different from proline, alanine and, optionally, serine, more preferably maximally 6 mol%, i.e. maximally 6 of 100 amino acids may be different from proline, alanine and, optionally, serine, even more preferably maximally 5 mol%, i.e. maximally 5 of 100 amino acids may be different from proline, alanine and, optionally, serine, particularly preferably maximally 4 mol%, i.e. maximally 4 of 100 amino acids may be different from proline, alanine and, optionally, serine, more particularly preferably maximally 3 mol%, i.e. maximally 3 of 100 amino acids may be different from proline, alanine and, optionally, serine, even more particularly preferably maximally 2 mol%, i.e. maximally 2 of 100 amino acids may be different from proline, alanine and, optionally, serine and most preferably maximally 1 mol%, i.e. maximally 1 of 100 of the amino acids that are comprised in the random coil polypeptide may be different from proline, alanine and, optionally, serine. Said amino acids different from proline, alanine and, optionally, serine may be selected from the group consisting of Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Thr, Trp, Tyr, and Val, including posttranslationally modified amino acids or non-natural amino acids (see, e.g., Budisa (2004) Angew Chem Int Ed Engl 43:6426-6463; Young (2010) J Biol Chem 285:11039-11044; Liu (2010) Annu Rev Biochem 79:413-444; Wagner (1983) Angew Chem Int Ed Engl 22:816-828; Walsh (2010) Drug Discov Today 15:773-780), preferably Cys. In certain cases, PA-polypeptides can also comprise Ser as a minor constituent. For example, in case the encoded random coil polypeptide consists of proline and alanine, serine can also be considered as minor constituent.
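As a worked illustration of the "minor constituent" limits above, the mol% of residues other than the major residues can be computed as a simple number fraction. The sketch below is illustrative only: the function name `minor_constituent_percent` and the Cys-extended test sequence are hypothetical and introduced solely for this example, while PAS#1 is taken from SEQ ID NO: 1 as quoted in this description.

```python
PAS1 = "ASPAAPAPASPAAPAPSAPA"  # SEQ ID NO: 1 (PAS#1) as quoted in this description


def minor_constituent_percent(sequence, major="PAS"):
    """Return the mol% of residues that are not among the major residues."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    minor = sum(1 for aa in sequence if aa not in major)
    return 100.0 * minor / len(sequence)


# PAS#1 consists exclusively of P, A, and S.
print(minor_constituent_percent(PAS1))

# Hypothetical variant with one added N-terminal Cys: 1 of 21 residues is minor.
print(round(minor_constituent_percent("C" + PAS1), 2))

# Treated as a PA polypeptide, serine counts as a minor constituent.
print(minor_constituent_percent(PAS1, major="PA"))
```

Passing `major="PA"` reflects the statement that serine can be considered a minor constituent where the random coil polypeptide otherwise consists of proline and alanine.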

Generally, it is preferred herein that these “minor” amino acids (other than proline, alanine and, optionally, serine) are not present in the encoded random coil polypeptide as described herein or the encoded random coil polypeptide as part/fragment of a fusion protein. In accordance with the invention, the encoded random coil polypeptide / amino acid sequence may, in particular, consist exclusively of proline, alanine and, optionally, serine residues (i.e., no other amino acid residues are present in the encoded random coil polypeptide or in the amino acid sequence).

It is also envisaged that the introduction (e.g., by way of substitution or insertion) into the random coil polypeptide of one or more cysteine (Cys or C) residues may be of advantage for subsequent coupling reactions to, for example, biomolecules or pharmaceuticals. Accordingly, in some embodiments of the present invention wherein the random coil polypeptide may subsequently be used for chemical coupling to, for example, biomolecules or pharmaceuticals it may be envisaged that one or more Cys residues introduced at specific positions can provide an advantage for said chemical coupling (see, e.g., SEQ IDs NO: 3, 4, 11, and 12). Methods to conjugate the thiol side chain of Cys residues with compounds carrying e.g., maleimide or iodo groups are well known in the art. It is evident for the person skilled in the art that such PA and/or PAS polypeptides comprising one or more (additional) cysteine residue(s) may also be encoded by nucleic acid sequences. Non-limiting examples of such nucleic acid sequences are shown in SEQ IDs NO: 148 to 167 and SEQ IDs NO: 228 to 247.

In one embodiment, the herein provided random coil polypeptide comprises an amino acid sequence comprising alanine and proline residues (P and A). Accordingly, in the context of the present invention, polypeptides comprising an amino acid sequence comprising alanine and proline residues may be referred to as “PA polypeptides” or simply as “PA”.

Preferably, PA is a polypeptide comprising an amino acid sequence comprising amino acid residues independently selected from proline and alanine residues. Preferably, PA comprises at least one proline residue and at least one alanine residue.

More preferably, in PA the proportion of the number of proline residues comprised in the PA to the total number of amino acid residues comprised in PA is preferably >10 mol% and <70 mol%, more preferably >20 mol% and <50 mol%, and even more preferably >25 mol% and <40 mol%. Accordingly, it is preferred that 10 mol% to 70 mol% of the total number of amino acid residues in PA are proline residues; more preferably, 20 mol% to 50 mol% of the total number of amino acid residues comprised in PA are proline residues; and even more preferably, 25 mol% to 40 mol% (e.g., 25 mol%, 30 mol%, 35 mol% or 40 mol%) of the total number of amino acid residues comprised in PA are proline residues. Moreover, it is preferred that PA does not contain any consecutive proline residues (i.e., that it does not contain any partial PP sequence or multiples thereof). Further, it is preferred that PA comprises no more than 6 identical consecutive amino acid residues (i.e., that it does not, inter alia, contain any partial AAAAAA sequence or multiples thereof).
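As a purely illustrative and non-limiting sketch, the composition preferences stated above (proline content, absence of consecutive proline residues, limit on identical consecutive residues) may be checked computationally for a candidate sequence in one-letter code. The function names and the default thresholds (the 25 mol% to 40 mol% range, treated as inclusive, and a run limit of 6) are assumptions chosen for illustration; the test sequence is the PA#1 sequence (SEQ ID NO: 6) recited in this description.

```python
import re


def proline_mol_percent(seq: str) -> float:
    """Percentage of residues in a non-empty sequence that are proline (P)."""
    return 100.0 * seq.count("P") / len(seq)


def longest_identical_run(seq: str) -> int:
    """Length of the longest stretch of identical consecutive residues."""
    return max(len(m.group(0)) for m in re.finditer(r"(.)\1*", seq))


def check_pa(seq: str, min_pro: float = 25.0, max_pro: float = 40.0,
             max_run: int = 6) -> dict:
    """Evaluate the illustrative composition criteria for one sequence.

    The thresholds are assumptions taken from the preferred ranges in the
    text; they are parameters so that other ranges can be tested.
    """
    pro = proline_mol_percent(seq)
    run = longest_identical_run(seq)
    return {
        "proline_mol_percent": pro,
        "proline_in_range": min_pro <= pro <= max_pro,
        "no_consecutive_prolines": "PP" not in seq,
        "longest_run": run,
        "run_within_limit": run <= max_run,
    }


PA1 = "AAPAAPAPAAPAAPAPAAPA"  # PA#1, SEQ ID NO: 6
report = check_pa(PA1)

# Hypothetical counter-example: contains "PP" and a run of seven alanines.
bad_report = check_pa("AAPPAAAAAAA")
```

For PA#1 this sketch reports 7 proline residues out of 20, no consecutive proline residues, and a longest run of two identical residues.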

In another embodiment, the herein provided random coil polypeptide comprises an amino acid sequence comprising alanine and proline residues (P and A) with about one, about two, about three, about four or about five Cys residues. Preferably, the random coil polypeptide comprises a single N-terminal Cys residue or a single C-terminal Cys residue. Accordingly, the random coil polypeptide may comprise an N-terminal or a C-terminal cysteine residue.

Example 4 illustratively demonstrates the recombinant production of concatenated polypeptides comprising two or more random coil polypeptides linked by a cleavable linker, wherein each of said random coil polypeptides comprises a C-terminal Cys residue (Figure 8). Further, Example 4 also illustrates the successful release of said random coil polypeptides (each comprising a C-terminal Cys residue) from said concatenated polypeptides (Figure 9).

As detailed herein above, it is furthermore preferred that at least 90 mol%, preferably at least 92 mol%, more preferably at least 93 mol%, more preferably at least 94 mol%, more preferably at least 95 mol%, more preferably at least 96 mol%, more preferably at least 97 mol%, even more preferably at least 98 mol%, yet even more preferably at least 99 mol%, and most preferably 100 mol% of the number of amino acid residues in PA are independently selected from proline and alanine. The remaining amino acid residues in PA are preferably selected from the 20 standard proteinogenic α-amino acids, more preferably from proline, alanine, serine, glycine, cysteine, valine, asparagine, and glutamine, and even more preferably from proline, alanine, glycine, cysteine, and serine. Accordingly, it is preferred that PA is composed of proline, alanine, glycine, cysteine, and serine residues (wherein less than 10 mol%, preferably less than 5 mol%, of the number of amino acid residues in PA are glycine, cysteine, or serine residues), and it is most preferred that PA is composed of proline and alanine residues, i.e. consists solely of proline and alanine residues. It will be understood that, as specified above, PA includes at least one proline residue and at least one alanine residue.

The number of amino acid residues that the herein provided PA polypeptide is composed of is preferably about 10 to about 300 amino acid residues, more preferably about 10 to about 250 amino acid residues, more preferably about 10 to about 200 amino acid residues, even more preferably about 15 to about 150 amino acid residues, more preferably about 10 to about 140 amino acid residues, even more preferably about 10 to about 130 amino acid residues, even more preferably about 15 to about 120 amino acid residues, even more preferably about 15 to about 110 amino acid residues, and yet even more preferably about 20 to about 100 amino acid residues. PA sequences comprising about 20, about 40, about 60, about 80 or about 100 amino acid residues may be even more preferred. This means that the herein provided random coil polypeptide comprises at least about 10 amino acid residues. Most preferably, the herein provided random coil polypeptide comprises about 20, about 30, about 40, about 50, about 60, about 80, or about 100 amino acid residues.

Examples of preferred PA amino acid sequences include, in particular, such amino acid sequences that comprise (or, preferably, that consist of): (i) any of the sequences of “PA#1” (AAPAAPAPAAPAAPAPAAPA; SEQ ID NO: 6), “PA#2” (AAPAAAPAPAAPAAPAPAAP; SEQ ID NO: 7), “PA#3” (SEQ ID NO: 8), “PA#4” (SEQ ID NO: 9), “PA#5” (SEQ ID NO: 10) or (ii) a fragment of any of these sequences, or (iii) a combination of two or more of these sequences (which may be the same or different, i.e., any combination of two or more (e.g., two, three, four, five, six, seven, eight, nine, ten, or more) of these sequences (inter alia, PA#1, PA#2, PA#3, PA#4, PA#5); a corresponding example is a dimer of PA#1 (“PA#1-PA#1”), i.e. AAPAAPAPAAPAAPAPAAPAAAPAAPAPAAPAAPAPAAPA; further non-limiting examples include PA#1-PA#2 (i.e. AAPAAPAPAAPAAPAPAAPAAAPAAAPAPAAPAAPAPAAP), PA#1-PA#5, PA#2-PA#1, PA#2-PA#2, PA#2-PA#5, PA#5-PA#1, PA#5-PA#2, PA#5-PA#5, PA#1-PA#1-PA#1, PA#1-PA#1-PA#2, PA#1-PA#1-PA#5, PA#1-PA#2-PA#1, PA#1-PA#2-PA#2, PA#1-PA#2-PA#5, PA#1-PA#5-PA#1, PA#1-PA#5-PA#2, PA#1-PA#5-PA#5, PA#2-PA#1-PA#1, PA#2-PA#1-PA#2, PA#2-PA#1-PA#5, PA#2-PA#2-PA#1, PA#2-PA#2-PA#2, PA#2-PA#2-PA#5, PA#2-PA#5-PA#1, PA#2-PA#5-PA#2, PA#2-PA#5-PA#5, PA#5-PA#1-PA#1, PA#5-PA#1-PA#2, PA#5-PA#1-PA#5, PA#5-PA#2-PA#1, PA#5-PA#2-PA#2, PA#5-PA#2-PA#5, PA#5-PA#5-PA#1, PA#5-PA#5-PA#2, or PA#5-PA#5-PA#5).
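The combinations listed above are formed by directly joining the named building blocks. The following minimal sketch, offered as a non-limiting illustration, reproduces two of the recited dimers and enumerates the names of all ordered three-block combinations of PA#1, PA#2, and PA#5. The helper names `combine` and `combination_names` are hypothetical, and only the two sequences spelled out in this description (PA#1 and PA#2) are included, since the remaining sequences are given only by SEQ ID NO.

```python
from itertools import product

# Only sequences written out in the text are listed here.
PA_BLOCKS = {
    "PA#1": "AAPAAPAPAAPAAPAPAAPA",  # SEQ ID NO: 6
    "PA#2": "AAPAAAPAPAAPAAPAPAAP",  # SEQ ID NO: 7
}


def combine(names, blocks=PA_BLOCKS):
    """Join the named building blocks in order, without any spacer."""
    return "".join(blocks[name] for name in names)


def combination_names(block_names, n):
    """Names of all ordered combinations (repeats allowed) of n blocks."""
    return ["-".join(combo) for combo in product(block_names, repeat=n)]


dimer_11 = combine(["PA#1", "PA#1"])
dimer_12 = combine(["PA#1", "PA#2"])
trimer_names = combination_names(["PA#1", "PA#2", "PA#5"], 3)
```

With three building blocks, the enumeration yields 3 x 3 x 3 = 27 three-block names, beginning with PA#1-PA#1-PA#1 and ending with PA#5-PA#5-PA#5.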

In another embodiment, the random coil polypeptide comprises an amino acid sequence comprising alanine, proline, and serine residues (P, A, and S). Accordingly, in the context of the present invention, polypeptides comprising an amino acid sequence comprising alanine, proline, and serine residues may be referred to as “PAS polypeptides” or simply as “PAS”.

Preferably, PAS is a polypeptide comprising an amino acid sequence comprising amino acid residues independently selected from alanine, proline, and serine residues. Preferably, PAS comprises at least one proline residue and at least one alanine residue and at least one serine residue.

More preferably, in PAS the encoded amino acid sequence comprises more than about 4 mol%, preferably more than about 6 mol%, more preferably more than about 10 mol%, more preferably more than about 15 mol%, more preferably more than about 20 mol%, more preferably more than about 22 mol%, 23 mol% or 24 mol%, more preferably more than about 26 mol%, 29 mol%, or 30 mol%, more preferably more than about 31 mol%, 32 mol%, 33 mol%, 34 mol% or 35 mol% and most preferably more than about 25 mol% proline residues. The encoded amino acid sequence preferably comprises less than about 40 mol%, more preferably less than 38 mol%, 35 mol%, 30 mol%, 26 mol% proline residues, wherein the lower values are preferred. Moreover, it is preferred that PAS does not contain any consecutive proline residues (i.e., that it does not contain any partial PP sequence or multiples thereof). Further, it is preferred that PAS comprises no more than 6 identical consecutive amino acid residues (i.e., that it does not, inter alia, contain any partial AAAAAA sequence or multiples thereof).

In another embodiment, the herein provided random coil polypeptide comprises an amino acid sequence comprising alanine, proline, and serine residues (P, A, and S) with about one, about two, about three, about four or about five Cys residues. Preferably, the random coil polypeptide comprises a single N-terminal Cys residue or a single C-terminal Cys residue. Accordingly, the random coil polypeptide may comprise an N-terminal or a C-terminal cysteine residue.

Example 4 illustratively demonstrates the recombinant production of concatenated polypeptides comprising two or more random coil polypeptides linked by a cleavable linker, wherein each of said random coil polypeptides comprises a C-terminal Cys residue (Figure 8). Further, Example 4 also illustrates the successful release of said random coil polypeptides (each comprising a C-terminal Cys residue) from said concatenated polypeptides (Figure 9).

As detailed herein above, it is furthermore preferred that at least 90 mol%, preferably at least 92 mol%, more preferably at least 93 mol%, more preferably at least 94 mol%, more preferably at least 95 mol%, more preferably at least 96 mol%, more preferably at least 97 mol%, even more preferably at least 98 mol%, yet even more preferably at least 99 mol%, and most preferably 100 mol% of the number of amino acid residues in PAS are independently selected from proline, alanine, and serine. The remaining amino acid residues in PAS are preferably selected from the 20 standard proteinogenic α-amino acids, more preferably from proline, alanine, serine, glycine, cysteine, valine, asparagine, and glutamine, and even more preferably from proline, alanine, glycine, cysteine, and serine. Accordingly, it is preferred that PAS is composed of proline, alanine, glycine, cysteine, and serine residues (wherein less than 10 mol%, preferably less than 5 mol%, of the number of amino acid residues in PAS are glycine, cysteine, or serine residues), and it is most preferred that PAS is composed of proline, alanine, and serine residues, i.e. consists solely of proline, alanine, and serine residues. It will be understood that, as specified above, PAS includes at least one proline residue, at least one alanine residue, and at least one serine residue.
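The mol% criterion described above can be illustrated with a short calculation. The following sketch, given under stated assumptions and without limitation, computes the share of residues drawn from proline, alanine, and serine for the PAS#1 sequence (SEQ ID NO: 1) recited in this description, and for a hypothetical variant that carries one additional C-terminal Cys residue. The function name `mol_percent_from` and the variant `PAS1_CYS` are introduced here for illustration only.

```python
def mol_percent_from(seq: str, allowed: str) -> float:
    """Percentage of residues in a non-empty sequence found in `allowed`."""
    allowed_set = set(allowed)
    hits = sum(1 for residue in seq if residue in allowed_set)
    return 100.0 * hits / len(seq)


PAS1 = "ASPAAPAPASPAAPAPSAPA"  # PAS#1, SEQ ID NO: 1
PAS1_CYS = PAS1 + "C"          # hypothetical variant with a C-terminal Cys

main_share = mol_percent_from(PAS1, "PAS")          # all 20 residues are P, A or S
with_cys_share = mol_percent_from(PAS1_CYS, "PAS")  # 20 of 21 residues

meets_90 = with_cys_share >= 90.0
```

In this sketch, the unmodified sequence consists entirely of proline, alanine, and serine, while the hypothetical 21-residue variant has 20 of 21 residues (about 95 mol%) from that group and therefore still meets the 90 mol% threshold.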

The number of amino acid residues that the herein provided PAS polypeptide is composed of is preferably about 10 to about 300 amino acid residues, more preferably about 10 to about 250 amino acid residues, more preferably about 10 to about 200 amino acid residues, even more preferably about 15 to about 150 amino acid residues, more preferably about 10 to about 140 amino acid residues, even more preferably about 10 to about 130 amino acid residues, even more preferably about 15 to about 120 amino acid residues, even more preferably about 15 to about 110 amino acid residues, and yet even more preferably about 20 to about 100 amino acid residues. PAS sequences comprising about 20, about 40, about 60, about 80 or about 100 amino acid residues may be even more preferred. This means that the herein provided random coil polypeptide comprises at least about 10 amino acid residues. Most preferably, the herein provided random coil polypeptide comprises about 20, about 30, about 40, about 50, about 60, about 80, or about 100 amino acid residues.

Examples of preferred PAS amino acid sequences include, in particular, such amino acid sequences that comprise (or, preferably, that consist of): (i) any of the sequences of “PAS#1” (ASPAAPAPASPAAPAPSAPA; SEQ ID NO: 1), “PAS#2” (AAPASPAPAAPSAPAPAAPS; SEQ ID NO: 2), “PAS#7” (SEQ ID NO: 5), or (ii) a fragment of any of these sequences, or (iii) a combination of two or more of these sequences (which may be the same or different, i.e., any combination of two or more (e.g., two, three, four, five, six, seven, eight, nine, ten, or more) of these sequences (inter alia, PAS#1, PAS#2, PAS#7); a corresponding example is a dimer of PAS#1 (“PAS#1-PAS#1”), i.e. ASPAAPAPASPAAPAPSAPAASPAAPAPASPAAPAPSAPA; further non-limiting examples include PAS#1-PAS#2 (i.e. ASPAAPAPASPAAPAPSAPAAAPASPAPAAPSAPAPAAPS), PAS#1-PAS#7, PAS#2-PAS#1, PAS#2-PAS#2, PAS#2-PAS#7, PAS#7-PAS#1, PAS#7-PAS#2, PAS#7-PAS#7, PAS#1-PAS#1-PAS#1, PAS#1-PAS#1-PAS#2, PAS#1-PAS#1-PAS#7, PAS#1-PAS#2-PAS#1, PAS#1-PAS#2-PAS#2, PAS#1-PAS#2-PAS#7, PAS#1-PAS#7-PAS#1, PAS#1-PAS#7-PAS#2, PAS#1-PAS#7-PAS#7, PAS#2-PAS#1-PAS#1, PAS#2-PAS#1-PAS#2, PAS#2-PAS#1-PAS#7, PAS#2-PAS#2-PAS#1, PAS#2-PAS#2-PAS#2, PAS#2-PAS#2-PAS#7, PAS#2-PAS#7-PAS#1, PAS#2-PAS#7-PAS#2, PAS#2-PAS#7-PAS#7, PAS#7-PAS#1-PAS#1, PAS#7-PAS#1-PAS#2, PAS#7-PAS#1-PAS#7, PAS#7-PAS#2-PAS#1, PAS#7-PAS#2-PAS#2, PAS#7-PAS#2-PAS#7, PAS#7-PAS#7-PAS#1, PAS#7-PAS#7-PAS#2, or PAS#7-PAS#7-PAS#7).

Further examples of the herein above (and below) detailed PA and/or PAS amino acid sequences (or PA and/or PAS random coil polypeptides comprising such amino acid sequences) are, inter alia, illustrated in WO 2008/155134, WO 2011/144756, WO 2017/109087, and WO 2018/234455, which are hereby incorporated by reference in their entirety. Such examples are further illustratively and non-limitingly provided as SEQ IDs NO: 1 to 12 and/or SEQ IDs NO: 30 to 35. Accordingly, the present invention further provides for a random coil polypeptide selected from the group consisting of the following (I) to (II): (I) a random coil polypeptide comprising an amino acid sequence selected from the group consisting of SEQ IDs NO: 1 to 12; (II) a random coil polypeptide comprising an amino acid sequence having at least about 60%, at least 70%, at least about 80%, at least about 90%, at least about 92%, at least about 94%, at least about 96%, at least about 98%, or at least about 99% identity to the amino acid sequence as defined in (I). The present invention further provides for a random coil polypeptide selected from the group consisting of the following (I) to (II): (I) a random coil polypeptide consisting of an amino acid sequence selected from the group consisting of SEQ IDs NO: 1 to 12 or a random coil polypeptide consisting of an amino acid sequence consisting of multiples and/or combinations of (an) amino acid sequence(s) selected from the group consisting of SEQ IDs NO: 1 to 12; (II) a random coil polypeptide comprising an amino acid sequence having at least about 60%, at least 70%, at least about 80%, at least about 90%, at least about 92%, at least about 94%, at least about 96%, at least about 98%, or at least about 99% identity to the amino acid sequence as defined in (I).

In the context of the present invention, the herein provided random coil polypeptides may additionally comprise an N-terminal blocking residue. Said N-terminal blocking residue may be an amino acid residue, preferably an amino acid residue that is covalently bound to the amino terminus (N-terminus) of said random coil polypeptide. This N-terminal blocking residue may be a single amino acid residue but may also comprise two or more amino acid residues; however, a single amino acid residue is preferred herein. Most preferably, said N-terminal blocking residue consists of a glutamine or glutamate residue. As will be detailed herein below, said N-terminal blocking residue may be cyclized to form an N-terminal protection group on the (purified) random coil polypeptide (after the cleavage of the concatenated polypeptide comprising said random coil polypeptide). Accordingly, the present invention provides for means and methods for the N-terminal protection of the herein provided random coil polypeptides, wherein N-terminal protection of said (purified) random coil polypeptide may comprise cyclization of said N-terminal blocking residue. The cyclization of said N-terminal blocking residue may thereby produce/generate/form an N-terminal protecting group, as is further detailed herein below. It is evident to the person skilled in the art that glutamine or glutamate residues may also autocatalytically cyclize to form a pyroglutamate residue, which may act as N-terminal protecting group, as is further detailed herein below. Accordingly, the N-terminal blocking residue may consist of a glutamine, glutamate, or pyroglutamate residue.

As mentioned above, the herein provided concatenated polypeptide comprises two or more random coil polypeptides linked by a cleavable linker. Accordingly, said concatenated polypeptide may comprise at least 2 random coil polypeptides (linked by a cleavable linker). In the context of the present invention, it is preferred that said concatenated polypeptide comprises between about 2 and about 40 random coil polypeptides. This means that the herein provided concatenated polypeptide may comprise about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, or about 20, or about 24, or about 30, or about 36, or about 40 random coil polypeptides. Preferably said concatenated polypeptide comprises at least about 2, at least about 5, or at least about 10, or at least about 20 random coil polypeptides. Exemplary and non-limiting concatenated polypeptides comprising 2, 5, or 10 random coil polypeptides can be found in SEQ IDs NO: 17, 18, 23, 24, SEQ IDs NO: 15, 16, 21, 22, and SEQ IDs NO: 13, 14, 19, 20, respectively, and are further illustrated in the herein provided examples.

In the context of the present invention, said concatenated polypeptide may comprise two or more different random coil polypeptides (linked by a cleavable linker). This means that the two or more random coil polypeptides (linked by a cleavable linker and) comprised in the herein provided concatenated polypeptides may comprise different amino acid sequences and/or amino acid sequences of different lengths. Accordingly, the present invention also provides for concatenated polypeptides comprising two or more random coil polypeptides linked by a cleavable linker, wherein said two or more random coil polypeptides each comprise different amino acid sequences and/or amino acid sequences of different length. It is, however, most preferred herein that said two or more random coil polypeptides (linked by a cleavable linker and comprised in the herein provided concatenated polypeptides) each comprise the same amino acid sequence and/or amino acid sequences of the same length. Accordingly, it is most preferred in the context of the present invention that the herein provided concatenated polypeptide comprises only one type of random coil polypeptide. This means that it is most preferred herein that the herein provided concatenated polypeptide comprises random coil polypeptides only consisting of the same/an identical amino acid sequence and/or of an amino acid sequence of identical/the same length. Accordingly, the present invention also provides for concatenated polypeptides comprising two or more random coil polypeptides linked by a cleavable linker, wherein said two or more random coil polypeptides each comprise the same amino acid sequence.

In one aspect, the present invention further relates to a carrier polypeptide. Accordingly, the herein provided concatenated polypeptide may further comprise a carrier polypeptide. Carrier polypeptides are known to the person skilled in the art (sometimes called fusion partners or protein tags and the like) and have been described in the scientific literature, e.g. in Ki (2020) Appl. Microbiol. Biotechnol. 104:2411-2425 or Costa (2014) Front. Microbiol. 5:63. In the context of the present invention said carrier polypeptide is preferably linked to the herein provided concatenated polypeptide via a cleavable linker (as further detailed herein below). Said carrier polypeptide may be present on the N-terminus or on the C-terminus of said concatenated polypeptide, preferably on the N-terminus. Accordingly, said carrier polypeptide may be an N-terminal carrier polypeptide or a C-terminal carrier polypeptide, preferably an N-terminal carrier polypeptide. The skilled person is aware of suitable carrier polypeptides; non-limiting examples comprise thioredoxin A (TrxA) from Escherichia coli (E. coli), maltose-binding protein (MBP) from E. coli, NusA from E. coli, SUMO from yeast, galactose-binding protein (GBP) from E. coli, or glutathione-S-transferase, preferably TrxA. Accordingly, said carrier polypeptide may be selected from thioredoxin A (TrxA) from Escherichia coli (E. coli), maltose-binding protein (MBP) from E. coli, NusA from E. coli, SUMO from yeast, or galactose-binding protein (GBP) from E. coli, preferably TrxA and, most preferably, TrxA containing an additional Cys residue directly downstream of the start Met residue as well as an exchange of Met at position 39 by Gln (M39Q).

Additionally or alternatively, said concatenated polypeptide may (further) comprise a signal peptide/signal sequence/targeting sequence. Such a signal peptide/signal sequence/targeting sequence may be present on the N-terminus or on the C-terminus of said concatenated polypeptide, preferably on the N-terminus. The person skilled in the art is aware that a signal peptide/signal sequence/targeting sequence may result in targeting of (expressed/translated) polypeptides to specific cellular/subcellular/extracellular compartments or to the culture medium and is further able to select adequate signal peptides/signal sequences/targeting sequences, whose suitability may depend on the choice of the host cell. Non-limiting examples of signal peptides/signal sequences/targeting sequences are CspA or CspB signal peptide from C. glutamicum, OmpA, PhoA, DsbA, PelB or CGTase signal peptides or derivatives thereof from E. coli or signal peptides of Epr, YncM, UipA, XynA, or GlpQ from Bacillus subtilis, signal peptides of phosphate binding protein (pbp), outer membrane porin E (OprE), azurin, iron (III) binding protein, lipoprotein B from Pseudomonas fluorescens, or signal peptides of albumin, azurocidin, interleukin-2, insulin, chymotrypsinogen, trypsinogen-2 or derivatives thereof from Homo sapiens. In the context of the present invention the signal peptide/signal sequence/targeting sequence is preferably linked to the herein provided concatenated polypeptide via a cleavable linker (as further detailed herein below).

In another aspect, the herein provided concatenated polypeptide further comprises a tag. Said tag is preferably connected to said concatenated polypeptide or to said carrier polypeptide or to said signal peptide/signal sequence/targeting sequence via a cleavable linker, more preferably said tag is connected to said concatenated polypeptide via a cleavable linker. Preferably said tag is an affinity tag, more preferably said affinity tag is selected from polyhistidine-tag, Strep-tag, Strep-tag II, FLAG tag, HA tag, Halo tag, Arg-tag, preferably a hexahistidine tag. Said tag may be an N-terminal tag or a C-terminal tag, preferably a C-terminal tag. The skilled person is aware that an N-terminal (affinity) tag is attached/fused/coupled to the amino-terminus of a polypeptide (inter alia, of said concatenated polypeptide), whereas a C-terminal (affinity) tag is attached/fused/coupled to the carboxy-terminus of a polypeptide (inter alia, of said concatenated polypeptide). The skilled person is, furthermore, aware of alternative (affinity) tags and readily knows how to apply them in order to purify polypeptides.

It is further envisaged herein that said affinity tag and the herein above detailed carrier polypeptide may be identical/the same. The skilled person is aware of affinity tags that may also function as carrier polypeptide (and vice versa) and may readily choose adequate affinity tags and/or carrier polypeptides.

In the following, the herein provided cleavable linker will be further detailed. In this context, the term “cleavable linker” may refer to any herein and above detailed cleavable linker. This means that the herein detailed cleavable linker may be a cleavable linker linking the two or more random coil polypeptides, may be a cleavable linker linking the carrier polypeptide to, e.g. but not limiting, one of the two or more random coil polypeptides, may be a cleavable linker linking the tag to, e.g. but not limiting, one of the two or more random coil polypeptides, may be a cleavable linker linking the signal peptide to, e.g. but not limiting, one of the two or more random coil polypeptides.

In the context of the present invention, the cleavable linker may comprise one or more amino acid residues. Further, said cleavable linker may preferably comprise less than about 10 amino acid residues, preferably one amino acid residue. Accordingly, said cleavable linker may comprise or consist of about 1, about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10 amino acid residues, most preferably said cleavable linker comprises about 1 amino acid residue. More preferably said cleavable linker does not solely comprise proline, alanine, and/or serine. Accordingly, if such a cleavable linker comprises or consists of more than one amino acid residue, it may inter alia comprise proline, alanine, and/or serine; however, if the cleavable linker comprises solely or consists solely of a single amino acid residue, said amino acid residue may not be selected from proline, alanine, and/or serine. In one aspect, said cleavable linker comprises or consists of one or more basic amino acid residues, preferably one basic amino acid residue. Said one or more basic amino acid residues may be selected from one or more lysine and/or one or more arginine residues, preferably one or more lysine residues, more preferably one lysine residue or one arginine residue. In another aspect said cleavable linker comprises or consists of one or more hydrophobic amino acid residues, preferably one hydrophobic amino acid residue. Said one or more hydrophobic amino acid residues may be selected from one or more tyrosine residues, one or more phenylalanine residues, one or more tryptophan residues, and/or one or more leucine residues, preferably one tyrosine residue, one phenylalanine residue, one tryptophan residue, or one leucine residue. A non-limiting list of exemplary cleavable linkers is provided in Tables 1 and 2. The cleavable linker and means and methods for its cleavage are further detailed herein below.

In the context of the present invention, it is preferred that the herein provided cleavable linker linking the two or more random coil polypeptides and thereby forming the herein provided concatenated polypeptide does not form part of/is not comprised in any of the two or more random coil polypeptides. This means, if the cleavable linker, for example, consists of one or more lysine residues or one or more arginine residues, it is preferred that neither lysine nor arginine forms part of the random coil polypeptide. In other words, if the cleavable linker, for example, consists of one or more lysine residues or one or more arginine residues, it is preferred that no lysine residue and no arginine residue is comprised in the random coil polypeptide. Similarly, if the cleavable linker is, for example, selected from one or more tyrosine residues, one or more phenylalanine residues, one or more tryptophan residues, and/or one or more leucine residues, it is preferred herein that none of these hydrophobic amino acids are comprised in the two or more random coil polypeptides.
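The preference described above, namely that the residue type(s) used as cleavable linker do not occur within the random coil polypeptides themselves, can be expressed as a simple set comparison. The following is a minimal, non-limiting sketch in which the function name `linker_is_exclusive` and the short test sequence containing a lysine are hypothetical; sequences are given in one-letter code.

```python
from typing import Iterable


def linker_is_exclusive(linker: str, random_coils: Iterable[str]) -> bool:
    """True if no residue type of the linker occurs in any random coil."""
    linker_residues = set(linker)
    return all(linker_residues.isdisjoint(rc) for rc in random_coils)


PA1 = "AAPAAPAPAAPAAPAPAAPA"  # PA#1, SEQ ID NO: 6

lys_ok = linker_is_exclusive("K", [PA1, PA1])     # Lys is absent from PA#1
ala_ok = linker_is_exclusive("A", [PA1])          # Ala is part of PA#1
internal_lys = linker_is_exclusive("K", ["AAPAKPA"])  # hypothetical sequence
```

A linker that fails this check would also be present inside the random coil polypeptide, so that cleavage at the linker residue could no longer be expected to occur only at the junctions.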

The skilled person is aware that a concatenated polypeptide comprising two or more random coil polypeptides linked by a cleavable linker may comprise more than two random coil polypeptides and accordingly must comprise two or more cleavable linkers. For example, a concatenated polypeptide comprising 3 random coil polypeptides, in the context of the present invention comprises at least 2 cleavable linkers. For example, a concatenated polypeptide comprising 4 random coil polypeptides, in the context of the present invention comprises at least 3 cleavable linkers. For example, a concatenated polypeptide comprising 5 random coil polypeptides, in the context of the present invention comprises at least 4 cleavable linkers. Furthermore, a concatenated polypeptide in the context of the present invention may comprise an additional cleavable linker at the junction to the carrier polypeptide.
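The relationship between the number of random coil polypeptides and the minimum number of cleavable linkers can be sketched as follows. This is an illustrative, non-limiting example; the function names, the choice of a single lysine residue ("K") as linker, and the use of PA#1 (SEQ ID NO: 6) as the repeated unit are assumptions made for the purpose of the example.

```python
def concatenate(random_coils, linker="K"):
    """Join random coil sequences with one linker between each adjacent pair."""
    return linker.join(random_coils)


def minimum_linker_count(n_random_coils, with_carrier=False):
    """Minimum number of cleavable linkers for n random coil polypeptides.

    One linker sits between each adjacent pair (n - 1 in total); one more
    is counted if a carrier polypeptide is attached via a cleavable linker.
    """
    if n_random_coils < 2:
        raise ValueError("a concatenated polypeptide needs at least 2 units")
    return (n_random_coils - 1) + (1 if with_carrier else 0)


PA1 = "AAPAAPAPAAPAAPAPAAPA"  # PA#1, SEQ ID NO: 6
trimer = concatenate([PA1, PA1, PA1], linker="K")
linkers_in_trimer = trimer.count("K")
```

Because PA#1 itself contains no lysine, counting "K" in the assembled sequence directly gives the number of linkers, here 2 for 3 random coil polypeptides.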

It is thus evident that a herein provided concatenated polypeptide may comprise more than one cleavable linker. In the context of the present invention, said cleavable linkers may be identical or may not be identical. In other words, the entirety of all cleavable linkers in one concatenated polypeptide may be the same or may be different. In the context of the present invention the entirety of all cleavable linkers in one concatenated polypeptide may be selected from, inter alia, one or more lysine residues, one or more arginine residues, one or more tyrosine residues, one or more phenylalanine residues, one or more tryptophan residues, and/or one or more leucine residues. Exemplary and non-limiting combinations of cleavable linkers are comprised in SEQ IDs NO: 13 to 24.

In one preferred aspect, the present invention provides for a method for the recombinant production of a concatenated polypeptide comprising two or more random coil polypeptides linked by a cleavable linker, wherein said cleavable linker is selected from one or more lysine residues, one or more arginine residues, one or more tyrosine residues, one or more phenylalanine residues, one or more tryptophan residues, and/or one or more leucine residues, and wherein the method comprises the steps of a) the recombinant production of a concatenated polypeptide comprising two or more random coil polypeptides, wherein said random coil polypeptides are linked by a cleavable linker; and b) the purification of said concatenated polypeptide.

In one even more preferred aspect, the present invention provides for a method for the recombinant production of a concatenated polypeptide comprising two or more random coil polypeptides linked by a cleavable linker, wherein said cleavable linker is selected from one or more lysine residues and/or one or more arginine residues, preferably one lysine residue and/or one arginine residue, and wherein the method comprises the steps of a) the recombinant production of a concatenated polypeptide comprising two or more random coil polypeptides, wherein said random coil polypeptides are linked by a cleavable linker; and b) the purification of said concatenated polypeptide.

In another preferred aspect, the present invention provides for a method for the production of random coil polypeptides comprising the steps of a) the recombinant production of a concatenated polypeptide comprising two or more random coil polypeptides, wherein said cleavable linker is selected from one or more lysine residues, one or more arginine residues, one or more tyrosine residues, one or more phenylalanine residues, one or more tryptophan residues, and/or one or more leucine residues, and wherein said random coil polypeptides are linked by a cleavable linker; b) the purification of said concatenated polypeptide; c) the cleavage of said concatenated polypeptide; and d) the purification of said two or more random coil polypeptides.
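The outcome of cleavage step c) can be modelled in silico as splitting the concatenated sequence at the linker residues. The sketch below is a deliberately simplified, non-limiting illustration: it assumes that the linker consists of lysine and/or arginine residues, that these residues do not occur within the random coil polypeptides, and that the linker residues are removed entirely from the released fragments. Whether a linker residue remains attached to a fragment after cleavage depends on the cleavage means actually used and is not modelled here. The function name `release_random_coils` is hypothetical.

```python
import re


def release_random_coils(concatenated: str, linker_residues: str = "KR"):
    """Split a concatenated sequence at runs of linker residues.

    Simplifying assumption: linker residues are discarded, so the returned
    fragments contain only the random coil sequences.
    """
    pattern = f"[{re.escape(linker_residues)}]+"
    return [fragment for fragment in re.split(pattern, concatenated) if fragment]


PA1 = "AAPAAPAPAAPAAPAPAAPA"  # PA#1, SEQ ID NO: 6

# Hypothetical concatenated polypeptide: three PA#1 units joined by one Lys each.
concatenated = "K".join([PA1, PA1, PA1])
released = release_random_coils(concatenated)
```

In this example the three released fragments are identical to PA#1, which mirrors the preferred case in which the concatenated polypeptide contains only one type of random coil polypeptide.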

In a more preferred aspect, the present invention provides for a method for the production of random coil polypeptides comprising the steps of a) the recombinant production of a concatenated polypeptide comprising two or more random coil polypeptides, wherein said cleavable linker is selected from one or more lysine residues and/or one or more arginine residues, preferably one lysine residue and/or one arginine residue, and wherein said random coil polypeptides are linked by a cleavable linker; b) the purification of said concatenated polypeptide; c) the cleavage of said concatenated polypeptide; and d) the purification of said two or more random coil polypeptides.

Exemplary and non-limiting examples of the herein provided concatenated polypeptides are illustrated in the appended examples and further schematically illustrated in the appended Figure 1. Corresponding, yet illustrative amino acid sequences are provided in the appended sequence listing.

Accordingly, the present invention, inter alia, provides for concatenated polypeptides, wherein said concatenated polypeptide comprises an amino acid sequence selected from the group consisting of the following (I) to (II): (I) a concatenated polypeptide comprising an amino acid sequence selected from the group consisting of SEQ IDs NO: 13 to 24; (II) a concatenated polypeptide comprising an amino acid sequence having at least about 60%, at least 70%, at least about 80%, at least about 90%, at least about 92%, at least about 94%, at least about 96%, at least about 98%, or at least about 99% identity to the amino acid sequence as defined in (I).

As mentioned herein above, the recombinant production of the herein provided concatenated polypeptide comprising two or more random coil polypeptides, wherein said random coil polypeptides are linked by a cleavable linker, may further comprise the expression of said concatenated polypeptide via/from a nucleic acid molecule or via/from a nucleic acid vector encoding said concatenated polypeptide. Accordingly, the present invention also relates to nucleic acid molecules encoding said concatenated polypeptide and to nucleic acid vectors encoding said concatenated polypeptide. Further, the present invention also relates to methods for expressing said concatenated polypeptide from said nucleic acid molecules or nucleic acid vectors.

Said nucleic acid molecule may be selected from the group consisting of the following (I) to (IV): (I) a nucleic acid molecule comprising a nucleic acid sequence selected from the group consisting of SEQ IDs NO: 36 to 47; (II) a nucleic acid molecule hybridizing with a nucleic acid molecule comprising a nucleic acid sequence complementary to a nucleic acid sequence selected from the group consisting of SEQ ID NO: 36 to 47 under stringent conditions; (III) a nucleic acid molecule comprising a nucleic acid sequence having at least about 60%, at least 70%, at least about 80%, at least about 90%, at least about 92%, at least about 94%, at least about 96%, at least about 98%, or at least about 99% identity to the nucleic acid sequence as defined in any one of (I) and (II); and (IV) a nucleic acid molecule comprising a nucleic acid sequence encoding a concatenated polypeptide as defined herein above and/or a nucleic acid molecule comprising a nucleic acid sequence encoding a concatenated polypeptide comprising an amino acid sequence selected from the group of SEQ IDs NO: 13 to 24.

Many suitable nucleic acid vectors are known to those skilled in molecular biology. The choice of a suitable nucleic acid vector depends on the function desired and includes, but is not limited to, plasmids and other nucleic acid vectors used conventionally in genetic engineering. Preferably, the nucleic acid vector is a plasmid or a vector that allows genomic integration of the gene encoding the concatenated polypeptide. Preferably, said nucleic acid vector and/or plasmid comprises means for the constitutive or inducible expression of the concatenated polypeptide encoded by said nucleic acid vector and/or plasmid, preferably a constitutive promoter or an inducible promoter. The person skilled in the art is able to identify and select adequate constitutive and/or inducible promoters. A non-limiting example of a constitutive promoter is the cspB promoter for recombinant expression of polypeptides in, inter alia, Corynebacterium spp. Preferably, the promoter comprised in the nucleic acid vector and/or plasmid of the present invention is an inducible promoter. Suitable promoters for the inducible expression of, inter alia, concatenated polypeptides in yeasts (including but not limited to Pichia spp.) include, inter alia, arabinose inducible promoters, anhydrotetracycline inducible promoters, and methanol inducible promoters. Suitable promoters for the inducible expression of, inter alia, concatenated polypeptides in bacteria (including but not limited to Escherichia coli or Corynebacterium glutamicum) include, inter alia, an isopropyl β-D-1-thiogalactopyranoside (IPTG) inducible promoter (such as lac(p/o)), the (anhydro)tetracycline inducible promoter tetA(p/o), the arabinose inducible promoter PBAD (araBp), the gluconate inducible promoter Pgit1 and the maltose-inducible promoter PmalE1. In the context of the present invention, an isopropyl β-D-1-thiogalactopyranoside (IPTG) inducible promoter may be preferred.

Said nucleic acid vector may comprise a nucleic acid sequence selected from the group consisting of the following (I) to (IV): (I) a nucleic acid vector comprising a nucleic acid sequence selected from the group consisting of SEQ IDs NO: 48 to 59; (II) a nucleic acid vector hybridizing with a nucleic acid molecule comprising a nucleic acid sequence complementary to a nucleic acid sequence selected from the group consisting of SEQ IDs NO: 48 to 59 under stringent conditions; (III) a nucleic acid vector comprising a nucleic acid sequence having at least about 60%, at least 70%, at least about 80%, at least about 90%, at least about 92%, at least about 94%, at least about 96%, at least about 98%, or at least about 99% identity to the nucleic acid sequence as defined in any one of (I) and (II); and (IV) a nucleic acid molecule comprising a nucleic acid sequence encoding a concatenated polypeptide as defined herein above and/or a nucleic acid molecule comprising a nucleic acid sequence encoding a concatenated polypeptide comprising an amino acid sequence selected from the group of SEQ IDs NO: 13 to 24.

As already mentioned herein above, the recombinant production of the herein provided concatenated polypeptide comprising two or more random coil polypeptides, wherein said random coil polypeptides are linked by a cleavable linker, may further comprise the culturing of a host or the culturing of a host cell. Said host or host cell may be prokaryotic or eukaryotic. Exemplary eukaryotic hosts or host cells may be animal cell lines (such as insect cell lines or, e.g., CHO or COS), human cell lines (e.g., HEK) or yeasts, preferably Pichia spp. In a preferred aspect, said host or host cell is prokaryotic, more preferably E. coli, P. fluorescens, C. glutamicum or a Bacillus strain, even more preferably E. coli. In another aspect, the purification of the recombinantly produced concatenated polypeptide comprising two or more random coil polypeptides, wherein said random coil polypeptides are linked by a cleavable linker, may further comprise performing purification via affinity chromatography of said concatenated polypeptide, preferably purification via immobilized metal ion affinity chromatography (IMAC).

As already mentioned herein above, the present invention also provides for concatenated polypeptides that are obtained by/obtainable by the herein detailed methods and/or provides for concatenated polypeptides comprising two or more random coil polypeptides linked by a cleavable linker.

In one preferred aspect, the present invention provides for a concatenated polypeptide comprising two or more random coil polypeptides, wherein said random coil polypeptides are linked by a cleavable linker, and wherein said cleavable linker is selected from one or more lysine residues, one or more arginine residues, one or more tyrosine residues, one or more phenylalanine residues, one or more tryptophan residues, and/or one or more leucine residues.

In one preferred aspect, the present invention provides for a concatenated polypeptide comprising two or more random coil polypeptides, wherein said random coil polypeptides are linked by a cleavable linker, and wherein said cleavable linker is selected from one or more tyrosine residues, one or more phenylalanine residues, one or more tryptophan residues, and/or one or more leucine residues, preferably one tyrosine residue, one phenylalanine residue, one tryptophan residue or one leucine residue.

In an even more preferred aspect, the present invention provides for a concatenated polypeptide comprising two or more random coil polypeptides, wherein said random coil polypeptides are linked by a cleavable linker, and wherein said cleavable linker is selected from one or more lysine residues and/or one or more arginine residues, preferably one lysine residue or one arginine residue.

As mentioned above, the present invention further relates to a nucleic acid molecule comprising a nucleic acid sequence encoding the herein provided concatenated polypeptide. As detailed herein above such a nucleic acid molecule may, in the context of the present invention, also be referred to as “concatenated nucleic acid”. As also detailed herein above, in the context of the present invention, a low genetic redundancy of such concatenated nucleic acids is desired. Hence, a concatenated polypeptide comprising (structurally) identical random coil polypeptides (herein, inter alia, referred to as units/elements/parts/pieces) may be encoded by a concatenated nucleic acid comprising (structurally) different (i.e., structurally non-identical) nucleic acids (or nucleic acid units/elements/parts/pieces). With regard to the herein provided nucleic acid molecule comprising a nucleic acid sequence encoding the herein provided concatenated polypeptide, each random coil polypeptide comprised in said encoded concatenated polypeptide may be encoded by a different nucleic acid sequence, preferably is encoded by a different nucleic acid sequence. 
Said concatenated nucleic acid molecule may further comprise a nucleic acid sequence, which may be selected from the group consisting of the following (I) to (IV): (I) a nucleic acid molecule comprising a nucleic acid sequence selected from any one of SEQ IDs NO: 36 to 47; (II) a nucleic acid molecule hybridizing with a nucleic acid molecule comprising a nucleic acid sequence complementary to the nucleic acid sequence selected from any one of SEQ IDs NO: 36 to 47 under stringent conditions; (III) a nucleic acid molecule comprising a nucleic acid sequence having at least about 60%, at least 70%, at least about 80%, at least about 90%, at least about 92%, at least about 94%, at least about 96%, at least about 98%, or at least about 99% identity to the nucleic acid sequence as defined in any one of (I) and (II); (IV) a nucleic acid molecule comprising a nucleic acid sequence encoding a concatenated polypeptide as defined herein above and/or a nucleic acid molecule comprising a nucleic acid sequence encoding a concatenated polypeptide comprising an amino acid sequence selected from the group of SEQ IDs NO: 13 to 24.

As detailed herein above, a concatenated nucleic acid (such as a nucleic acid molecule encoding the concatenated polypeptide comprising two or more random coil polypeptides linked by a cleavable linker) may comprise two or more nucleic acids each encoding a random coil polypeptide. Further, said two or more nucleic acids each encoding a random coil polypeptide, in the context of the present invention, are linked by a nucleic acid molecule encoding the herein above detailed cleavable linker. This means that a concatenated nucleic acid of the present invention at least encodes a concatenated polypeptide of a format exemplarily illustrated as: “random coil polypeptide-cleavable linker-random coil polypeptide(-cleavable linker)”. Another exemplary concatenated nucleic acid of the present invention may encode a concatenated polypeptide of a format exemplarily illustrated as: “random coil polypeptide-cleavable linker-random coil polypeptide-cleavable linker-random coil polypeptide(-cleavable linker)”. It is evident that the difference between said two exemplarily illustrated formats is the addition of “random coil polypeptide-cleavable linker”. This means that the second exemplarily illustrated format refers to a concatenated nucleic acid that encodes a concatenated polypeptide comprising one additional cleavable linker and one additional random coil polypeptide as compared to the first exemplarily illustrated format. Accordingly, the herein above provided exemplarily illustrated formats may be readily further extended by the addition of a “random coil polypeptide-cleavable linker”-(nucleic acid) unit. Evidently, the addition of such a “random coil polypeptide-cleavable linker”-(nucleic acid) unit results in the extension of said concatenated nucleic acid.
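The formats described above can be pictured with a minimal Python sketch. It is illustrative only: the function name `concatenate`, the choice of a single lysine ("K") as the linker, and the optional trailing linker flag are assumptions made for this example, and only the PAS#1 sequence is taken from the present description.

```python
# Illustrative sketch only: assembles the amino acid sequence of a
# concatenated polypeptide from "random coil polypeptide-cleavable linker" units.
# The helper name and default linker ("K", one lysine) are hypothetical choices.

PAS1 = "ASPAAPAPASPAAPAPSAPA"  # PAS#1 (SEQ ID NO: 1) as quoted in the description


def concatenate(units, linker="K", trailing_linker=True):
    """Join two or more random coil units with a cleavable linker between them.

    trailing_linker=True corresponds to the optional "(-cleavable linker)"
    at the C-terminus of the exemplary formats.
    """
    if len(units) < 2:
        raise ValueError("a concatenated polypeptide needs two or more units")
    body = linker.join(units)
    return body + linker if trailing_linker else body


two_units = concatenate([PAS1, PAS1])
three_units = concatenate([PAS1, PAS1, PAS1])

# Extending the format by one unit adds exactly one random coil polypeptide
# plus one linker residue.
print(len(three_units) - len(two_units) == len(PAS1) + 1)
```

Adding a further unit to the list extends the construct by one "random coil polypeptide-cleavable linker" element, which mirrors the stepwise extension described in the text.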

Another exemplary concatenated nucleic acid of the present invention may encode a concatenated polypeptide of the following (non-limiting) structure: “carrier polypeptide-cleavable linker-random coil polypeptide-cleavable linker-random coil polypeptide-cleavable linker-random coil polypeptide-cleavable linker-affinity tag”. Another exemplary concatenated nucleic acid of the present invention may encode a concatenated polypeptide of the following (non-limiting) structure: “signal peptide-random coil polypeptide-cleavable linker-random coil polypeptide-cleavable linker-random coil polypeptide-cleavable linker-affinity tag”. As detailed herein above and below, in the above mentioned (non-limiting) structures, the number of the “random coil polypeptide-cleavable linker”-units can be decreased to two or further increased to e.g., 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 etc. in the context of the present invention.

In the context of the present invention, the length of said concatenated nucleic acids and the number of such units per nucleic acid is not particularly limited. The person skilled in the art is aware of genetic and molecular biological means and methods to further extend a given concatenated nucleic acid encoding a concatenated polypeptide of the present invention by a further nucleic acid encoding a cleavable linker fused to a random coil polypeptide. Exemplary means and methods are further detailed herein below and the illustrative and non-limiting Example 1 further provides for means and methods to generate concatenated nucleic acids encoding concatenated polypeptides comprising varying numbers of cleavable linkers and varying numbers of random coil polypeptides (i.e., varying numbers of “random coil polypeptide-cleavable linker”-units). Accordingly, the present invention also provides for a nucleic acid sequence encoding a random coil polypeptide coupled to a C-terminal cleavable linker, wherein said nucleic acid molecule is selected from the group consisting of the following (I) to (IV): (I) a nucleic acid molecule comprising a nucleic acid sequence selected from any one of SEQ IDs NO: 60 to 127; (II) a nucleic acid molecule hybridizing with a nucleic acid molecule comprising a nucleic acid sequence complementary to the nucleic acid sequence selected from any one of SEQ IDs NO: 60 to 127 under stringent conditions; (III) a nucleic acid molecule comprising a nucleic acid sequence having at least about 60%, at least 70%, at least about 80%, at least about 90%, at least about 92%, at least about 94%, at least about 96%, at least about 98%, or at least about 99% identity to the nucleic acid sequence as defined in any one of (I) and (II); (IV) a nucleic acid molecule comprising a nucleic acid sequence encoding a random coil polypeptide comprising an amino acid sequence selected from any one of SEQ IDs NO: 1 to 12 coupled to a C-terminal cleavable linker as defined herein above.

As detailed herein above, the nucleic acid sequences encoding a random coil polypeptide coupled to a C-terminal cleavable linker (i.e., a nucleic acid encoding a “random coil polypeptide-cleavable linker”-unit) may be linked in succession, resulting in a concatenated nucleic acid encoding a concatenated polypeptide comprising two or more random coil polypeptides linked by a cleavable linker. Exemplary and non-limiting examples of such nucleic acid sequences encoding a random coil polypeptide coupled to a C-terminal cleavable linker can be found in SEQ IDs NO: 60 to 127. The person skilled in the art is aware that a cleavable linker consisting of, e.g., a lysine residue may be encoded either by the amino acid codon “AAA” or by the amino acid codon “AAG”. Accordingly, the skilled person is aware that in the context of the present invention, such lysine codons (i.e., “AAA” or “AAG”) may be readily interchanged. Herein above, alternative cleavable linkers have been defined (for example and non-limiting, arginine, tyrosine, leucine, etc.). The skilled person is aware that the genetic code is degenerate, which means that a single amino acid may be encoded by multiple amino acid codons. Accordingly, the herein above defined alternative cleavable linkers (for example and non-limiting, arginine, tyrosine, leucine, etc.) may each also be encoded by more than one respective amino acid codon. Such (degenerate) amino acid codons (for example and non-limiting, arginine codons, tyrosine codons, leucine codons, etc.) may each also be readily interchanged in the context of the present invention. Accordingly, it is thus also envisaged that a concatenated nucleic acid molecule comprising a nucleic acid sequence encoding two or more random coil polypeptides linked by a cleavable linker may comprise more than one different amino acid codon encoding said cleavable linker. 
This means that, for example, a concatenated nucleic acid molecule comprising a nucleic acid sequence encoding, e.g., three random coil polypeptides mutually linked by a cleavable linker (i.e., in the exemplary format “random coil polypeptide-cleavable linker-random coil polypeptide-cleavable linker-random coil polypeptide-cleavable linker”) encodes at least two cleavable linkers. Optionally, the C-terminal cleavable linker may be omitted in order to directly generate a free peptide carboxylate group for the C-terminal random coil polypeptide (unit) if no C-terminal affinity tag is present. Due to the degenerate genetic code, said three cleavable linkers may each be encoded by the same or varying nucleic acid sequences.
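The interchangeability of linker codons can be sketched as follows. This is a hedged illustration, not part of the disclosed sequences: the table `LINKER_CODONS` lists the standard genetic code codons (DNA alphabet) for the linker residues named above, and the helper `linker_codon_choices` is a hypothetical name introduced for this example.

```python
from itertools import product

# Standard genetic code codons (DNA alphabet) for the linker residues
# named in the description. The dictionary and helper are illustrative.
LINKER_CODONS = {
    "K": ["AAA", "AAG"],                                # lysine
    "R": ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"],    # arginine
    "Y": ["TAT", "TAC"],                                # tyrosine
    "F": ["TTT", "TTC"],                                # phenylalanine
    "W": ["TGG"],                                       # tryptophan
    "L": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],    # leucine
}


def linker_codon_choices(linker_residues):
    """Enumerate every codon assignment for a series of linker positions.

    linker_residues: one-letter codes of the linkers in order of appearance,
    e.g. "KKK" for three single-lysine linkers.
    """
    return list(product(*(LINKER_CODONS[res] for res in linker_residues)))


# Three single-lysine linkers can be encoded by the same or by varying codons.
choices = linker_codon_choices("KKK")
print(len(choices))
```

Note that, in the standard genetic code, tryptophan is encoded by the single codon TGG, so a tryptophan linker offers no synonymous alternatives, whereas lysine offers two and arginine and leucine offer six each.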

As detailed herein above and below, a nucleic acid molecule encoding a random coil polypeptide coupled to a C-terminal cleavable linker may further comprise, inter alia, an amino acid codon selected from the group of glutamine codons or glutamate codons at the 5’-end. In the context of the present invention, the glutamine codons “CAA” and “CAG” may readily be interchanged. Similarly, the glutamate codons “GAA” and “GAG” may also be readily interchanged.

Further, as has been detailed herein above, in the context of the present invention, it is desired that the herein provided concatenated nucleic acids comprise low sequence redundancy. Similarly, in the context of the present invention, it is desired that the two or more random coil polypeptides comprised in the herein provided concatenated polypeptides are each encoded by different nucleic acid sequences. Further, it is desired that there is low sequence redundancy within a nucleic acid encoding one of such random coil polypeptides. SEQ IDs NO: 128 to 247 provide exemplary low-redundancy nucleic acid sequences encoding such random coil polypeptides of different lengths. Said exemplary low-redundancy nucleic acid sequences (e.g., SEQ IDs NO: 128 to 247) may also be linked/fused/coupled to produce nucleic acids encoding longer random coil polypeptides (each comprising, inter alia, two or more of such amino acid sequences shown in SEQ IDs NO: 128 to 247, or fragments thereof, or multiples of fragments thereof). Accordingly, in the context of the present invention, it is evident that two (or more) nucleic acids each encoding the same or a different random coil polypeptide may be linked/fused/coupled by, e.g., genetic and/or molecular biological means, thus producing a nucleic acid encoding a single random coil polypeptide that comprises the combined length of the two (or more) random coil polypeptides encoded by the two (or more) linked/fused/coupled nucleic acids. 
For example, the nucleic acid molecules having a nucleic acid sequence of SEQ ID NO: 128 (encoding a random coil polypeptide of the amino acid sequence: ASPAAPAPASPAAPAPSAPA; SEQ ID NO: 1; PAS#1) and of SEQ ID NO: 129 (also encoding a random coil polypeptide of the amino acid sequence: ASPAAPAPASPAAPAPSAPA; SEQ ID NO: 1; PAS#1) may be linked/fused/coupled to generate, e.g., a nucleic acid molecule having a nucleic acid sequence comprised in the sequence shown in SEQ ID NO: 70 (encoding a random coil polypeptide comprising the amino acid sequence: ASPAAPAPASPAAPAPSAPAASPAAPAPASPAAPAPSAPA; PAS#1-PAS#1). Alternatively, the nucleic acid molecule having a nucleic acid sequence of SEQ ID NO: 128 (encoding a random coil polypeptide of the amino acid sequence: ASPAAPAPASPAAPAPSAPA; SEQ ID NO: 1; PAS#1) may also be linked/fused/coupled with itself resulting in, e.g., a nucleic acid molecule having a nucleic acid sequence also encoding a random coil polypeptide of the amino acid sequence: ASPAAPAPASPAAPAPSAPAASPAAPAPASPAAPAPSAPA (PAS#1-PAS#1), however, having increased nucleic acid sequence redundancy and increased genetic instability as compared to a nucleic acid comprising the sequence of SEQ ID NO: 70. Thus, it is herein preferred to couple two (or more) nucleic acid molecules each encoding the same amino acid sequence, yet having different nucleic acid sequences (e.g. and not limiting, SEQ IDs NO: 128 and 129). It is evident that this allows for low nucleic acid sequence redundancy. Accordingly, nucleic acid molecules encoding random coil polypeptides may be, for example and non-limiting, generated by combining nucleic acid molecules having nucleic acid sequences of SEQ IDs NO: 128 to 247 in any possible combination.
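The effect of coupling differently encoded units on sequence redundancy can be illustrated with a small Python sketch. The codon tables below are hypothetical and chosen only for illustration; they are not the codon usage of SEQ IDs NO: 128 or 129, and counting repeated k-mers is merely one simple, assumed proxy for nucleic acid sequence redundancy.

```python
from collections import Counter

PAS1 = "ASPAAPAPASPAAPAPSAPA"  # PAS#1 (SEQ ID NO: 1) as quoted in the description

# Hypothetical synonymous codon assignments (standard genetic code),
# NOT the actual sequences of SEQ ID NO: 128 or SEQ ID NO: 129.
CODONS_A = {"A": "GCT", "P": "CCA", "S": "TCT"}
CODONS_B = {"A": "GCA", "P": "CCT", "S": "AGT"}


def encode(peptide, table):
    """Back-translate a peptide using one fixed codon per amino acid."""
    return "".join(table[aa] for aa in peptide)


def repeated_kmers(dna, k=15):
    """Count distinct k-mers that occur more than once in the sequence."""
    counts = Counter(dna[i:i + k] for i in range(len(dna) - k + 1))
    return sum(1 for n in counts.values() if n > 1)


unit_a = encode(PAS1, CODONS_A)
unit_b = encode(PAS1, CODONS_B)

self_fused = unit_a + unit_a   # same nucleic acid coupled with itself
mixed = unit_a + unit_b        # same peptide, different nucleic acid sequences

# Both constructs encode PAS#1-PAS#1, but the self-fused one carries
# more repeated stretches at the nucleic acid level.
print(repeated_kmers(self_fused) > repeated_kmers(mixed))
```

Under these assumptions, both constructs encode the identical amino acid sequence, while the construct built from two different nucleic acid sequences contains fewer repeated 15-mers than the self-fused construct.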

Further and as detailed herein above and below, the resulting nucleic acid molecules encoding said random coil polypeptides may be linked/fused/coupled by a nucleic acid encoding a cleavable linker, thereby producing a concatenated nucleic acid encoding a concatenated polypeptide. Said concatenated nucleic acid molecule may further encode a carrier polypeptide selected from the group of TrxA from E. coli, MBP from E. coli, NusA from E. coli, SUMO from yeast, GBP from E. coli, glutathione-S-transferase or a mutated dehalogenase from Rhodococcus sp. (HaloTag), preferably TrxA, and wherein the nucleic acid sequence encoding a carrier polypeptide is linked to the neighboring genetic element by a cleavable linker. Alternatively and/or additionally, said nucleic acid molecule may further encode a signal peptide/signal sequence/targeting sequence, as detailed herein above, and wherein the nucleic acid sequence encoding said signal peptide/signal sequence/targeting sequence is linked to the neighboring genetic element by a cleavable linker. Said nucleic acid molecule may further encode a tag, preferably an affinity tag selected from the group of polyhistidine-tag, Strep-tag, Strep-tag II, FLAG tag, HA tag, Halo tag, Arg-tag, preferably a hexahistidine tag, and wherein the nucleic acid sequence encoding a tag is linked to the neighboring genetic element by a cleavable linker. In the context of said nucleic acid, said cleavable linker may be encoded by a nucleic acid sequence encoding one or more basic amino acid residues, preferably one or more lysine or arginine residues, more preferably one or more lysine residues, most preferably one lysine residue or one arginine residue. Alternatively, said nucleic acid may encode a cleavable linker as defined herein above.

The present invention further provides for a polypeptide encoded by any nucleic acid molecule detailed herein above.

The present invention further relates to a nucleic acid vector comprising any nucleic acid molecule detailed herein above. Further details on suitable nucleic acid vectors in accordance with the present invention have already been provided herein above.

The present invention further relates to a host or host cell comprising any nucleic acid molecule detailed herein above and/or a host or host cell comprising any nucleic acid vector detailed herein above. Further details on suitable hosts or host cells in accordance with the present invention have already been provided herein above. The skilled person is aware that the herein provided nucleic acid molecules and/or nucleic acid vectors may require codon optimization depending on the host or host cell that is to be used for the expression of the (concatenated) polypeptide encoded by the respective nucleic acid molecule and/or nucleic acid vector. Further, the skilled artisan is aware that it is generally desired to use nucleic acid molecules and/or nucleic acid vectors that have low secondary structure content and/or that encode nucleic acid molecules with low secondary structure.

As already mentioned herein above, the present invention also provides for means and methods for the production of a random coil polypeptide, wherein said method comprises the steps of a) the cleavage of a concatenated polypeptide as detailed herein above, the cleavage of a concatenated polypeptide expressed from a nucleic acid molecule as detailed herein above, or the cleavage of a concatenated polypeptide expressed from a nucleic acid vector as detailed herein above; and b) the purification of said random coil polypeptide.

In accordance with the present invention, the cleavage of a herein provided concatenated polypeptide leads to/results in the release of at least one of the two or more random coil polypeptides, in particular the release from said concatenated polypeptide. This means that cleaving a concatenated polypeptide, inter alia, by means that are further detailed herein below (including enzymatic cleavage) results in the release of the random coil polypeptide from said concatenated polypeptide. Said random coil polypeptide may be released via the cleavage of a cleavable linker comprised in said concatenated polypeptide. The term “release” or “releasing” may, in the context of the present invention, however, also refer to the release of the random coil polypeptide into a buffer/buffer system/medium/liquid/solution comprising said cleaved/to be cleaved concatenated polypeptide. Accordingly, the present invention provides for means and methods for the production of a random coil polypeptide, wherein said method comprises the steps of a) the cleavage of a concatenated polypeptide as detailed herein above, wherein said cleavage leads to the release of the random coil polypeptide, in particular from said concatenated polypeptide; and b) the purification of said random coil polypeptide. Accordingly, the present invention provides for means and methods for the production of a released and purified random coil polypeptide.

In one preferred aspect, the cleavage of said concatenated polypeptide (as defined in step a) of said method for the production of a random coil polypeptide) is enzymatic cleavage. Accordingly, the cleavage of said concatenated polypeptide preferably comprises the treatment of said concatenated polypeptide with one or more peptidases, one or more proteases, and/or one or more proteinases. Accordingly, the release of the random coil polypeptide from said concatenated polypeptide may be an enzymatic release. A non-limiting example of such an enzymatic release of random coil polypeptides is illustratively shown in the appended Figure 4.

Accordingly, the present invention relates to a method for the production of random coil polypeptides, wherein said method comprises the steps of a) the cleavage of the herein provided concatenated polypeptide by/with one or more peptidases, one or more proteases, and/or one or more proteinases; and b) the purification of said random coil polypeptide.

Hence, the present invention relates to a method for the production of random coil polypeptides comprising the steps of a) the recombinant production of a concatenated polypeptide comprising two or more random coil polypeptides, wherein said random coil polypeptides are linked by a cleavable linker; b) optionally, the purification of said concatenated polypeptide; c) the cleavage of said concatenated polypeptide by/with one or more peptidases, one or more proteases, and/or one or more proteinases; and d) the purification of said two or more random coil polypeptides, which may have the same composition.

In one preferred aspect, said one or more peptidases, one or more proteases, and/or one or more proteinases are one or more endopeptidases, or a combination of one or more endopeptidases and one or more exopeptidases.

Hence, the present invention relates to a method for the production of random coil polypeptides comprising the steps of a) the recombinant production of a concatenated polypeptide comprising two or more random coil polypeptides, wherein said random coil polypeptides are linked by a cleavable linker; b) the purification of said concatenated polypeptide; c) the cleavage of said concatenated polypeptide by/with one or more endopeptidases, or a combination of one or more endopeptidases and one or more exopeptidases; and d) the purification of said two or more random coil polypeptides.

In a further preferred aspect, the present invention relates to a method for the production of random coil polypeptides comprising the steps of a) the recombinant production of a concatenated polypeptide comprising two or more random coil polypeptides, wherein said random coil polypeptides are linked by a cleavable linker, and wherein said cleavable linker is selected from one or more lysine residues, one or more arginine residues, one or more tyrosine residues, one or more phenylalanine residues, one or more tryptophan residues, and/or one or more leucine residues; b) the purification of said concatenated polypeptide; c) the cleavage of said concatenated polypeptide by/with one or more endopeptidases, or a combination of one or more endopeptidases and one or more exopeptidases; and d) the purification of said two or more random coil polypeptides.

It is evident to the skilled person that the herein above and below detailed one or more peptidases, one or more proteases, and/or one or more proteinases, preferably the herein above and below detailed one or more endopeptidases, or the combination of the one or more endopeptidases and the one or more exopeptidases, cleave the cleavable linker comprised in said concatenated polypeptide. This means that said one or more peptidases, one or more proteases, and/or one or more proteinases, preferably said one or more endopeptidases, or said combination of the one or more endopeptidases and the one or more exopeptidases, cleave the concatenated polypeptide via the therein comprised cleavable linker and thereby release the random coil polypeptides from the concatenated polypeptide by cleaving the cleavable linker linking said random coil polypeptides.

In one preferred aspect, the herein provided endopeptidases are selected from trypsin, Arg-C proteinase, clostripain, chymotrypsin, pepsin, endoproteinase Lys-C, preferably trypsin.

In one preferred aspect, the herein provided exopeptidases are selected from carboxypeptidase B, carboxypeptidase E, carboxypeptidase N, carboxypeptidase D, metallocarboxypeptidase D, metalloendopeptidase Lys-N, or carboxypeptidase A, preferably carboxypeptidase B.

The skilled person is aware that endopeptidases and exopeptidases known in the art often comprise amino acid sequence-specific cleaving activity (i.e., sequence-specific endopeptidase activity and sequence-specific exopeptidase activity, respectively). This means that the prior art teaches (specific) combinations of endopeptidases and target amino acid sequences that may be cleaved by a given endopeptidase. Here, only for non-limiting illustrative purposes, the combination of the endopeptidase trypsin and, as part of the polypeptide substrate, the basic amino acids lysine and/or arginine is mentioned. Further non-limiting examples of such combinations of endopeptidases and corresponding target amino acids or amino acid sequences are highlighted in Table 1 herein below. Similarly, the prior art further teaches (specific) combinations of exopeptidases and target amino acid sequences that may be cleaved by a given exopeptidase. Here, only for illustrative purposes, the non-limiting combination of the exopeptidase carboxypeptidase B and the basic amino acids lysine and/or arginine is mentioned. Further non-limiting examples of such combinations of exopeptidases and corresponding target amino acid sequences are highlighted in Table 2 herein below. In the context of the present invention, the term “target amino acid sequence” may also refer to a single amino acid residue (i.e., a “target amino acid residue”).

Table 1: Exemplary combinations of endopeptidases and corresponding target amino acid sequences

Table 2: Exemplary combinations of exopeptidases and corresponding target amino acid sequences

The person skilled in the art is aware that endopeptidases are enzymes with proteolytic activity (i.e., proteolytic peptidases) that can break/cleave peptide bonds of non-terminal amino acids. This means that endopeptidases can break/cleave peptide bonds within a polypeptide (such as, but not limited to, a protein). This means that endopeptidases normally cleave non-terminal peptide bonds. In the context of the present invention, an endopeptidase may cleave the (non-terminal) cleavable linker linking the herein provided random coil polypeptides within the herein provided concatenated polypeptide, thereby releasing said random coil polypeptides potentially including the cleavable linker moiety.

The person skilled in the art is further aware that exopeptidases (in contrast to endopeptidases) are enzymes with proteolytic activity (i.e., proteolytic peptidases) that preferentially break/cleave peptide bonds of terminal amino acids. This means that exopeptidases can break/cleave the first or last peptide bond of a polypeptide (such as, but not limited to, a protein). In other words, exopeptidases cleave terminal peptide bonds. In particular, carboxypeptidases preferentially cleave the C-terminal peptide bond of a polypeptide whereas aminopeptidases preferentially cleave the N-terminal peptide bond of a polypeptide.

In the context of the present invention, and as detailed herein above, when the concatenated polypeptide (comprising the two or more random coil polypeptides separated by a cleavable linker) is treated with a (suitable) combination of an exopeptidase and an endopeptidase, said endopeptidase may first cleave the (non-terminal) cleavable linker, thereby releasing said random coil polypeptide including the cleavable linker moiety. Thus, the resulting (released) random coil polypeptide may comprise a (terminal) cleavable linker attached to either its C-terminus or its N-terminus. Said C-terminal or N-terminal cleavable linker may subsequently be cleaved off/removed by said exopeptidase, resulting in the release of a random coil polypeptide lacking any (terminal or non-terminal) cleavable linker, that is a PA-rich and/or PAS-rich random coil polypeptide.

In the context of the present invention, a “suitable combination of an exopeptidase and an endopeptidase” may refer to a combination of endopeptidases and exopeptidases having the same/an identical target amino acid sequence. An exemplary suitable combination of an endopeptidase and an exopeptidase may, inter alia and non-limiting, be trypsin and carboxypeptidase B, as both use a single lysine and/or arginine as target sequence. Further non-limiting, yet illustrative details can be found in the appended examples. The present invention also relates to any suitable combination of endopeptidases and exopeptidases listed in Tables 1 and 2, respectively. In the context of the present invention, the terms “an (suitable) exopeptidase” and “an (suitable) endopeptidase”, may also refer to one or more (suitable) exopeptidases and one or more (suitable) endopeptidases, respectively.
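The two-step release described above can be illustrated in silico. The following is a minimal sketch and not part of the disclosure: it assumes a simplified trypsin-like rule (cleavage C-terminal of Lys/Arg, suppressed when the next residue is Pro), an idealized carboxypeptidase B-like step that removes every C-terminal Lys/Arg, and a hypothetical toy sequence. The function names `endo_cleave` and `exo_trim_c_terminus` are illustrative only.

```python
def endo_cleave(seq, targets="KR"):
    """Split seq C-terminally of each target residue (simplified rule).

    Assumption: no cleavage when the target is the last residue or is
    directly followed by proline.
    """
    fragments = []
    start = 0
    for i, residue in enumerate(seq):
        is_last = i == len(seq) - 1
        if residue in targets and not is_last and seq[i + 1] != "P":
            fragments.append(seq[start:i + 1])
            start = i + 1
    fragments.append(seq[start:])
    return fragments


def exo_trim_c_terminus(fragment, targets="KR"):
    """Remove all C-terminal target residues (idealized exopeptidase step)."""
    return fragment.rstrip(targets)


# Hypothetical toy concatenated polypeptide: three identical units joined by Lys.
unit = "ASPAAPAPAS"
concatenated = "K".join([unit, unit, unit])

fragments = endo_cleave(concatenated)                     # linker still attached
released = [exo_trim_c_terminus(f) for f in fragments]    # linker removed
print(fragments)
print(released)
```

In this toy case the endopeptidase step leaves a C-terminal lysine on every fragment except the last, and the exopeptidase step removes it, so all released units are identical and linker-free.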

In one preferred aspect, the present invention relates to a method for the production of a random coil polypeptide, wherein said method comprises the steps of a) the cleavage of the herein provided concatenated polypeptide by/with a combination of one or more endopeptidases and one or more exopeptidases, wherein said endopeptidase is preferably trypsin, and wherein said exopeptidase is preferably carboxypeptidase B; and b) the purification of said two or more random coil polypeptides.

Accordingly, the present invention also relates to a method for the production of random coil polypeptides comprising the steps of a) the recombinant production of a concatenated polypeptide comprising two or more random coil polypeptides, wherein said random coil polypeptides are linked by a cleavable linker, and wherein said cleavable linker is selected from one or more lysine residues and/or one or more arginine residues; b) the purification of said concatenated polypeptide; c) the cleavage of said concatenated polypeptide by/with a combination of one or more endopeptidases and one or more exopeptidases, wherein said endopeptidase is preferably trypsin, and wherein said exopeptidase is preferably carboxypeptidase B; and d) the purification of said two or more random coil polypeptides.

In another preferred aspect, the present invention relates to a method for the production of a random coil polypeptide, wherein said method comprises the steps of a) the cleavage of the herein provided concatenated polypeptide by/with a combination of one or more endopeptidases and one or more exopeptidases, wherein said endopeptidase is preferably chymotrypsin or pepsin, and wherein said exopeptidase is preferably carboxypeptidase A; and b) the purification of said two or more random coil polypeptides.

Accordingly, the present invention also relates to a method for the production of random coil polypeptides comprising the steps of a) the recombinant production of a concatenated polypeptide comprising two or more random coil polypeptides, wherein said random coil polypeptides are linked by a cleavable linker, and wherein said cleavable linker is selected from one or more tyrosine residues, one or more phenylalanine residues, one or more tryptophan residues, and/or one or more leucine residues; b) the purification of said concatenated polypeptide; c) the cleavage of said concatenated polypeptide by/with a combination of one or more endopeptidases and one or more exopeptidases, wherein said endopeptidase is preferably chymotrypsin or pepsin, and wherein said exopeptidase is preferably carboxypeptidase A; and d) the purification of said two or more random coil polypeptides.

In the context of the present invention, the purification of the two or more random coil polypeptides (that were released from the herein provided concatenated polypeptide by cleavage of the latter) may comprise performing purification via precipitation or via liquid chromatography (LC) of said random coil polypeptide, preferably via high-pressure liquid chromatography (HPLC). This means that said two or more random coil polypeptides may be purified via liquid chromatography (LC) of said random coil polypeptide, preferably via high-pressure liquid chromatography (HPLC).

In the context of the present invention, it may be beneficial to protect all but one reactive amino acid side chains of the herein provided (purified) random coil polypeptides. Accordingly, the present invention provides for means and methods for the protection of all but one reactive amino acid side chains of the herein provided random coil polypeptides, preferably after the herein above detailed purification of said random coil polypeptide. In this context, the term “reactive amino acid side chain” may refer to any (unprotected) carboxyl group (including the C-terminal carboxyl group), any (unprotected) amino group (including the N-terminal amino group), any (unprotected) guanidinium group, and/or any (unprotected) thiol group that is comprised in said random coil polypeptide.

An advantage of protecting all but one reactive amino acid side chains of the herein provided random coil polypeptides may be that said random coil polypeptide may subsequently, inter alia, be coupled to a biomolecule, a pharmaceutical, or the like, via only said one non-protected reactive side chain. Accordingly, protecting all but one reactive amino acid side chains of the herein provided random coil polypeptides may allow for targeted and/or directional coupling of said random coil polypeptide, inter alia, to a biomolecule, a pharmaceutical, or the like. It may herein be preferred to leave the C-terminal carboxyl group (i.e., the C-terminus) of said random coil polypeptide unprotected. This means that it may herein be preferred to protect every other reactive amino acid side chain except for the C-terminal carboxyl group (i.e., the C-terminus) of said random coil polypeptide. Alternatively, it may herein be preferred to leave the thiol group of one or more Cys residues of said random coil polypeptide unprotected.

Accordingly, the present invention, inter alia, provides for means and methods for the protection of the N-terminal amino group (i.e., the N-terminus) of the herein provided random coil polypeptides. As already mentioned herein above, the two or more random coil polypeptides may further comprise an N-terminal blocking residue (inter alia, but non-limiting, a glutamine residue, or a glutamate residue). In the context of the present invention, said N-terminal blocking residue (inter alia, but non-limiting, a glutamine residue, or a glutamate residue) may be reacted to form an N-terminal protecting group.

Reacting said N-terminal blocking residue to form an N-terminal protecting group may comprise cyclisation of said N-terminal blocking residue (inter alia, but non-limiting, a glutamine residue, or a glutamate residue) to form said N-terminal protecting group (inter alia, but non-limiting, a pyroglutamoyl group/a pyroglutamate residue/a pyrrolidone carboxylic acid group). In the context of the present invention, cyclisation of the N-terminal protection group comprises the cyclisation reaction of said N-terminal blocking residue in the presence of an acid selected from acetic acid, formic acid, 2-propanoic acid (propionic acid), butanoic acid (butyric acid), 2-hydroxypropanoic acid (lactic acid), 2,4-hexanedienoic acid (sorbic acid), 2-butenedioic acid (fumaric acid), hydroxybutanedioic acid (malic acid), 2,3-dihydroxybutanedioic acid (tartaric acid), 2-hydroxy-1,2,3-propanetricarboxylic acid (citric acid), benzenecarboxylic acid (benzoic acid), trifluoroacetic acid, trichloroacetic acid, p-toluenesulfonic acid, trifluoromethanesulfonic acid, preferably in the presence of acetic acid.

The present inventors have surprisingly found that pivalic acid may be successfully employed in the cyclisation of the N-terminal blocking residue (i.e., Pga cyclisation) resulting in the formation of an N-terminal protecting group (e.g., a pyroglutamyl group / a pyroglutamate residue / a pyrrolidone carboxylic acid group). In particular, Example 4 illustratively shows that the cyclisation of the N-terminal blocking residue using pivalic acid requires shorter incubation times, as compared to, e.g., acetic acid, without the formation of unwanted side products.

Accordingly, in the context of the present invention the cyclisation of the N-terminal protection group may comprise the cyclisation reaction of said N-terminal blocking residue in the presence of an acid selected from pivalic acid, acetic acid, 2,2-dimethylbutyric acid, 2,2-dimethylvaleric acid, formic acid, 2-propanoic acid (propionic acid), butanoic acid (butyric acid), 2-hydroxypropanoic acid (lactic acid), 2,4-hexanedienoic acid (sorbic acid), 2-butenedioic acid (fumaric acid), hydroxybutanedioic acid (malic acid), 2,3-dihydroxybutanedioic acid (tartaric acid), 2-hydroxy-1,2,3-propanetricarboxylic acid (citric acid), benzenecarboxylic acid (benzoic acid), trifluoroacetic acid, trichloroacetic acid, p-toluenesulfonic acid, trifluoromethanesulfonic acid, preferably in the presence of pivalic acid or acetic acid.

The present invention further relates to the use of pivalic acid in the cyclisation of the N-terminal blocking residue and/or in the formation of an N-terminal protecting group (e.g., a pyroglutamyl group / a pyroglutamate residue / a pyrrolidone carboxylic acid group).

Accordingly, said N-terminal blocking residue may be reacted/cyclized to form an N-terminal protection group. Accordingly, the protection of all but one reactive amino acid side chains of the herein provided random coil polypeptide may comprise reacting/cyclizing said N-terminal blocking residue to form an N-terminal protection group. This means that the protection of all but one reactive amino acid side chains of the herein provided random coil polypeptide may comprise the addition of an N-terminal protecting group on the N-terminus of said random coil polypeptide. In this context and as detailed herein above, the addition of an N-terminal protecting group may also refer to the cyclisation of an N-terminal amino acid residue (such as an N-terminal blocking residue) thereby producing said N-terminal protecting group.
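As general chemical background, cyclisation of an N-terminal glutamine to pyroglutamate releases ammonia, whereas cyclisation of an N-terminal glutamate releases water. The following is a minimal sketch, not taken from the appended examples, of how the expected monoisotopic mass shift could be computed, e.g., to check the cyclisation by mass spectrometry. The function name `pga_mass_shift` is hypothetical.

```python
# Monoisotopic masses (Da) of the small molecules lost upon cyclisation.
NH3 = 17.026549  # lost when N-terminal Gln cyclises to pyroglutamate
H2O = 18.010565  # lost when N-terminal Glu cyclises to pyroglutamate


def pga_mass_shift(n_terminal_residue):
    """Return the expected mass change (Da) for cyclisation of the
    N-terminal blocking residue, given as one-letter code 'Q' or 'E'."""
    losses = {"Q": NH3, "E": H2O}
    if n_terminal_residue not in losses:
        raise ValueError("only N-terminal Gln (Q) or Glu (E) are modelled here")
    return -losses[n_terminal_residue]


for residue in ("Q", "E"):
    print(residue, round(pga_mass_shift(residue), 4))
```

A measured mass that is lower than that of the uncyclised polypeptide by approximately the printed value would be consistent with formation of the pyroglutamoyl group.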

Furthermore, in a further embodiment of the present invention, an N-terminal protecting group may be appended to the N-terminus after formation of the random coil polypeptide, e.g. via peptide or amide bond formation with the α-amino group. Accordingly, the protection of all but one reactive amino acid side chains of the herein provided random coil polypeptides may comprise the addition of an N-terminal protection group at the N-terminus of said random coil polypeptide, and wherein said N-terminal protection group is selected from pyroglutamoyl, homopyroglutamoyl, formyl, -CO(C1-4 alkyl), wherein the alkyl moiety comprised in said -CO(C1-4 alkyl) is optionally substituted with one or two groups independently selected from -OH, -O(C1-4 alkyl), -N⁺(C1-4 alkyl)(C1-4 alkyl)(C1-4 alkyl), -N(C1-4 alkyl)(C1-4 alkyl) and -COOH. Further, the protection of all but one reactive amino acid side chains of the herein provided random coil polypeptides may comprise the addition of an N-terminal protection group at the N-terminus of said random coil polypeptide, wherein said N-terminal protection group is selected from acetyl, hydroxyacetyl, methoxyacetyl, ethoxyacetyl, propoxyacetyl, malonyl, propionyl, 2-hydroxypropionyl, 3-hydroxypropionyl, 2-methoxypropionyl, 3-methoxypropionyl, 2-ethoxypropionyl, 3-ethoxypropionyl, butyryl, 2-hydroxybutyryl, 3-hydroxybutyryl, 4-hydroxybutyryl, 2-methoxybutyryl, 3-methoxybutyryl, 4-methoxybutyryl, succinyl, glutaryl, and glycine betainyl.

A non-limiting example of the N-terminal protection of the herein provided random coil polypeptides is illustratively shown in appended Figure 5. Further, SEQ IDs NO: 30 to 32 provide exemplary and non-limiting amino acid sequences of random coil polypeptides comprising an N-terminal glutamine blocking residue (Q) and SEQ IDs NO: 33 to 35 provide the respective sequences after cyclisation of said glutamine residues, thus, each comprising an N-terminal protecting group (i.e., a pyroglutamoyl group/a pyroglutamate residue/a pyrrolidone carboxylic acid linked via a peptide bond to the random coil polypeptide/Pga). It is evident to the person skilled in the art that such an N-terminal protecting group can be applied similarly to any other herein provided random coil polypeptide. As detailed herein above, the herein disclosed random coil polypeptides and/or concatenated polypeptides may be each encoded by multiple nucleic acid sequences having low sequence redundancies. Accordingly, it is evident for the skilled artisan that SEQ ID NO: 30 may be, inter alia, encoded by SEQ IDs NO: 77 to 86, SEQ ID NO: 31 may be, inter alia, encoded by SEQ IDs NO: 87 to 91, and SEQ ID NO: 42 may be, inter alia, encoded by SEQ IDs NO: 92 and 93.
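That one amino acid sequence may be encoded by several distinct nucleic acid sequences follows from the degeneracy of the genetic code. The following is a minimal sketch using hypothetical toy sequences (not any of the SEQ ID NOs referred to herein) and a partial standard codon table restricted to Pro, Ala, Ser, Gln and Lys. The names `translate`, `variant_1` and `variant_2` are illustrative assumptions.

```python
# Partial standard genetic code: only the residues needed for this toy example.
CODONS = {
    "P": ["CCT", "CCC", "CCA", "CCG"],
    "A": ["GCT", "GCC", "GCA", "GCG"],
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
    "Q": ["CAA", "CAG"],
    "K": ["AAA", "AAG"],
}
CODON_TO_AA = {codon: aa for aa, codons in CODONS.items() for codon in codons}


def translate(dna):
    """Translate an in-frame coding sequence using the partial table above."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two hypothetical coding sequences that differ at every codon.
variant_1 = "GCTTCTCCAGCTGCACCT"
variant_2 = "GCCAGCCCGGCGGCTCCA"

print(translate(variant_1), translate(variant_2))
print(variant_1 == variant_2)
```

Both toy variants translate to the same amino acid stretch although their nucleotide sequences differ, which is the property exploited when repetitive polypeptides are encoded by nucleic acids of low sequence redundancy.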

In the context of the present invention, it may be beneficial to purify and/or concentrate the herein provided random coil polypeptides after the protection of all but one reactive amino acid side chains of said random coil polypeptide. Accordingly, the present invention provides for means and methods for the purification and/or the concentration of said (N-terminally protected) random coil polypeptide, preferably after the protection of all but one reactive amino acid side chains of said random coil polypeptide. In the context of the present invention, the purification and/or the concentration of said (N-terminally protected) random coil polypeptide may comprise the evaporation of solvents and acids that are in contact with said (N-terminally protected) random coil polypeptide, optionally using an entrainer. Said entrainer is preferably selected from toluene, benzene, m-xylene, n-heptane, n-octane, tetrachloroethylene, ethyl acetate, preferably said entrainer is toluene. Further, the purification and/or the concentration of said (N-terminally protected) random coil polypeptide may also comprise the precipitation of said random coil polypeptide using a solvent selected from diethyl ether, toluene, diisopropyl ether, cyclohexane, benzene, trichloromethane, di-n-butyl ether, o-xylene, m-xylene, p-xylene, butyl acetate, ethyl benzene, preferably diethyl ether, thereby producing/resulting in the production of (a) (purified and/or concentrated) random coil polypeptide(s).

The non-limiting appended Figure 6 illustratively demonstrates the successful production/preparation and purity of multiple exemplary produced/prepared and purified random coil polypeptides.

Accordingly, and as already mentioned herein above, the present invention also provides for random coil polypeptides, as defined herein above and/or for random coil polypeptides as obtained by/obtainable by the means and methods as provided herein. Furthermore, the present invention also relates to, inter alia, salts and solutions of said random coil polypeptide(s) or comprising said random coil polypeptide(s). Furthermore, the present invention provides for uses of the herein provided random coil polypeptides for the conjugation with a biomolecule, for the conjugation with a pharmaceutical, or for conjugation with any other organic molecule.

The present invention further relates to means and methods for the production of a conjugate comprising the herein provided random coil polypeptide, wherein the method comprises the steps of a) the coupling of the herein provided random coil polypeptide to a biomolecule, and b) the purification of the conjugate. In the context of the present invention the biomolecule may be selected from an enzyme, a protein, a peptide, a lipid, a dialkylamine, a fatty acid, a carbohydrate, a cyclodextrin, a nucleic acid, DNA, RNA, or a peptide nucleic acid (PNA), preferably an enzyme.

In the context of the present invention, the purification of the conjugate (comprising the herein provided random coil polypeptide and a biomolecule) may comprise performing purification via LC of said conjugate. Accordingly, the present invention further provides for a conjugate comprising the herein provided random coil polypeptide, a conjugate as detailed herein above, and/or a conjugate obtained by/obtainable by the herein provided means and methods for the production of a conjugate (comprising the herein provided random coil polypeptide). Further, the present invention also relates to, inter alia, salts and solutions of said conjugate (comprising the herein provided random coil polypeptide).

Furthermore, the present invention also provides for a composition comprising the herein provided conjugate (comprising the herein provided random coil polypeptide) and/or to a composition comprising a salt of the herein provided conjugate (comprising the herein provided random coil polypeptide).

The present invention further provides for medical uses of the herein provided conjugates (or of salts of the herein provided conjugates) and/or of the herein provided composition (comprising the herein provided conjugate). Accordingly, the present invention provides for said conjugates or a salt thereof, and/or said composition for use as a medicament. The herein provided medicament comprising the said conjugate and/or said composition may be used in the treatment and/or prevention of a disorder or a disease.

Further, the present invention provides for non-medical uses of the herein provided random coil polypeptides, the herein provided conjugates (or of salts of the herein provided conjugates) and/or of the herein provided compositions (comprising the herein provided conjugates).

The present invention further provides for a kit comprising the herein provided composition, the herein provided conjugate or a salt thereof, the herein provided random coil polypeptide or a salt thereof, the herein provided concatenated polypeptide, the herein provided nucleic acid molecule, the herein provided nucleic acid vector, and/or the herein provided host or host cell. Preferably said kit comprises the herein provided nucleic acid molecule, the herein provided nucleic acid vector, and/or the herein provided host or host cell.

The terms “polypeptide”, “peptide”, and “protein” are used herein interchangeably and refer to a polymer of two or more amino acids linked via amide bonds that are formed between an amino group of one amino acid and a carboxy group of another amino acid. The amino acids comprised in the peptide or protein, which are also referred to as amino acid residues, may be selected from the 20 standard proteinogenic α-amino acids (i.e., Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, and Val) but also from non-proteinogenic, non-natural and/or non-standard α-amino acids (such as, e.g., ornithine, citrulline, homolysine, pyroglutamate, pyrrolysine, 4-hydroxyproline, norvaline, norleucine, and terleucine (tert-leucine)). Preferably, the amino acid residues comprised in the peptide or protein are selected from α-amino acids, more preferably from the 20 standard proteinogenic α-amino acids, including pyroglutamate, and are preferably all present as the L-isomer.

The peptide or protein may be unmodified or may be modified, e.g., at its N-terminus, at its C-terminus and/or at a functional group in the side chain of any of its amino acid residues (particularly at the side chain functional group of one or more Lys, His, Ser, Thr, Tyr, Cys, Asn, Asp, Glu, and/or Arg residues). Such modifications may include, e.g., the attachment of any of the protecting groups described for the corresponding functional groups in: Wuts PGM, Greene’s protective groups in organic synthesis, 4th edition, John Wiley & Sons, 2007. Such modifications may also include, e.g., the glycosylation with mono-, di-, oligo- or polysaccharides and/or the acylation with one or more fatty acids (e.g., one or more C8-30 alkanoic or alkenoic acids; forming a fatty acid acylated peptide or protein).

As used herein, the term “amino acid” refers, in particular, to any one of the 20 standard proteinogenic α-amino acids (i.e., Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro (also called an imino acid), Ser, Thr, Trp, Tyr, or Val) but also to a non-proteinogenic, non-natural and/or non-standard α-amino acid (such as, e.g., ornithine, citrulline, homolysine, pyroglutamate, pyrrolysine, 4-hydroxyproline, norvaline, norleucine, and terleucine (tert-leucine)). Unless defined otherwise, the term “amino acid” preferably refers to an α-amino acid, more preferably to any one of the 20 standard proteinogenic α-amino acids, including pyroglutamate, preferably in the form of the L-isomer.

As used herein, the term “nucleic acid molecule” or “nucleotide sequence” is intended to include nucleic acid molecules such as DNA molecules and RNA molecules. It is herein understood that the term “nucleotide sequence” is equal to the term “nucleic acid sequence” and that these terms can be used interchangeably herein. Said nucleic acid molecule or said nucleotide sequence may be single-stranded or double-stranded, but preferably is double-stranded DNA. The person skilled in the art knows that double-stranded DNA actually comprises two different nucleic acid molecules, with largely complementary nucleotide sequences (neglecting sticky ends if present), which are non-covalently associated/hybridized to form a double strand.

Methods which are well known to those skilled in the art can be used to construct various plasmids; see, for example, the techniques described in Sambrook (2001) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, 3rd ed.; Ausubel (1989) Current Protocols in Molecular Biology, Green Publishing Associates and Wiley Interscience. Typical plasmid vectors include, e.g., the pASK and pASK-IBA series of expression plasmids, the pUC series of plasmids, pBluescript (Stratagene), or the pET series of expression vectors (Novagen) or pD441-SR (ATUM). Typical vectors compatible with expression in mammalian cells include pREP (Invitrogen), pCEP4 (Invitrogen), pMC1neo (Stratagene), pXT1 (Stratagene), pSG5 (Stratagene), EBO-pSV2neo, pBPV-1, pdBPVMMTneo, pRSVgpt, pRSVneo, pSV2-dhfr, pIZD35, pcDV1 (Pharmacia), pRc/CMV, pcDNA1, pcDNA3 (Invitrogen), pcDNA3.1, pSPORT1 (GIBCO BRL), pGEMHE (Promega), pLXIN, pSIR (Clontech), pIRES-EGFP (Clontech), pEAK-10 (Edge Biosystems), pTriEx-Hygro (Novagen) and pCINeo (Promega). Non-limiting examples for plasmid vectors suitable for Pichia pastoris comprise e.g. the plasmids pAO815, pPIC9K and pPIC3.5K (all Invitrogen).

Additionally, baculoviral systems can also be used in order to express the nucleic acid molecules of the invention in insect cell culture. In these aspects, the pFBDM vector can be used as an expression vector. The insertion into the MultiBac baculoviral DNA is mediated via the Tn7 transposition sequence upon transformation of DH10 MultiBac E. coli cells (Berger (2013) J. Vis. Exp. 77:50159; Fitzgerald (2006) Nat. Methods 3:1021-1032). Virus amplification and expression can be performed in Sf21 (Spodoptera frugiperda) or High Five (Trichoplusia ni) cells.

Generally, vectors can contain one or more origins of replication (ori) and inheritance systems for cloning and propagation, one or more markers for selection, e.g., antibiotic resistance, and one or more expression cassettes. Examples of suitable origins of replication include, for example, the full length ColE1, its truncated versions such as those present on the pUC plasmids, the M13 phage origin of replication and, in the case of mammalian cells, the SV40 viral origin of replication. Non-limiting examples of selectable markers include ampicillin, chloramphenicol, tetracycline, kanamycin, dhfr, gpt, neomycin, hygromycin, blasticidin or geneticin.

Further, said vector comprises a regulatory sequence that is operably linked to the expression cassette and/or said nucleotide sequence or the nucleic acid molecule defined herein.

The coding sequence(s), e.g., said concatenated nucleotide sequence, comprised in the vector can be linked to (a) transcriptional regulatory element(s) and/or to other amino acid encoding sequences using established methods. Such regulatory sequences are well known to those skilled in the art and include, without being limiting, regulatory sequences ensuring the initiation of transcription, internal ribosomal entry sites (IRES) and, optionally, regulatory elements ensuring termination of transcription and stabilization of the transcript. Non-limiting examples for such regulatory sequences ensuring the initiation of transcription comprise promoters, a translation initiation codon, enhancers, insulators and/or regulatory elements ensuring transcription termination. Further examples include Kozak sequences and intervening sequences flanked by donor and acceptor sites for RNA splicing, nucleic acid sequences encoding secretion signals or, depending on the expression system used, signal sequences capable of directing the expressed protein to a cellular compartment or to the culture medium.

Examples of suitable promoters include, without being limiting, the lac, trp, tac or tet promoters, the lacUV5 promoter, the T7 or T5 promoter, the cytomegalovirus (CMV) promoter, SV40 promoter, RSV (Rous sarcoma virus) promoter, chicken β-actin promoter, CAG promoter (a combination of chicken β-actin promoter and cytomegalovirus immediate-early enhancer), human elongation factor 1α promoter, AOX1 promoter, GAL1 promoter, CaM-kinase promoter, the Autographa californica multiple nuclear polyhedrosis virus (AcMNPV) polyhedral promoter or a globin intron in mammalian and other animal cells. One example of an enhancer is, e.g., the SV40 enhancer. Non-limiting additional examples for regulatory elements/sequences ensuring transcription termination include the SV40 poly-A site, the tk poly-A site or the AcMNPV polyhedral polyadenylation signals.

Furthermore, depending on the expression system, leader sequences capable of directing the polypeptide to a cellular compartment or secreting it into the medium may be added to the coding sequence of the nucleic acid molecule provided herein. The leader sequence(s) is (are) assembled in frame with translation, initiation, and termination sequences, and preferably, a leader sequence is capable of directing secretion of the translated protein, or a portion thereof, into the periplasmic space or into the extracellular medium. Suitable leader sequences are, for example, the signal sequences of OmpA, PhoA, stII, OmpT, PelB, CTB (cholera toxin subunit B), DsbA, XynA, Tat (Twin-arginine translocation) in E. coli, and the signal sequences of bovine growth hormone, human chymotrypsinogen, human factor VIII, human Ig-kappa, human insulin, human interleukin-2, luciferase from Metridia or Vargula, human trypsinogen-2, inulinase from Kluyveromyces marxianus, mating factor alpha-1 from Saccharomyces cerevisiae, melittin, human azurocidin and the like in eukaryotic cells.

The vectors may also contain an additional expressible nucleic acid sequence coding for one or more chaperones to facilitate correct protein folding. Suitable bacterial expression hosts comprise, e.g., strains derived from Escherichia coli JM83, W3110, KS272, TG1, BL21 (such as BL21(DE3), BL21(DE3)pLysS, BL21(DE3)RIL, BL21(DE3)pRARE), NEBExpress, Origami (K-12), Origami B or Rosetta.

For vector modification, PCR amplification and ligation techniques, see methods described in Sambrook (2001) loc. cit.

The nucleic acid molecules and/or vectors of the invention as described herein above may be designed for introduction into cells by, e.g., non-chemical methods (electroporation, sonoporation, optical transfection, gene electrotransfer, hydrodynamic delivery or naturally occurring transformation upon contacting cells with the nucleic acid molecule of the invention), chemical-based methods (calcium phosphate, DMSO, PEG, liposomes, DEAE-dextran, polyethylenimine, nucleofection etc.), particle-based methods (gene gun, magnetofection, impalefection), phage or phagemid vector-based methods and viral methods. For example, expression vectors derived from viruses such as retroviruses, vaccinia virus, adeno-associated virus, herpes viruses, Semliki Forest Virus or bovine papilloma virus, may be used for delivery of the nucleic acid molecules into a targeted cell population.

Preferably, the nucleic acid molecules and/or vectors of the invention are designed for transformation of chemically competent E. coli or of electrocompetent E. coli by electroporation or for stable transfection of CHO cells by calcium phosphate, polyethylenimine or lipofectamine transfection (Pham (2006) Mol. Biotechnol. 34:225-237; Geisse (2012) Methods Mol. Biol. 899:203-219; Hacker (2013) Protein Expr. Purif. 92:67-76).

The present invention also relates to a host cell or a host transformed with a vector or the nucleic acid molecule of this invention. It will be appreciated that the term “host”, in accordance with the present invention, relates to a non-human host. Host cells for the expression of polypeptides are well known in the art and comprise prokaryotic cells as well as eukaryotic cells. Thus, the host can be selected from the group consisting of a bacterium, a mammalian cell, an algal cell, a ciliate, yeast and a plant cell.

Typical bacteria include Escherichia, Corynebacterium (glutamicum), Pseudomonas (fluorescens), Lactobacillus, Streptomyces, Salmonella or Bacillus (such as Bacillus megaterium or Bacillus subtilis). The most preferred bacterial host herein is E. coli. An exemplary ciliate to be used herein is Tetrahymena, e.g. Tetrahymena thermophila.

Typical mammalian cells include HeLa, HEK293, HEK293T, H9, Per.C6 and Jurkat cells, mouse NIH3T3, NS0 and C127 cells, COS 1, COS 7 and CV1, quail QC1-3 cells, mouse L cells, mouse sarcoma cells, Bowes melanoma cells and Chinese hamster ovary (CHO) cells. Most preferred mammalian host cells in accordance with the present invention are CHO cells. An exemplary host to be used herein is Cricetulus, e.g. Cricetulus griseus (Chinese hamster). Also, human embryonic kidney (HEK) cells are preferred.

Other suitable eukaryotic host cells are e.g. yeasts such as Pichia pastoris, Kluyveromyces lactis, Saccharomyces cerevisiae and Schizosaccharomyces pombe or chicken cells, such as e.g. DT40 cells. Insect cells suitable for expression are e.g. Drosophila S2, Drosophila Kc, Spodoptera Sf9 and Sf21 or Trichoplusia Hi5 cells. Preferable algal cells are Chlamydomonas reinhardtii or Synechococcus elongatus cells and the like. An exemplary plant is Physcomitrella, for example Physcomitrella patens. An exemplary plant cell is a Physcomitrella plant cell, e.g. a Physcomitrella patens plant cell.

The vector present in the host of the invention is either an expression vector, or the vector mediates the stable integration of the nucleic acid molecule of the present invention into the genome of the host cell in such a manner that expression of the protein is ensured. Means and methods for selecting a host cell in which the nucleic acid molecule of the present invention can be introduced such that expression of the protein is ensured are well known in the art and have been described (Browne (2007) Trends Biotechnol. 25:425-432; Matasci (2008) Drug Discov. Today: Technol. 5:e37-e42; Wurm (2004) Nat. Biotechnol. 22:1393-1398).

Suitable conditions for culturing prokaryotic or eukaryotic host cells are well known to the person skilled in the art. For example, bacteria such as e.g. E. coli can be cultured under aeration in Luria Bertani (LB) medium or in a synthetic medium, typically at a temperature from 4 to about 37 °C. To increase the yield and the solubility of the expression product, the medium can be buffered or supplemented with suitable additives known to enhance or facilitate both. In those cases where an inducible promoter controls the nucleic acid molecule of the invention in the vector present in the host cell, expression of the polypeptide can be induced by addition of an appropriate inducing agent, such as, e.g., isopropyl-β-D-thiogalactopyranoside (IPTG) or anhydrotetracycline (aTc) as employed in the appended examples. Suitable expression protocols and strategies have been described in the art, e.g. Schlapschy (2013) Protein Eng. Des. Sel. 26:489-501, Breibeck (2018) Biopolymers 109:e23069, Friedrich (2022) Microb. Cell Fact. 21:227, and can be adapted to the needs of the specific host cells and the requirements of the protein to be expressed, if required.

Depending on the cell type and its specific requirements, mammalian cell culture can, e.g., be carried out in RPMI, Williams’ E or DMEM medium containing 10 % (v/v) FCS, 2 mM L-glutamine and 100 U/ml penicillin/streptomycin. The cells can be kept, e.g., at 37 °C, or at 41 °C for DT40 chicken cells, in a 5 % CO2, water-saturated atmosphere. A suitable medium for insect cell culture is, e.g., TNM + 10 % FCS, SF900 or HyClone SFX-Insect medium. Insect cells are usually grown at 27 °C as adhesion or suspension cultures. Suitable expression protocols for eukaryotic or vertebrate cells are well known to the skilled person and can be retrieved, e.g., from Sambrook (2001) (loc. cit.) or Freshney (2010) Culture of Animal Cells: A Manual of Basic Technique and Specialized Applications. 6th Edition, John Wiley & Sons Ltd., Hoboken.

Preferably, the method for preparing the nucleic acid molecule, the vector, the polypeptide and/or the drug conjugate of the invention is carried out using either bacterial cells, such as, e.g., E. coli cells or C. glutamicum cells, or mammalian cells, such as, e.g., CHO cells. More preferably, the method is carried out using E. coli cells or CHO cells and most preferably, the method is carried out using E. coli cells.

Methods for the isolation of the encoded polypeptides produced comprise, without limitation, purification steps such as affinity chromatography (preferably using a fusion tag such as the Strep-tag II or the His6-tag), gel filtration (size exclusion chromatography), anion exchange chromatography, cation exchange chromatography, hydrophobic interaction chromatography, mixed mode chromatography, high pressure liquid chromatography (HPLC), reversed phase HPLC, ammonium sulfate precipitation or immunoprecipitation. These methods are well known in the art.

Exemplary encoded biologically or pharmacologically active proteins or therapeutically effective proteins of interest that are useful in the context of the present invention include, but are not limited to, interleukin receptor antagonist, interleukin-1 receptor antagonist like EBI-005 or anakinra, leptin, acetylcholinesterase, activated protein C (drotrecogin), activin receptor IIB antagonist, adenosine deaminase, agalsidase alfa, agonist of toll-like receptor 5 like entolimod, alpha-1 antitrypsin, alpha-1 proteinase inhibitor, alpha-galactosidase, alpha-human atrial natriuretic peptide, alpha-N-acetylglucosaminidase, alteplase, amediplase, amylin, amylin analogue, ANF-Rho, angiotensin (1-7), angiotensin II, angiotensin-converting-enzyme 2, anti-epithelial cell adhesion molecule single-chain antibody fragment, antithrombin alfa, antithrombin III, apoptosis inducing enzyme mi-APO, arginine deiminase, asparaginases like calaspargase, pegaspargase, crisantaspase, B domain deleted factor VIII like beroctocog alfa or octofactor, bectumomab (Lymphoscan), bile salt stimulated lipases like bucelipase alfa, binding protein directed against the respiratory syncytial virus like palivizumab, bone morphogenetic proteins like BMP-2 (dibotermin alfa) or BMP-6, bouganin, bovine carboxyhemoglobin, bovine growth hormone, C1-Esterase-Inhibitor, C3 exoenzyme protein, carboxyhemoglobin, CD19 antagonist, CD20 antagonist like rituxan, CD3 receptor antagonist, CD40 antagonist, CD40L antagonist like dapirolizumab or Antova, cerebroside sulfatase, cethrin like VGX-210, chondroitin lyase, coagulation factor IX like nonacog gamma, conacog beta, albutrepenonacog alfa, coagulation factor VIIa like eptacog alfa, marzeptacog alfa, vatreptacog alfa, oreptacog alfa, coagulation factor VIII like susoctocog alfa, damoctocog alfa, turoctocog alfa, rurioctocog alfa, efmoroctocog alfa, efraloctocog alfa, simoctocog alfa, coagulation factor X, coagulation factor XIII like catridecacog, collagenase of Clostridium histolyticum, complement factor C3 inhibitor, complement receptor 5a antagonist, corticotrophin releasing factor, CSF1 receptor antagonists like FPA008, CSF1R antagonist, CTLA-4 antagonist like ipilimumab, cyanovirin-N, deoxyribonuclease I like dornase alfa, EGFR receptor antagonist, elastases like human type I pancreatic elastase like vonapanitase, endostatin, enkastim, epidermal growth factor, erythropoietin alfa, erythropoietin zeta, FcγIIB receptor antagonists, fibrinogenase, fibrinolytic enzyme like brinase, fibroblast growth factor 1 (human acidic fibroblast growth factor), fibroblast growth factor 18, fibroblast growth factor 2 (human basic fibroblast growth factor), fibroblast growth factor 21, fibroblast growth factor receptor 2 antagonists like FPA144, Fms-like tyrosine kinase 3 ligand, follicle-stimulating hormones like follitropin alfa or follitropin beta, fragment of human bactericidal/permeability-increasing protein 21 (opebacan/rBPI 21), gelonin, glucagon receptor agonist, glycoprotein IIb/IIIa antagonist like abciximab, glycosaminoglycan-degrading enzymes like condoliase, gp120/gp160, granulocyte colony stimulating factor (G-CSF), granulocyte macrophage colony stimulating factor (GM-CSF), heat-shock protein hsp 65 from Mycobacterium BCG fused with transcription factor E7 (verpasep caltespen), hepatocyte growth factor, hepatocyte growth factor receptor (HGFR) antagonist, hepcidin antagonist, Her2/neu receptor antagonist like herceptin, heterodimeric IL-15:IL-15Rα (hetIL-15), hirudin, hsp70 antagonist, human acid sphingomyelinase, human chorionic gonadotropin like choriogonadotropin alfa, human enzyme acid α-glucosidases like reveglucosidase alfa or alglucosidase alfa, human growth hormone, human keratinocyte growth factor (KGF), human matrix metalloproteinase, human myelin basic protein fragment, human osteogenic protein 1, human osteogenic protein-1, human parathyroid hormone, human thrombomodulin alpha, hyaluronidase like rHuPH20, hyaluronidases like human hyaluronidase PH-20 (vorhyaluronidase alfa), hyalosidase or bovhyaluronidase, hydrolytic lysosomal glucocerebroside-specific enzymes like glucocerebrosidase, velaglucerase alfa or taliglucerase alfa, iduronate-2-sulfatase, IgE antagonists like omalizumab, Iroquois homeobox protein 2 (IRX-2), insulin, insulin analog, integrin α4β1 antagonist, interferon tau, interferon-alpha, interferon-alpha antagonist, interferon-alpha superagonist, interferon-alpha-n3 (Alferon N Injection), interferon-beta, interferon-gamma, interferon-lambda, interleukin 2 fusion proteins like DAB(389)IL-2, interleukin-11 like oprelvekin, interleukin-12, interleukin-17 receptor antagonist, interleukin-18 binding protein, interleukin-2, interleukin-22, interleukin-4 like pitrakinra, interleukin-4 mutein, interleukin-6 receptor antagonist, interleukin-7, interleukin-22 receptor subunit alpha (IL-22ra) antagonist, irisin, islet neogenesis associated protein, kallidinogenase, lactoferrin, lactoferrin fragment, lanoteplase, lipase enzymes like burlulipase, rizolipase, epafipase or sebelipase alfa, luteinizing hormone, lutropin alpha, lymphocyte expansion molecule, lysostaphin, mammalian gastric lipase enzyme (merispace), mannosidases like velmanase alfa, melanocortin-4 receptor agonist, MEPE-derived 23-amino acid peptide, methionyl human stem cell factor (ancestim), microplasmin, N-acetylgalactosamine-6-sulfatase like elosulfase alfa, N-acetylglucosaminidase, nasaruplase beta, nerve growth factor, neuregulin-1, neurotoxin (e.g. a clostridial neurotoxin, like a Clostridium botulinum neurotoxin (such as Clostridium botulinum neurotoxin serotype A, B, C, D, E, F or G, particularly Clostridium botulinum neurotoxin serotype A)), neutrophil gelatinase-associated lipocalin, ocriplasmin, Ornithodoros moubata complement inhibitor (OmCI/Coversin), osteoprotegerin, P128 (StaphTAME), pamiteplase, parathormone (PTH), PD-1 antagonist, PDGF antagonist, pentraxin-2 protein, phage lysin like HY133, phenylalanine ammonia lyase like valiase, phosphatases like tissue-nonspecific alkaline phosphatase or asfotase alfa, plasminogen, plasminogen variant like V10153, platelet derived growth factor-BB, porcine growth hormone, prohibitin-targeting peptide 1, proinsulin, protein A, protein C like drotrecogin, protein binding fibroblast growth factor receptor ligands like FP-1039, recombinant tissue factor pathway inhibitor (tifacogin), relaxin, relaxin analog like serelaxin, reteplase, rhPDGF-BB, ribonuclease like onconase or amphinase, senrebotase, serine protease inhibitors like conestat alfa, sfericase, sialidase, soluble complement receptor type 1, soluble DCC (deleted in colorectal cancer) receptor, soluble TACI receptor (atacicept), soluble tumor necrosis factor I receptor (sTNF-RI), soluble tumor necrosis factor II receptor (sTNF-RII), soluble VEGF receptor Flt-1, soluble, human FcγIIB receptor, staphylokinase, streptokinase, sulfamidase, T-cell receptor ligand, tenecteplase, thrombopoiesis-stimulating protein (AMG-531), thrombopoietin, thrombospondin-1, thyroid hormone, thyrotropin-releasing hormone (TRH) analog like taltirelin, tissue plasminogen activator, tissue-type plasminogen activator like pamiteplase, tripeptidyl peptidase I, tumor necrosis factor (TNFalpha), tumour necrosis factor α antagonist, uricase like rasburicase or pegadricase, urodilatin, urofollitropin, urokinase, uteroglobin, VEGF antagonist like ranibizumab or bevacizumab, VEGF/PDGF antagonist, VEGF/PDGF antagonist like a multi-VEGF/PDGF DARPin or a fusion protein, viscumin, von Willebrand factors like vonicog alfa. Interleukin receptor antagonist, especially interleukin-1 receptor antagonists, like EBI-005 or anakinra, and leptin, especially human leptin, or huLeptin(W100Q), a human leptin mutant with a tryptophan to glutamine substitution at position 100 in the mature polypeptide chain.

Exemplary peptides and peptidomimetics include but are not limited to adrenocorticotropic hormone (ACTH), afamelanotide, alarelin, alpha 4 integrin inhibitor, anti-HIV fusion inhibitor (like enfuvirtide, V2o, SC34EK, SC35EK, IQN17 or IZN17), angiotensin II type 2 (AT2) receptor agonist (like LT2), anti-idiotypic p53 peptide, amylin, amylin analog, astressin, atosiban, bacterial peptide fragment with anticancer and anti HIV activity (like ATP-01), bicyclic peptide (like TG-758), bivalirudin, bradykinin antagonist (like icatibant), bremelanotide, B-type natriuretic peptide, calcitonin, carbetocin, carfilzomib, chrysalin, cilengitide, C-type natriuretic peptide, colostrinin, corticotrophin releasing factor (like Xerecept, coysnthropin), CNGRCG tumor homing peptide, ω-conotoxin peptide (like ziconotide), C-peptide, danegaptide, defensin, ecallantide, elcatonin, eledoisin, exenatide, exendin-4, exendin-4 analog (like exendin 9-39), ezrin peptide 1, fragments from the human matrix extracellular phosphoglycoprotein (like AC-100), galanin, gastric inhibitory polypeptide (GIP), GIP analog, glatiramer, glucagon, glucagon analog, glucagon-like peptide 1 (GLP-1), GLP-1 analog (like lixisenatide, liraglutide or semaglutide), glucagon-like peptide 2 (GLP-2), GLP-2 analog (like teduglutide), gonadorelin, gonadotropin-releasing hormone agonist (like goserelin, buserelin, triptorelin, leuprolide, protirelin, lecirelin, fertirelin or deslorelin), gonadotropin-releasing hormone antagonist (like abarelix, cetrorelix, degarelix, ganirelix or teverelix), ghrelin, ghrelin analog (like AZP-531), growth hormone-releasing hormone, growth hormone-releasing hormone analog (like sermorelin or tesamorelin), hematide, hepcidin mimetic peptide, histrelin, indolicidin, indolicidin analog (like omiganan), IgE down-regulating peptide (like SC-01), INGAP peptide (exsulin), insulin-like growth factor 1, insulin-like growth factor 2, Kv1.3 ion channel antagonist (like cgtxA, cgtxE or cgtxF), lanreotide, lectin binding peptide (like sv6B, sv6D, svC2, svH1C, svH1D or svL4), lanthipeptide, larazotide, linaclotide, lusupultide, melanocortin-4 receptor agonist (like AZD2820), MEPE-derived 23-amino acid peptide, mitochondrial-derived peptide (like MOTS-c, humanin, SHLP-6 or SHLP-2), mutant of the insulin-like growth factor binding protein-2 (like I-HBD1), Nav ion channel modulators (like GTx1-15 or VSTx3), octreotide, proprotein convertase subtilisin/kexin type 9 (PCSK9) inhibitory peptide, octreotide, oxyntomodulin, oxytocin, peptide fragment of azurin, Phylomer, peptide antagonist to the MHC Class II-associated invariant peptide (CLIP) (like VG1177), peptide derived from a heat shock protein (like enkastim), pexiganan, plovamer, pramlintide, prohibitin-targeting peptide 1, pro-islet peptide, peptide tyrosine tyrosine (PYY 3-36), RGD peptide or peptidomimetic, ramoplanin, secretin, sinapultide, somatostatin, somatostatin analog (like pasireotide or CAP-232), specifically targeted antimicrobial peptide (STAMP) (like C16G2), receptor agonist of the bone morphogenetic protein (like THR-184 or THR-575), stresscopin, surfaxin, Tc99m apcitide, teriparatide (PTH 1-34), tetracosactide, thymosin alpha 1, TLR2 inhibitory peptide, TLR3 inhibitory peptide, TLR4 inhibitory peptide, thymosin β4, thymosin β15, vasoactive intestinal peptide, vasopressin, vasopressin analog (like desmopressin, felypressin or terlypressin).

The “treatment” of a disorder or disease may, for example, lead to a halt in the progression of the disorder or disease (e.g., no deterioration of symptoms) or a delay in the progression of the disorder or disease (in case the halt in progression is of a transient nature only). The “treatment” of a disorder or disease may also lead to a partial response (e.g., amelioration of symptoms) or complete response (e.g., disappearance of symptoms) of the subject/patient suffering from the disorder or disease. Accordingly, the “treatment” of a disorder or disease may also refer to an amelioration of the disorder or disease, which may, e.g., lead to a halt in the progression of the disorder or disease or a delay in the progression of the disorder or disease. Such a partial or complete response may be followed by a relapse. It is to be understood that a subject/patient may experience a broad range of responses to a treatment (such as the exemplary responses as described herein above). The treatment of a disorder or disease may, inter alia, comprise curative treatment (preferably leading to a complete response and eventually to healing of the disorder or disease) and palliative treatment (including symptomatic relief).

The term “prevention” of a disorder or disease, as used herein, is also well known in the art. For example, a patient/subject suspected of being prone to suffer from a disorder or disease may particularly benefit from a prevention of the disorder or disease. The subject/patient may have a susceptibility or predisposition for a disorder or disease, including but not limited to hereditary predisposition. Such a predisposition can be determined by standard methods or assays, using, e.g., genetic markers or phenotypic indicators or biomarkers. It is to be understood that a disorder or disease to be prevented in accordance with the present invention has not been diagnosed or cannot be diagnosed in the patient/subject (for example, the patient/subject does not show any clinical or pathological symptoms). Thus, the term “prevention” comprises the use of a conjugate of the present invention before any clinical and/or pathological symptoms are diagnosed or determined or can be diagnosed or determined by the attending physician.

Exemplary inflammatory diseases include but are not limited to ankylosing spondylitis, arthritis, atherosclerosis, atypical hemolytic uremic syndrome (aHUS), fibromyalgia, Guillain-Barré syndrome (GBS), irritable bowel syndrome (IBS), Crohn's disease, colitis, dermatitis, diverticulitis, myasthenia gravis, osteoarthritis, psoriatic arthritis, Lambert-Eaton myasthenic syndrome, systemic lupus erythematosus (SLE), nephritis, Parkinson's disease, multiple sclerosis, paroxysmal nocturnal hemoglobinuria (PNH), rheumatoid arthritis (RA), Sjögren’s syndrome, ulcerative colitis, and the like.

Exemplary infectious diseases include but are not limited to African trypanosomiasis, borreliosis, cholera, cryptosporidiosis, dengue fever, hepatitis A, hepatitis B, hepatitis C, HIV/AIDS, influenza, Japanese encephalitis, leishmaniasis, malaria, measles, meningitis, onchocerciasis, pneumonia, rotavirus infection, schistosomiasis, sepsis, shigellosis, streptococcal tonsillitis, tuberculosis, typhoid, yellow fever, and the like.

Exemplary respiratory diseases include but are not limited to asthma, chronic obstructive pulmonary disease (COPD), cystic fibrosis, and the like.

Exemplary endocrine disorders include but are not limited to acromegaly, type I diabetes, type II diabetes, gestational diabetes, Graves' disease, growth hormone deficiency, hyperglycemia, hyperparathyroidism, hyperthyroidism, hypoglycemia, infertility, obesity, parathyroid diseases, Morquio A syndrome, mucopolysaccharidosis, and the like.

Exemplary diseases of the central nervous system include but are not limited to Alzheimer's disease, catalepsy, Huntington's disease, Parkinson's disease, and the like.

Exemplary musculoskeletal diseases include but are not limited to osteoporosis, muscular dystrophy, and the like.

Exemplary cardiovascular diseases include but are not limited to acute heart failure, cerebrovascular disease (stroke), ischemic heart disease, and the like.

Exemplary oncological diseases include but are not limited to adrenal cancer, bladder cancer, breast cancer, colon and rectal cancer, endometrial cancer, kidney cancer, acute lymphoblastic leukemia (ALL) and other types of leukemia, lung cancer, melanoma, Non-Hodgkin lymphoma, pancreatic cancer, prostate cancer, thyroid cancer, and the like.

Exemplary urogenital diseases include but are not limited to benign prostatic hyperplasia (BPH), hematuria, neurogenic bladder, Peyronie's disease, and the like.

Exemplary metabolic diseases include but are not limited to diabetes, adipositas, Gaucher disease, Fabry disease, Growth hormone deficiency, Hurler syndrome, Hunter syndrome, hyperoxaluria, neuronal ceroid lipofuscinosis, Maroteaux-Lamy syndrome, Morquio syndrome, Noonan syndrome, SHOX gene haploinsufficiency, Turner syndrome, Prader-Willi syndrome, phenylketonuria, Sanfilippo syndrome, and the like.

Exemplary diseases of the eye include but are not limited to eye surface inflammation, dry age-related macular degeneration (e.g., geographic atrophy), wet age-related macular degeneration (e.g., choroidal neovascularisation), retinopathy of prematurity, uveitis (e.g., autoimmune uveitis, infective uveitis), optic neuritis (e.g. glaucoma associated optic neuritis), diabetic retinopathy, diabetic macular oedema, retinal vein occlusion, and the like.

It is to be understood that the present invention specifically relates to each and every combination of features and embodiments described herein, including any combination of general and/or preferred features/embodiments. In particular, the invention specifically relates to each combination of meanings (including general and/or preferred meanings) for the various groups and variables comprised in the PAS polypeptides and PA polypeptides and the conjugates according to the invention.

In this specification, a number of documents including patents, patent applications and scientific literature are cited. The disclosure of these documents, while not considered relevant for the patentability of this invention, is herewith incorporated by reference in its entirety. More specifically, all referenced documents are incorporated by reference to the same extent as if each individual document was specifically and individually indicated to be incorporated by reference.

The reference in this specification to any prior publication (or information derived therefrom) is not and should not be taken as an acknowledgment or admission or any form of suggestion that the corresponding prior publication (or the information derived therefrom) forms part of the common general knowledge in the technical field to which the present specification relates.

As used herein, the terms “comprising” and “including” or grammatical variants thereof are to be taken as specifying the stated features, integers, steps or components but do not preclude the addition of one or more additional features, integers, steps, components or groups thereof. These terms encompass the terms “consisting of” and “consisting essentially of”.

Thus, the terms “comprising”/“including”/“having” mean that any further component (or likewise features, integers, steps and the like) can/may be present. Thus, whenever the terms “comprising”/“including”/“having” are used herein, they can be replaced by “consisting essentially of” or, preferably, by “consisting of”.

The term “consisting of” means that no further component (or likewise features, integers, steps and the like) is present.

The term “consisting essentially of” or grammatical variants thereof, when used herein, is to be taken as specifying the stated features, integers, steps or components without precluding the addition of one or more additional features, integers, steps, components or groups thereof, but only if the additional features, integers, steps, components or groups thereof do not materially alter the basic and novel characteristics of the claimed composition, device or method.

Thus, the term “consisting essentially of” means that specific further components (or likewise features, integers, steps and the like) can be present, namely those not materially affecting the essential characteristics of the composition, device or method. In other words, the term “consisting essentially of” (which can be interchangeably used herein with the term “comprising substantially”) allows the presence of other components in the composition, device or method in addition to the mandatory components (or likewise features, integers, steps and the like), provided that the essential characteristics of the device or method are not materially affected by the presence of other components.

The term “production” may in the context of the present invention be used interchangeably with the term “preparation” or “manufacturing”. Accordingly, the terms “producing”, “produces” and “produce” may herein be interchanged with “preparing”, “prepares” and “prepare”.

The term “method” refers to manners, means, techniques and procedures for accomplishing a given task including, but not limited to, those manners, means, techniques and procedures either known to, or readily developed from known manners, means, techniques and procedures, by practitioners of the chemical, biological and biophysical arts.

As used herein and if not indicated otherwise, the term “about” preferably refers to ±10 % of the indicated numerical value, more preferably to ±5 % of the indicated numerical value, and in particular to the exact numerical value indicated.
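The tolerance bands attached to the term “about” can be illustrated with a short Python sketch. This is an illustration only; the function name `within_about` and its `tolerance` parameter are hypothetical and are not part of the specification.

```python
def within_about(value, nominal, tolerance=0.10):
    """Return True if `value` lies within +/- `tolerance` of `nominal`.

    tolerance=0.10 mirrors the "preferably +/-10 %" reading of "about",
    tolerance=0.05 the "more preferably +/-5 %" reading, and
    tolerance=0.0 the "exact numerical value" reading.
    """
    return abs(value - nominal) <= tolerance * abs(nominal)


if __name__ == "__main__":
    # "about 100 amino acid residues" under the three readings
    for tol in (0.10, 0.05, 0.0):
        print(tol, within_about(93, 100, tol))
```

Under the ±10 % reading a value of 93 counts as “about 100”, whereas under the ±5 % and exact readings it does not.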

The present invention is further described by reference to the following non-limiting figures and examples. Unless otherwise indicated, established methods of recombinant gene technology were used as described, for example, in Sambrook (2001) loc. cit. which is incorporated herein by reference in its entirety.

In the following table (Table 3), the herein disclosed, non-limiting amino acid sequences (“AAS”; SEQ IDs NO: 1 to 35, 259, 262, 263, 265, and 266) and non-limiting nucleic acid sequences (“NAS”; SEQ IDs NO: 36 to 256, 257, 258, 260, 261, and 264) are also reproduced. All amino acid sequences are provided in the one letter code. Further, the letter “X” at position 1 of an amino acid sequence indicates an N-terminal pyroglutamate (“Pga” or “pyrrolidone carboxylic acid”; used herein synonymously) protecting group. In the description (column 2 of Table 3) “K” or “-K” indicates the presence of a cleavable linker consisting of a single lysine (K) or of a nucleic acid sequence encoding a cleavable linker consisting of a single lysine residue. Further, in said description “Q” indicates an N-terminal blocking residue consisting of a single glutamine residue (Q) or indicates a nucleic acid sequence encoding an N-terminal blocking residue consisting of a single glutamine residue (Q). “C” indicates an N-terminal or C-terminal residue useful for chemical conjugation consisting of a single cysteine residue (C) or indicates a nucleic acid sequence encoding an N-terminal or C-terminal residue useful for chemical conjugation consisting of a single cysteine residue (C). As has been detailed herein above, the amino acid sequences provided herein (and, accordingly, also in Table 3) may each be encoded by multiple nucleic acid sequences having low nucleic acid sequence redundancy. Accordingly, in Table 3 nucleic acid sequences encoding the same polypeptide are additionally labeled with lower case letters (e.g., “_a”, “_b”, “_c”, etc.) at the end of the respective description.

Table 3: Herein disclosed, non-limiting amino acid sequences

The present invention is further described by reference to the following non-limiting figures. The figures show:

Figure 1: Schematic representation of PAS concatemers. Recombinant concatenated polypeptides comprising PAS peptides with a length of 20 amino acids (top), 40 amino acids (center) or 100 amino acids (bottom) were produced. PAS peptides are linked by lysine residues (K), serving as cleavage site for the endopeptidase trypsin, followed by processing with carboxypeptidase B. Additionally, each PAS peptide is equipped with a glutamine (Q) residue at the N-terminus, which, after being chemically transformed to pyroglutamate, serves as N-terminal protection group. Also linked via lysine to the PAS concatemer, the produced concatemer comprises an E. coli thioredoxin A (TrxA) at its N-terminus, serving as a carrier protein. At the C-terminus, the concatemer is equipped with a hexahistidine tag (His) for immobilized metal ion affinity chromatography.

Figure 2: Plasmid map of the expression plasmid pD441-SR-TrxA-K(QPAS40K)5His6 and the amino acid sequence of the corresponding gene product (TrxA-K(QPAS40K)5His6). The vector facilitates the expression of a fusion protein comprising the E. coli thioredoxin A (TrxA), containing an additional Lys residue directly downstream of the start Met residue as well as an exchange of Met at position 39 by Gln (M39Q), followed by a lysine (Lys), the Gln-PAS40 peptide concatemer with a C-terminal lysine (QPAS40K)5 and a hexahistidine tag (His6) under the control of an inducible T5 promoter. The vector further comprises the 5’ untranslated region of the gene of the bacteriophage T7 capsid protein 10A (5’UTR_T7_10A), a strong ribosome binding site (RBS) and the terminator from the E. coli rrnB gene (Term_rrnB) at its 3’ end. Furthermore, the vector contains a kanamycin resistance gene (KanR) under the control of the beta-lactamase promoter (P Amp), a gene encoding the LacI repressor under the control of the lacI promoter (P lacI) and the origin of replication from the pUC plasmid (Ori_pUC), followed by the terminators from the rpoC gene (Term rpoC) and the beta-lactamase gene (Term bla). The depicted amino acid sequence corresponds to the amino acid sequence of TrxA-K(QPAS40K)5His6. The QPAS40 sequence stretches are highlighted bold.

Figure 3: Purification of PAS concatemers by immobilized metal ion affinity chromatography. (A) The chromatogram (top) from the IMAC of the TrxA-K(QPAS40K)5-His6 fusion protein shows a small peak of host cell proteins in the flow through and a large peak, containing the recombinant fusion protein, eluting within the concentration gradient from 10 to 400 mM imidazole. The Coomassie-stained SDS-PAGE gel (bottom) shows collected fractions from the flow through (FT) and the elution fraction of the TrxA-K(QPAS40K)5-His6 fusion protein, exhibiting a strong protein band with high purity. (B, C) Coomassie-stained SDS-PAGE gels from the IMAC of (B) the TrxA-K(QPAS20K)10-His6 fusion protein and of (C) the TrxA-K(QPAS100K)2-His6 fusion protein show similar results. The PageRuler broad range molecular weight ladder (ThermoFisher) was used as marker (M).

Figure 4: Enzymatic release of QPAS40 polypeptides from the concatenated polypeptides. The individual PAS polypeptides, each with an N-terminal glutamine (Q) residue, are linked to each other and to the carrier protein (TrxA) as well as the hexahistidine tag (His) via lysine (K) residues. Trypsin is used to cleave the peptide bonds after each lysine (black arrows), releasing the QPAS40 polypeptides carrying the C-terminal lysine residues. To cleave off the C-terminal lysine, carboxypeptidase B (Cpb) is used (white arrows), finally releasing the QPAS40 polypeptides with the desired sequence.
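The two-step enzymatic release outlined for Figure 4 can be illustrated with a minimal Python sketch. It is an illustration only and not the procedure used in the examples: the carrier and the PAS stretch are arbitrary placeholder strings (not sequences from Table 3), all function names are hypothetical, and the trypsin rule is simplified to cleavage after every lysine or arginine.

```python
# Placeholder building blocks (hypothetical; not sequences from Table 3).
CARRIER = "MGGGGG"       # stand-in for the carrier protein, chosen free of K/R
PAS_UNIT = "APSA" * 5    # arbitrary 20-residue stretch of P, A and S
HIS_TAG = "HHHHHH"


def build_concatemer(repeats):
    """Carrier-K-(Q-PAS-K)n-His6 layout as described for the concatemers."""
    return CARRIER + "K" + ("Q" + PAS_UNIT + "K") * repeats + HIS_TAG


def trypsin_digest(sequence):
    """Simplified trypsin: cut the chain after every K or R."""
    fragments, start = [], 0
    for index, residue in enumerate(sequence):
        if residue in "KR":
            fragments.append(sequence[start:index + 1])
            start = index + 1
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments


def carboxypeptidase_b(fragment):
    """Simplified carboxypeptidase B: trim C-terminal basic residues."""
    return fragment.rstrip("KR")


def release_units(sequence):
    """Digest, trim, and keep only the Q-PAS fragments."""
    trimmed = [carboxypeptidase_b(f) for f in trypsin_digest(sequence)]
    return [f for f in trimmed if f == "Q" + PAS_UNIT]


if __name__ == "__main__":
    units = release_units(build_concatemer(5))
    print(len(units), units[0])
```

In this sketch the carrier stand-in contains no basic residues; a real carrier such as TrxA contains internal basic residues of its own, so trypsin would be expected to fragment it as well.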

Figure 5: N-terminal protection of PAS polypeptides. The reaction scheme of an exemplary PAS peptide shows how the N-terminal glutamine cyclizes (under release of ammonia, NH3) in the presence of acid, e.g. acetic acid, to pyroglutamic acid (Pga), which serves as a physiological N-terminal blocking group.
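Because the cyclization releases one molecule of ammonia, the conversion of an N-terminal glutamine to pyroglutamate lowers the peptide mass by the mass of NH3 (about 17.03 Da). The following Python sketch illustrates this bookkeeping; the function names are hypothetical, the atomic masses are rounded standard average values, and the example input of 1781.0 Da is taken from the calculated Pga-PAS20 mass stated for Figure 6.

```python
# Rounded standard average atomic masses in Da (assumed values for illustration).
MASS_N = 14.007
MASS_H = 1.008
MASS_NH3 = MASS_N + 3 * MASS_H  # ammonia released on cyclization


def pga_mass_from_gln(gln_peptide_mass):
    """Average mass after the N-terminal Gln has cyclized to pyroglutamate."""
    return gln_peptide_mass - MASS_NH3


def gln_mass_from_pga(pga_peptide_mass):
    """Average mass of the corresponding uncyclized Gln peptide."""
    return pga_peptide_mass + MASS_NH3


if __name__ == "__main__":
    # Expected mass of the uncyclized precursor of a 1781.0 Da Pga peptide.
    print(round(gln_mass_from_pga(1781.0), 1))
```

A mass difference of this size between precursor and product is what a mass spectrum would be expected to show if the cyclization is complete.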

Figure 6: Characterization of biotechnologically produced PAS polypeptides. Deconvoluted spectra from ESI-mass spectrometry show uniform mass distribution. The measured masses (annotated) correlate well with the calculated masses of Pga-PAS20 (1781.0), Pga-PAS40 (3432.8) and Pga-PAS100 (8388.3).

Figure 7: Plasmid map of the expression plasmid pD441-SR-TrxA-(QPAS20CK/R)10-His6 and the amino acid sequence of the corresponding gene product (TrxA-K(QPAS20CK/R)10-His6). The vector facilitates the expression of a fusion protein comprising the E. coli thioredoxin A (TrxA), containing an additional Lys residue directly downstream of the start Met residue as well as an exchange of Met at position 39 by Gln (M39Q), followed by a lysine (Lys), the Gln-PAS20C peptide concatemer with a C-terminal arginine (QPAS20CK/R)10, a threonine (Thr) and a hexahistidine tag (His6) under the control of an inducible T5 promoter. The vector further comprises the 5' untranslated region of the gene of the bacteriophage T7 capsid protein 10A (5'UTR_T7_10A), a strong ribosome binding site (RBS) and the terminator from the E. coli rrnB gene (Term rrnB) at its 3' end. Furthermore, the vector contains a kanamycin resistance gene (KanR) under the control of the beta-lactamase promoter (P Amp), a gene encoding the LacI repressor under the control of the lacI promoter (P lacI) and the origin of replication from the pUC plasmid (Ori_pUC), followed by the terminators from the rpoC gene (Term rpoC) and the beta-lactamase gene (Term bla). The depicted amino acid sequence corresponds to the amino acid sequence of TrxA-K(QPAS20CK/R)10-His6. The QPAS20C sequence stretches are highlighted in bold.

Figure 8: Protein expression of the TrxA-(QPAS20CK/R)10-His6 concatemer. A dominant band in the Coomassie-stained SDS-PAGE gel (reducing conditions) of the E. coli cell extract (CE) and in the ammonium sulfate precipitate (AS) indicates the strong over-expression of the TrxA-(QPAS20CK/R)10-His6 fusion protein. The PageRuler broad range molecular weight ladder (ThermoFisher) was used as marker (M).

Figure 9: Characterization of biotechnologically produced Pga-PAS20C peptide. Deconvoluted mass spectrum from ESI-mass spectrometry shows a distinct main species (annotated) corresponding to the calculated mass of the Pga-PAS20C monomer (1884.0 Da) and a second mass with low abundance corresponding to the calculated mass of the disulphide-linked Pga-PAS20C dimer (3766.0 Da).

Examples

Certain embodiments of the invention are described with reference to the following examples, which are intended for the purpose of illustration only and are not intended to limit the scope of the generality of the description hereinbefore.

Example 1: Construction and cloning of an expression plasmid for the production of PAS peptides

The expression vector pD441-SR (ATUM) was equipped with a multiple cloning site by ligating the SapI and XbaI cleaved vector backbone with the annealed, phosphorylated and overlapping oligonucleotides MCS1 (SEQ ID NO: 250) and MCS2 (SEQ ID NO: 251), both purchased from an oligonucleotide synthesis provider (Eurofins Scientific). The resulting vector was named pD441-SR-MCS (SEQ ID NO: 252).

A synthetic DNA fragment (DNA: SEQ ID NO: 248) encoding Thioredoxin A (TrxA; SEQ ID NO: 25) from E. coli, containing an additional Lys residue directly downstream of the start Met residue as well as an exchange of Met at position 39 by Gln (M39Q), was purchased from a gene synthesis provider (Thermo Fisher Scientific). This gene fragment (SEQ ID NO: 248) encodes an XbaI (TCTAGA) restriction site, followed by a ribosomal binding site (AGGAGGTAA), the coding sequence of TrxA followed by a lysine codon (AAA), a glutamine codon (CAG), an alanine codon (GCC), a first SapI recognition sequence (GCTCTTC) on the non-coding strand, an 8-nucleotide spacer (TCCTCAGC), and a second SapI restriction site in reverse complementary orientation, with its recognition sequence (GCTCTTC) on the coding strand, followed by an alanine codon (GCC), a lysine codon (AAG), a hexahistidine (His6) encoding sequence (DNA: SEQ ID NO: 249; Protein: SEQ ID NO: 26) and, finally, a HindIII restriction site (AAGCTT). This gene fragment was digested with the restriction enzymes XbaI and HindIII, purified using the Monarch DNA cleanup kit (New England Biolabs) and subcloned in the likewise digested expression vector pD441-SR-MCS using the Electra cloning system (ATUM).

The resulting plasmid pD441-SR-TrxA-SapI-His6 (SEQ ID NO: 253) was digested with SapI, which led to a vector backbone with 5'-GCC/5'-GGC sticky ends directly at the position downstream of the C-terminus of TrxA. These sticky ends allow for the insertion of low-repetitive nucleic acid molecules encoding a concatenated polypeptide/concatemer, termed PAS gene cassettes herein below. The vector backbone fragment was isolated using the Promega Wizard gel extraction kit (Promega) and dephosphorylated with the thermosensitive alkaline phosphatase FastAP (Thermo Fisher Scientific), both according to the manufacturer's instructions.
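
The origin of the 5'-GCC/5'-GGC sticky ends can be illustrated with a minimal sketch. SapI is a type IIS enzyme that cuts 1 nucleotide (top strand) and 4 nucleotides (bottom strand) downstream of its recognition sequence GCTCTTC, leaving a 3-nucleotide 5' overhang. The toy cloning site below is assembled only from the elements named in Example 1; the single spacer nucleotide "T" between each codon and the adjacent SapI site, the variable name TOY_SITE and the helper functions are assumptions made for illustration and are not taken from SEQ ID NO: 248.

```python
SAPI_SITE = "GCTCTTC"  # SapI recognition sequence; cuts 1 nt / 4 nt downstream


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def backbone_overhangs(top):
    """Return the 5' overhangs left on the two vector arms after SapI
    excises the stuffer between two outward-cutting SapI sites.

    `top` is the coding strand (5'->3'). The first site lies on the
    non-coding strand, the second site on the coding strand.
    """
    rev_site = revcomp(SAPI_SITE)
    j = top.index(rev_site)                       # first site (non-coding strand)
    i = top.index(SAPI_SITE, j + len(rev_site))   # second site (coding strand)
    # First site cuts towards the upstream arm: the 3-nt overhang spans
    # top[j-4:j-1] and is carried by the bottom strand of the upstream arm.
    upstream_arm = revcomp(top[j - 4:j - 1])
    # Second site cuts towards the downstream arm: the overhang spans
    # top[i+8:i+11] and is carried by the top strand of the downstream arm.
    downstream_arm = top[i + 8:i + 11]
    return upstream_arm, downstream_arm


# Toy site: Lys-Gln codons, Ala codon, assumed 1-nt spacer, first SapI site
# (reverse orientation), 8-nt spacer, second SapI site, assumed 1-nt spacer,
# Ala codon, Lys codon.
TOY_SITE = ("AAACAG" + "GCC" + "T" + revcomp(SAPI_SITE)
            + "TCCTCAGC" + SAPI_SITE + "T" + "GCC" + "AAG")

print(backbone_overhangs(TOY_SITE))  # ('GGC', 'GCC')
```

Because both overhangs are derived from the flanking alanine codons (GCC), a cassette cut with SapI to give matching ends can only ligate in one orientation and leaves no additional residues between TrxA, the concatemer and the tag.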

Synthetic DNA fragments comprising PAS gene cassettes encoding 10 QPAS20K polypeptides (the encoded polypeptide is herein below also referred to as (QPAS20K)10 polypeptide or gene cassette, respectively; SEQ ID NO: 14; SEQ ID NO: 37), 5 QPAS40K polypeptides (the encoded polypeptide is herein below also referred to as (QPAS40K)5 polypeptide or gene cassette, respectively; SEQ ID NO: 16; SEQ ID NO: 39), or 2 QPAS100K polypeptides (the encoded polypeptide is herein below also referred to as (QPAS100K)2 polypeptide or gene cassette, respectively; SEQ ID NO: 18; SEQ ID NO: 41) were purchased from a gene synthesis provider (Thermo Fisher Scientific). These synthetic PAS gene cassettes (SEQ IDs NO: 254 to 256) were digested with the restriction enzyme SapI, leading to 5'-GCC/5'-GGC sticky ends compatible with the cleaved vector backbone, then purified using the Monarch DNA cleanup kit (New England Biolabs), and finally ligated with the SapI cleaved and dephosphorylated pD441-SR-TrxA-SapI-His6 vector backbone from above.

The three resulting plasmids allow for the bacterial expression of fusion proteins consisting of a TrxA protein fused C-terminally to a concatemer/concatenated polypeptide comprising two or more random coil polypeptides linked by a lysine residue (K; herein acting as a cleavable linker) and a glutamine residue (Q; herein acting as an N-terminal blocking residue), and a C-terminal His₆ tag linked to said concatemer/concatenated polypeptide via a lysine residue (Figure 1, Protein: SEQ IDs NO: 27 to 29). Figure 2 illustrates the plasmid map of the IPTG-inducible expression plasmid pD441-SR-TrxA-K(QPAS40K)₅-His₆ (SEQ ID NO: 51) and the amino acid sequence of the corresponding gene product (TrxA-K(QPAS40K)₅-His₆; SEQ ID NO: 27). In the same manner, the pD441-SR-TrxA-K(QPAS20K)₁₀-His₆ (SEQ ID NO: 49) and the pD441-SR-TrxA-K(QPAS100K)₂-His₆ (SEQ ID NO: 53) plasmids were generated. These two latter expression vectors were used similarly to the pD441-SR-TrxA-K(QPAS40K)₅-His₆ expression vector in the recombinant production of the encoded gene products (TrxA-K(QPAS20K)₁₀-His₆, SEQ ID NO: 28; TrxA-K(QPAS100K)₂-His₆, SEQ ID NO: 29), as described herein below.

Example 2: Fermenter production and purification of the TrxA-K(QPAS40K)5-His6 concatemer

The TrxA-K(QPAS40K)5-His6 fusion protein (SEQ ID NO: 27; calculated mass: 32 kDa) was produced at 25 °C in E. coli NEBExpress cells (New England Biolabs) harboring the expression plasmid pD441-SR-TrxA-K(QPAS40K)5-His6 (Figure 2; SEQ ID NO: 51) from Example 1 using an 8 l bench top fermenter with a synthetic glucose mineral medium supplemented with 100 mg/l kanamycin according to a published procedure (Schiweck (1995) Proteins 23: 561-565). The O₂ saturation was maintained at 60 % and the pH was maintained at 6.9 by automated dosing of ammonia solution (25 %). The growth temperature was 25 °C. Recombinant gene expression was induced by addition of IPTG to a final concentration of 1 mM as soon as the culture reached OD550 = 80. After an induction period of 4 h, the bacteria were harvested by centrifugation, and the resulting approximately 1 kg wet cell paste was suspended in 2.8 l extraction buffer (100 mM citric acid). Finally, the E. coli cells were disrupted using a PandaPlus 2000 lab homogenizer (GEA).

The raw cell extract was cleared by centrifugation (12,200 x g, 4 °C, 30 min) and the supernatant was passed through a sterile filter membrane (0.2 µm PES). The TrxA-K(QPAS40K)5-His6 concatemer was precipitated by addition of ammonium sulfate to a saturation of 30 % (approximately 1.3 M final concentration) at 20 °C. After centrifugation, the supernatant was removed and the precipitate was resuspended in IMAC running buffer (40 mM NaPi, 0.5 M NaCl, pH 7.5). Immobilized metal ion affinity chromatography (IMAC) was performed on a HisTrap HP column (Cytiva) in the IMAC running buffer. After washing with running buffer until the baseline was reached, the TrxA-K(QPAS40K)5-His6 concatemer was eluted by applying a concentration gradient of 10-400 mM imidazole/HCl in IMAC running buffer (Figure 3A). In the same manner, the TrxA-K(QPAS20K)10-His6 (Figure 3B; SEQ ID NO: 28) and TrxA-K(QPAS100K)2-His6 concatemers (Figure 3C; SEQ ID NO: 29) were produced and purified.

From Figure 3 it is evident that the purification of TrxA-K(QPAS40K)5-His6, TrxA-K(QPAS20K)10-His6, and TrxA-K(QPAS100K)2-His6 concatemers results in highly pure concatenated polypeptides. This is exemplary also for other (PAS and/or PA) concatemers.

Example 3: Enzymatic cleavage of the TrxA-K(QPAS40K)5-His6 concatemer and N-terminal pyroglutamate formation

The IMAC-purified concatenated TrxA-K(QPAS40K)5-His6 polypeptide/fusion protein/concatemer (SEQ ID NO: 27) solution (protein concentration: ~8 mg/ml) from Example 2 was dialyzed against 20 mM Tris/HCl buffer (pH 9.0). To release the individual PAS polypeptides (i.e., units/elements/parts/pieces/repeats) from the concatemer and also to remove the TrxA carrier protein and the His6-tag, 61.5 µg trypsin (recombinant from E. coli, HiMedia Laboratories) per 1 mg TrxA-K(QPAS40K)5-His6 concatemer was added, resulting in the specific proteolytic cleavage on the C-terminal side of the lysine residues serving as cleavage sites. Furthermore, 0.2 µg carboxypeptidase B (CpB, recombinant from E. coli, ProSpec-Tany TechnoGene) per 1 mg TrxA-K(QPAS40K)5-His6 concatemer was immediately added to the reaction mixture in order to cleave the C-terminal lysine from the released QPAS40K polypeptides (Figure 4). The reaction was incubated at 37 °C for 4 h. The resulting QPAS40 random coil polypeptides (SEQ ID NO: 31) were then purified on a Source RPC column (PS-DVB resin; Cytiva) using 2 % (v/v) acetonitrile and 0.1 % (v/v) formic acid as mobile phase. The peptides were eluted with a gradient of 2-80 % (v/v) acetonitrile.
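
The two-enzyme release described above can be sketched in silico. The following is a minimal illustration only: the carrier stand-in CARRIER, the 12-residue repeat UNIT and the function names are hypothetical placeholders and are not taken from the sequence listing. Trypsin is modelled with the common rule of cleaving C-terminally of K or R unless the next residue is proline, and carboxypeptidase B as removing C-terminal K or R residues.

```python
def trypsin_digest(seq):
    """Split a sequence C-terminally of each K/R that is not followed by P."""
    fragments, start = [], 0
    for i in range(len(seq) - 1):          # never cut after the last residue
        if seq[i] in "KR" and seq[i + 1] != "P":
            fragments.append(seq[start:i + 1])
            start = i + 1
    fragments.append(seq[start:])
    return fragments


def cpb_trim(fragment):
    """Remove C-terminal basic residues (K/R), as carboxypeptidase B does."""
    return fragment.rstrip("KR")


# Hypothetical building blocks (placeholders, not the patent's sequences).
CARRIER = "MKSDEGV"        # stand-in for the TrxA carrier
UNIT = "ASPAAPAPASPA"      # stand-in for one PAS repeat (no K or R)
TAG = "HHHHHH"             # hexahistidine tag

# Carrier-K-(Q-PAS-K)5-His6, mirroring the layout of the concatemer.
concatemer = CARRIER + "K" + ("Q" + UNIT + "K") * 5 + TAG

trimmed = [cpb_trim(f) for f in trypsin_digest(concatemer)]
released = [f for f in trimmed if f.startswith("Q")]
print(len(released), released[0])
```

In this toy model the five Q-initiated units are recovered without their C-terminal lysine, while the carrier and tag fragments fall out as separate peptides that the subsequent reversed-phase step would have to remove.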

For applications that require C-terminal chemical activation of the PAS polypeptides, their N-terminal amino group has to be blocked in order to prevent chemical side reactions. To this end, a glutamine (Gln; Q) residue was introduced at the genetic level at the N-terminus of each PAS peptide (hence, QPAS). Chemical cyclization of the N-terminal glutamine results in the formation of the pyroglutamate group (Pga) (Figure 5). Pga is a chemically inert physiological blocking group (an N-terminal protecting group), which is a common natural post-translational modification in mammals including humans. Conversion of glutamine (Gln; Q) to Pga was achieved by adding acetic acid to the RPC eluate to a final concentration of 1 % (v/v) and incubation at 50 °C for 72 h. To finally remove solvents (ACN, H2O) and acids (acetic acid, formic acid), a rotary evaporator was used. Toluene forms azeotropes with all of the used solvents/acids and was therefore used as entrainer. It is essential for any subsequent coupling reactions that the acids are completely removed. After complete evaporation, the N-terminally protected PAS polypeptide forms a film on the inner surface of the glass flask. After that, the Pga-PAS random coil polypeptide was dissolved in methanol (to achieve a concentration of 20-100 mg/ml) and filtered using a 1 µm glass fiber syringe filter (Acrodisc, Pall Corporation, NY, USA). Subsequently, 8 vol. diethyl ether was added to the filtrate. The precipitated Pga-PAS40 random coil polypeptide (SEQ ID NO: 34) was collected by centrifugation (5,000 x g, 20 min) and washed twice with diethyl ether. Residual solvent was evaporated using a SpeedVac concentrator (30 °C, overnight). For chemical analysis, 10 µg of the Pga-PAS40 polypeptide was dissolved in H2O containing 0.1 % formic acid, and electrospray ionization-mass spectrometry (ESI-MS) was performed using a maXis Q-TOF instrument (Bruker Daltonics) in the positive ion mode.

The TrxA-K(QPAS20K)10-His6 concatemer and the TrxA-K(QPAS100K)2-His6 concatemer were cleaved in the same manner, resulting in the production of QPAS20 random coil polypeptides (SEQ ID NO: 28) and QPAS100 random coil polypeptides (SEQ ID NO: 29). Pga formation (N-terminal protection) of the QPAS20 polypeptide (SEQ ID NO: 28) was accomplished as described above for the QPAS40 polypeptide. To promote Pga formation for the QPAS100 polypeptide (SEQ ID NO: 29), a higher acetic acid concentration of 25 % (v/v) was used. Final preparation and analysis of the N-terminally protected Pga-PAS20 and Pga-PAS100 random coil polypeptides (SEQ ID NO: 33 and 35, respectively) were performed as described above for Pga-PAS40 (SEQ ID NO: 34).
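
Since cyclization of the N-terminal glutamine releases one molecule of ammonia, conversion to Pga can be followed in ESI-MS as a mass decrease of about 17 Da per peptide. The short sketch below back-calculates the expected masses of the uncyclized QPAS precursors from the calculated Pga-PAS masses quoted in the legend of Figure 6; the rounded ammonia mass of 17.03 Da and the function name are assumptions made for illustration.

```python
NH3_MASS = 17.03  # Da; approximate mass of the ammonia lost on cyclization

# Calculated masses of the Pga-protected peptides as quoted for Figure 6 (Da).
PGA_MASSES = {"Pga-PAS20": 1781.0, "Pga-PAS40": 3432.8, "Pga-PAS100": 8388.3}


def gln_precursor_mass(pga_mass):
    """Expected mass of the uncyclized Gln peptide for a given Pga peptide."""
    return round(pga_mass + NH3_MASS, 1)


for name, mass in PGA_MASSES.items():
    # A residual signal near the precursor mass would indicate incomplete
    # conversion of glutamine to pyroglutamate.
    print(f"{name}: {mass} Da (Pga form), {gln_precursor_mass(mass)} Da (Gln form)")
```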

Figure 4 schematically illustrates the cleavage of the TrxA-K(QPAS40K)5-His6 concatemer using trypsin and CpB. Figure 5 schematically illustrates the N-terminal protection of a QPAS peptide via acid-catalyzed pyroglutamate (Pga) formation. Cleavage and N-terminal blocking as disclosed herein are exemplary and can also be used for TrxA-K(QPAS20K)10-His6, TrxA-K(QPAS100K)2-His6 and other (PAS and/or PA) concatemers comprising analogous cleavable linkers. The ESI-MS analyses shown in Figure 6 clearly demonstrate the uniformity and purity of the biotechnologically produced N-terminally protected PAS random coil polypeptides. This is exemplary also for other (PAS and/or PA) random coil polypeptides obtained by the herein provided means and methods.

Example 4: Production of Pga-PAS20 peptides with C-terminal cysteine (Pga-PAS20C)

The expression vector pD441-SR-TrxA-SapI-His6 (SEQ ID NO: 253) was digested with BbvCI and HindIII to release a 43 bp fragment comprising the second SapI restriction site of its multiple cloning site, a lysine codon (AAG) and the hexahistidine (His6) coding sequence (see Example 1). Two synthetic oligonucleotides, X1 (SEQ ID NO: 257) and X2 (SEQ ID NO: 258), were annealed and phosphorylated using T4 polynucleotide kinase (NEB), resulting in a double-stranded DNA insert containing a SapI recognition sequence (GCTCTTC) followed by a nucleotide sequence encoding the amino acid sequence AlaCysArgThr fused with a hexahistidine (His6) tag (SEQ ID NO: 259). The protruding ends of this DNA insert were compatible with the sticky ends of the pD441-SR-TrxA-SapI-His6 vector after cleavage with BbvCI and HindIII. The corresponding vector backbone was purified using the Monarch DNA cleanup kit (New England Biolabs) and ligated with the synthetic DNA fragment, resulting in the expression vector pD441-SR-TrxA-SapI-CRT-His6 (SEQ ID NO: 260).

A synthetic DNA fragment (SEQ ID NO: 261) was digested with the restriction enzyme SapI, purified using the Monarch DNA cleanup kit (New England Biolabs) and ligated with the dephosphorylated vector backbone of pD441-SR-TrxA-SapI-CRT-His6 obtained after cleavage with SapI. The insertion of this DNA fragment completed the open reading frame for the TrxA-(QPAS20CK/R)10-His6 fusion protein (SEQ ID NO: 262), which comprises a concatemer of 10 QPAS20C peptides, separated from each other by lysine residues (SEQ ID NO: 263). Figure 7 illustrates the plasmid map of the resulting IPTG-inducible expression plasmid pD441-SR-TrxA-(QPAS20CK/R)10-His6 (SEQ ID NO: 264) and the amino acid sequence of the corresponding gene product (TrxA-(QPAS20CK/R)10-His6).

The expression plasmid pD441-SR-TrxA-(QPAS20CK/R)10-His6 was used for the fermenter production of the TrxA-(QPAS20CK/R)10-His6 fusion protein in E. coli NEBExpress cells (New England Biolabs). Fermentation, cell disruption and clearing of the cell extract were performed as described in Example 2. The TrxA-(QPAS20CK/R)10-His6 concatemer was precipitated by addition of ammonium sulfate to a saturation of 40 % at 20 °C. After centrifugation, the supernatant was removed and the sediment was resuspended in IMAC running buffer (40 mM NaPi, 0.5 M NaCl, pH 7.5). Figure 8 shows an SDS-PAGE gel of the bacterial raw extract before and after ammonium sulfate precipitation. To dissolve disulphide-crosslinked TrxA-(QPAS20CK/R)10-His6 polymer, tris(2-carboxyethyl)phosphine (TCEP) was dissolved in the supernatant to a final concentration of 7 mM and the solution was incubated for 1 h at 4 °C, followed by dialysis overnight at 4 °C against IMAC running buffer (regenerated cellulose dialysis tubing, 12 kDa MW cut-off; Biomol, Hamburg, Germany). Insoluble particles were removed from the dialyzed solution by centrifugation (12,200 x g, 4 °C, 30 min) and the supernatant was subjected to IMAC on a HisTrap HP column (Cytiva) in IMAC running buffer. The TrxA-(QPAS20CK/R)10-His6 concatemer was eluted by applying a concentration gradient of 10-400 mM imidazole/HCl in IMAC running buffer. The eluate was dialyzed against 20 mM Tris/HCl buffer (pH 9.0) and dithiothreitol (DTT) was added to a final concentration of 10 mM to prevent disulphide crosslinking.

To release the single PAS peptides (i.e., units/elements/parts/pieces/repeats) from the concatemer and also to remove the TrxA carrier protein and the His6-tag, 61.5 µg trypsin (recombinant from E. coli, HiMedia Laboratories, Thane, India) together with 0.2 µg carboxypeptidase B (CpB, recombinant from E. coli, ProSpec-Tany TechnoGene, Rehovot, Israel) were added per 1 mg TrxA-(QPAS20CK/R)10-His6 concatemer, resulting in the specific proteolytic cleavage C-terminally to the lysine or arginine residue, and the subsequent cleavage of the C-terminal lysine or arginine from the released QPAS20C peptides. The reaction was incubated at 4 °C for 24 h.
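
The enzyme doses are stated per milligram of concatemer, so scaling to a given batch is simple arithmetic. The helper below is a minimal sketch of that calculation; the function name and the 100 mg batch size are hypothetical and serve only as an example.

```python
TRYPSIN_UG_PER_MG = 61.5  # µg trypsin per mg concatemer (Examples 3 and 4)
CPB_UG_PER_MG = 0.2       # µg carboxypeptidase B per mg concatemer


def enzyme_amounts(concatemer_mg):
    """Return the trypsin and CpB amounts (in µg) for a batch of concatemer."""
    if concatemer_mg < 0:
        raise ValueError("concatemer mass must be non-negative")
    return {
        "trypsin_ug": concatemer_mg * TRYPSIN_UG_PER_MG,
        "cpb_ug": concatemer_mg * CPB_UG_PER_MG,
    }


# Hypothetical batch of 100 mg purified concatemer.
doses = enzyme_amounts(100.0)
print(f"trypsin: {doses['trypsin_ug'] / 1000:.2f} mg, CpB: {doses['cpb_ug']:.1f} µg")

# Mass ratio of substrate to trypsin implied by the stated dose (about 16:1).
print(f"substrate:trypsin = {1000 / TRYPSIN_UG_PER_MG:.1f}:1 (w/w)")
```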

The resulting QPAS20C random coil peptide (SEQ ID NO: 265) was purified on a Source RPC column (PS-DVB resin; Cytiva) using 2 % (v/v) acetonitrile and 0.1 % (v/v) formic acid as mobile phase. The peptide was eluted with a concentration gradient of 2-80 % (v/v) acetonitrile. To concentrate the QPAS20C peptide, a rotary evaporator was used. Toluene forms azeotropes with all of the used solvents/acids and was therefore used as entrainer. After complete evaporation, the PAS peptide formed a film on the glass vessel. Subsequently, the PAS peptide was dissolved in methanol at a concentration of 20-100 mg/ml and 8 vol. diethyl ether was added to the solution. The resulting QPAS20C random coil polypeptide precipitate was sedimented by centrifugation (5,000 x g, 20 min) and washed twice with diethyl ether. Conversion of glutamine (Gln; Q) to Pga was achieved by dissolving the QPAS20C polypeptide in 1 ml methanol and adding 0.5 ml H2O and 4.5 ml pivalic acid, followed by incubation at 50 °C for 24 h. Surprisingly, this acid could be applied at very high concentrations without forming unwanted side products (such as esters or amides, for example), thus allowing a shorter Pga cyclization reaction time.

The resulting Pga-PAS20C random coil polypeptide (SEQ ID NO: 266) was precipitated from the pivalic acid mixture by addition of 8 vol. diethyl ether. The precipitate was sedimented by centrifugation (5,000 x g, 20 min) and washed four times with diethyl ether. Residual solvents were evaporated using a SpeedVac concentrator (30 °C, overnight). To reduce any disulphide-linked dimers of Pga-PAS20C that may have formed in the course of the purification and pivalic acid treatment, the precipitate was redissolved in an aqueous solution of 10 mM TCEP and incubated for 1 h at room temperature. TCEP was removed by dialysis against ultrapure water overnight at 4 °C (regenerated cellulose dialysis tubing, 1 kDa MW cutoff; Spectrum Laboratories, Rancho Dominguez, CA) and the Pga-PAS20C polypeptide/peptide was lyophilized. 10 µg of the Pga-PAS20C peptide was dissolved in H2O with 0.1 % (v/v) formic acid and electrospray ionization-mass spectrometry (ESI-MS) was performed using a maXis Q-TOF instrument (Bruker Daltonics, Bremen, Germany) in the positive ion mode. The deconvoluted ESI-mass spectrum showed a distinct main species corresponding to the calculated mass of the Pga-PAS20C monomer (1884.0 Da) and a minor peak corresponding to the calculated mass of the disulphide-linked Pga-PAS20C dimer (3766.0 Da) (Figure 9).
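
The assignment of the minor peak to the disulphide-linked dimer can be checked with a short calculation: forming one disulphide bond joins two monomers and removes two hydrogen atoms. The sketch below assumes a hydrogen mass of 1.008 Da; the function name is illustrative only.

```python
H_MASS = 1.008  # Da; mass of one hydrogen atom (assumed rounded value)


def disulfide_dimer_mass(monomer_mass):
    """Mass of a homodimer linked by one disulphide bond (loss of 2 H)."""
    return 2 * monomer_mass - 2 * H_MASS


monomer = 1884.0  # Da, calculated mass of the Pga-PAS20C monomer
dimer = disulfide_dimer_mass(monomer)
print(round(dimer, 1))  # 3766.0
```

The result agrees with the calculated dimer mass of 3766.0 Da stated for Figure 9.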

Claims

New PCT-Patent Application based on EP23206414.7 XL-protein GmbH Vossius Ref.: AG3095 PCT S3 CLAIMS

1. A method for the recombinant production of a concatenated polypeptide comprising two or more random coil polypeptides linked by a cleavable linker, wherein the method comprises the steps of a) the recombinant production of a concatenated polypeptide comprising two or more random coil polypeptides, wherein said random coil polypeptides are linked by a cleavable linker; and b) the purification of said concatenated polypeptide.

2. The method of claim 1, wherein said random coil polypeptide comprises proline, alanine, and serine residues, and wherein said random coil polypeptide comprises at least one proline residue, at least one alanine residue, and at least one serine residue.

3. The method of claim 2, wherein said proline residues constitute more than about 4 mol% and less than about 40 mol% of the amino acids of said random coil polypeptide.

4. The method of claim 3, wherein said random coil polypeptide comprises at least about 95 mol% proline, alanine, and serine residues.

5. The method of claim 1, wherein said random coil polypeptide comprises proline and alanine residues, and wherein said random coil polypeptide comprises at least one proline residue and at least one alanine residue.

6. The method of claim 5, wherein said proline residues constitute more than about 10 mol% and less than about 70 mol% of the amino acids of said random coil polypeptide.

7. The method of claim 6, wherein said random coil polypeptide comprises at least about 95 mol% proline and alanine residues.

8. The method of any one of claims 1 to 7, wherein said random coil polypeptide comprises no more than 6 identical consecutive amino acid residues.

9. The method of any one of claims 1 to 8, wherein said random coil polypeptide comprises an N-terminal or C-terminal cysteine residue.

10. The method of any one of claims 1 to 9, wherein said random coil polypeptide comprises about 10 to about 200 amino acid residues, preferably about 20, about 30, about 40, about 50, about 60, about 80, or about 100 amino acid residues.

11. The method of any one of claims 1 to 10, wherein said random coil polypeptide is selected from the group consisting of the following (I) to (II):

(I) a random coil polypeptide comprising an amino acid sequence selected from the group consisting of SEQ IDs NO: 1 to 12;

(II) a random coil polypeptide comprising an amino acid sequence having at least about 60%, at least 70%, at least about 80%, at least about 90%, at least about 92%, at least about 94%, at least about 96%, at least about 98%, or at least about 99% identity to the amino acid sequence as defined in (I).

12. The method of any one of claims 1 to 11, wherein said random coil polypeptide additionally comprises an N-terminal blocking residue.

13. The method of claim 12, wherein said N-terminal blocking residue consists of a glutamine, glutamate, or pyroglutamate residue.

14. The method of any one of claims 1 to 13, wherein said concatenated polypeptide comprises between about 2 and about 20 random coil polypeptides, preferably at least about 2, at least about 5, or at least about 10 random coil polypeptides.

15. The method of any one of claims 1 to 14, wherein said two or more random coil polypeptides each comprise the same amino acid sequence.

16. The method of any one of claims 1 to 15, wherein said concatenated polypeptide comprises a carrier polypeptide.

17. The method of claim 16, wherein said carrier polypeptide is linked to one or more of said random coil polypeptides via a cleavable linker.

18. The method of claim 16 or 17, wherein said carrier polypeptide is selected from thioredoxin A (TrxA) from Escherichia coli (E. coli), maltose-binding protein (MBP) from E. coli or NusA from E. coli or SUMO from yeast or galactose-binding protein (GBP) from E. coli or glutathione-S-transferase, preferably TrxA.

19. The method of any one of claims 16 to 18, wherein said carrier polypeptide is an N-terminal carrier polypeptide.

20. The method of any one of claims 1 to 19, wherein said concatenated polypeptide comprises a tag.

21. The method of claim 20, wherein said tag is linked to one or more of said random coil polypeptides via a cleavable linker.

22. The method of claim 20 or 21, wherein said tag is an affinity tag, and wherein said affinity tag is selected from polyhistidine-tag, Strep-tag, Strep-tag II, FLAG tag, HA tag, Halo tag, Arg-tag, preferably a hexahistidine tag.

23. The method of any one of claims 20 to 22, wherein said tag is an N-terminal or a C-terminal tag, preferably a C-terminal tag.

24. The method of any one of claims 1 to 23, wherein said cleavable linker comprises one or more amino acid residues.

25. The method of claim 24, wherein said cleavable linker comprises less than about 10 amino acid residues, preferably one amino acid residue.

26. The method of any one of claims 1 to 25, wherein said cleavable linker does not solely comprise proline, alanine, or serine.

27. The method of any one of claims 1 to 26, wherein said cleavable linker comprises one or more basic amino acid residues, preferably one basic amino acid residue.

28. The method of claim 27, wherein said one or more basic amino acid residues are selected from one or more lysine and/or one or more arginine residues, preferably one or more lysine residues, more preferably one lysine residue or one arginine residue.

29. The method of any one of claims 1 to 28, wherein said concatenated polypeptide comprises an amino acid sequence selected from the group consisting of the following (I) to (II):

(I) a concatenated polypeptide comprising an amino acid sequence selected from the group consisting of SEQ IDs NO: 13 to 24;

(II) a concatenated polypeptide comprising an amino acid sequence having at least about 60%, at least 70%, at least about 80%, at least about 90%, at least about 92%, at least about 94%, at least about 96%, at least about 98%, or at least about 99% identity to the amino acid sequence as defined in (I).

30. The method of any one of claims 1 to 29, wherein step a) comprises the expression of said concatenated polypeptide via a nucleic acid molecule or a nucleic acid vector encoding said concatenated polypeptide.

31. The method of claim 30, wherein said nucleic acid molecule is selected from the group consisting of the following (I) to (IV):

(I) a nucleic acid molecule comprising a nucleic acid sequence selected from the group consisting of SEQ IDs NO: 36 to 47;

(II) a nucleic acid molecule hybridizing with a nucleic acid molecule comprising a nucleic acid sequence complementary to a nucleic acid sequence selected from the group consisting of SEQ IDs NO: 36 to 47 under stringent conditions;

(III) a nucleic acid molecule comprising a nucleic acid sequence having at least about 60%, at least 70%, at least about 80%, at least about 90%, at least about 92%, at least about 94%, at least about 96%, at least about 98%, or at least about 99% identity to the nucleic acid sequence as defined in any one of (I) and (II); and

(IV) a nucleic acid molecule comprising a nucleic acid sequence encoding a concatenated polypeptide as defined in claim 29.

32. The method of claim 30, wherein said nucleic acid vector comprises means for inducible expression of the concatenated polypeptide, preferably an inducible promoter, more preferably an isopropyl β-D-1-thiogalactopyranoside (IPTG) inducible promoter.

33. The method of claim 30 or 32, wherein said nucleic acid vector comprises a nucleic acid sequence selected from the group consisting of the following (I) to (IV):

(I) a nucleic acid vector comprising a nucleic acid sequence selected from the group consisting of SEQ IDs NO: 48 to 59;

(II) a nucleic acid vector hybridizing with a nucleic acid molecule comprising a nucleic acid sequence complementary to a nucleic acid sequence selected from the group consisting of SEQ IDs NO: 48 to 59 under stringent conditions;

(III) a nucleic acid vector comprising a nucleic acid sequence having at least about 60%, at least 70%, at least about 80%, at least about 90%, at least about 92%, at least about 94%, at least about 96%, at least about 98%, or at least about 99% identity to the nucleic acid sequence as defined in any one of (I) and (II); and

34. The method of any one of claims 30 to 33, wherein step a) comprises the culturing of a host or host cell.

35. The method of claim 34, wherein said host or host cell is prokaryotic or eukaryotic, preferably prokaryotic, more preferably E. coli, P. fluorescens, C. glutamicum or a Bacillus strain, even more preferably E. coli.

36. The method of any one of claims 1 to 35, wherein step b) comprises performing purification via affinity chromatography of said concatenated polypeptide, preferably metal affinity chromatography.

37. A concatenated polypeptide as defined in any one of claims 1 to 36, or a concatenated polypeptide obtained/obtainable by the method of any one of claims 1 to 36.

38. A nucleic acid molecule comprising a nucleic acid sequence encoding the concatenated polypeptide of claim 37.

39. The nucleic acid molecule of claim 38, wherein each random coil polypeptide comprised in said encoded concatenated polypeptide is encoded by a different nucleic acid sequence.

40. The nucleic acid molecule of claim 38 or 39, wherein said nucleic acid molecule comprises a nucleic acid sequence selected from the group consisting of the following (I) to (IV):

(I) a nucleic acid molecule comprising a nucleic acid sequence selected from any one of SEQ IDs NO: 36 to 47;

(II) a nucleic acid molecule hybridizing with a nucleic acid molecule comprising a nucleic acid sequence complementary to the nucleic acid sequence selected from any one of SEQ IDs NO: 36 to 47 under stringent conditions;

(III) a nucleic acid molecule comprising a nucleic acid sequence having at least about 60%, at least 70%, at least about 80%, at least about 90%, at least about 92%, at least about 94%, at least about 96%, at least about 98%, or at least about 99% identity to the nucleic acid sequence as defined in any one of (I) and (II);

(IV) a nucleic acid molecule comprising a nucleic acid sequence encoding a concatenated polypeptide as defined in claim 37.

41. A nucleic acid molecule comprising a nucleic acid sequence encoding a random coil polypeptide coupled to a C-terminal cleavable linker, wherein said nucleic acid molecule is selected from the group consisting of the following (I) to (IV):

(I) a nucleic acid molecule comprising a nucleic acid sequence selected from any one of SEQ IDs NO: 60 to 127;

(II) a nucleic acid molecule hybridizing with a nucleic acid molecule comprising a nucleic acid sequence complementary to the nucleic acid sequence selected from any one of SEQ IDs NO: 60 to 127 under stringent conditions;

(IV) a nucleic acid molecule comprising a nucleic acid sequence encoding a random coil polypeptide comprising an amino acid sequence selected from any one of SEQ IDs NO: 1 to 12 coupled to a C-terminal cleavable linker as defined in any one of claims 24 to 28.

42. The nucleic acid molecule of claim 41, wherein the nucleic acid sequence further encodes an amino acid codon selected from the group of glutamine or glutamate on the 5'-end.

43. The nucleic acid molecule of any one of claims 40 to 42, wherein the nucleic acid sequence further encodes a carrier polypeptide preferably selected from the group of TrxA from E. coli, MBP from E. coli or NusA from E. coli or SUMO from yeast or GBP from E. coli or Glutathione-S-Transferase or a mutated dehalogenase from Rhodococcus sp. (HaloTag), preferably TrxA, and wherein the nucleic acid sequence encoding a carrier polypeptide is linked to the neighboring genetic element by a cleavable linker.

44. The nucleic acid molecule of any one of claims 40 to 43, wherein the nucleic acid sequence further encodes a tag, preferably an affinity tag selected from the group of polyhistidine-tag, Strep-tag, Strep-tag II, FLAG tag, HA tag, Halo tag, Arg-tag, preferably a hexahistidine tag, and wherein the nucleic acid sequence encoding a tag is linked to the neighboring genetic element by a cleavable linker.

45. The nucleic acid molecule of any one of claims 40 to 44, wherein said cleavable linker is encoded by a nucleic acid sequence encoding one or more basic amino acid residues, preferably one or more lysine or arginine residues, more preferably one or more lysine residues, most preferably one lysine residue or one arginine residue.

46. The polypeptide encoded by the nucleic acid molecule of any one of claims 40 to 45.

47. A nucleic acid vector comprising the nucleic acid molecule of any one of claims 40 to 45.

48. A host or host cell comprising the nucleic acid molecule of any one of claims 40 to 45, and/or a host or host cell comprising the nucleic acid vector of claim 47.

49. A method for the production of a random coil polypeptide, wherein said method comprises the steps of a) the cleavage of the concatenated polypeptide of claim 37, or the cleavage of a concatenated polypeptide expressed from the nucleic acid molecule of any one of claims 40 to 45 or the nucleic acid vector of claim 47; and b) the purification of said random coil polypeptide.

50. The method of claim 49, wherein in step a) said cleavage leads to the release of the random coil polypeptide, in particular the release from said concatenated polypeptide.

51. The method of claim 49 or 50, wherein step a) comprises the treatment of the concatenated polypeptide with one or more peptidases, one or more proteases, and/or one or more proteinases.

52. The method of claim 51, wherein said peptidases are one or more endopeptidases, or a combination of one or more endopeptidases and one or more exopeptidases.

53. The method of claim 52, wherein said endopeptidases are selected from trypsin, Arg-C proteinase, clostripain, chymotrypsin, pepsin, endoproteinase Lys-C, preferably trypsin.

54. The method of claim 52 or 53, wherein said exopeptidases are selected from carboxypeptidase B, carboxypeptidase E, carboxypeptidase N, carboxypeptidase D, metallocarboxypeptidase D, metalloendopeptidase Lys-N or carboxypeptidase A, preferably carboxypeptidase B.

55. The method of any one of claims 49 to 54, wherein step b) comprises performing purification via liquid chromatography (LC) of said random coil polypeptide, preferably via high-pressure liquid chromatography (HPLC).

56. The method of any one of claims 49 to 55, wherein the method further comprises the step of c) the protection of all but one reactive amino acid side chains of said random coil polypeptide.

57. The method of claim 56, wherein step c) comprises the protection of the N-terminus of said random coil polypeptide.

58. The method of claim 56 or 57, wherein step c) comprises leaving the C-terminus of said random coil polypeptide unprotected.

59. The method of claim 57 or 58, wherein step c) comprises the addition of an N-terminal protection group on the N-terminus of said random coil polypeptide, and wherein said N-terminal protection group is selected from pyroglutamoyl, formyl, -CO(C1-4 alkyl), and homopyroglutamoyl, wherein the alkyl moiety comprised in said -CO(C1-4 alkyl) is optionally substituted with one or two groups independently selected from -OH, -O(C1-4 alkyl), -NH(C1-4 alkyl), -N(C1-4 alkyl)(C1-4 alkyl) and -COOH.

60. The method of claim 59, wherein said N-terminal protection group is selected from pyroglutamoyl, formyl, acetyl, hydroxyacetyl, methoxyacetyl, ethoxyacetyl, propoxyacetyl, malonyl, propionyl, 2-hydroxypropionyl, 3-hydroxypropionyl, 2-methoxypropionyl, 3-methoxypropionyl, 2-ethoxypropionyl, 3-ethoxypropionyl, succinyl, butyryl, 2-hydroxybutyryl, 3-hydroxybutyryl, 4-hydroxybutyryl, 2-methoxybutyryl, 3-methoxybutyryl, 4-methoxybutyryl, glycine betainyl, glutaryl, and homopyroglutamoyl, preferably pyroglutamoyl.

61. The method of any one of claims 56 to 60, wherein step c) comprises the cyclisation of an N-terminal blocking residue.

62. The method of claim 61, wherein cyclisation of said N-terminal blocking residue comprises the reaction of said N-terminal blocking residue with an acid selected from pivalic acid, acetic acid, 2,2-dimethylbutyric acid, 2,2-dimethylvaleric acid, formic acid, 2-propanoic acid (propionic acid), butanoic acid (butyric acid), 2-hydroxypropanoic acid (lactic acid), 2,4-hexadienoic acid (sorbic acid), 2-butenedioic acid (fumaric acid), hydroxybutanedioic acid (malic acid), 2,3-dihydroxybutanedioic acid (tartaric acid), 2-hydroxy-1,2,3-propanetricarboxylic acid (citric acid), benzenecarboxylic acid (benzoic acid), trifluoroacetic acid, trichloroacetic acid, p-toluenesulfonic acid, trifluoromethanesulfonic acid, preferably pivalic acid or acetic acid.

63. The method of any one of claims 56 to 62, wherein the method further comprises the step of c’) the purification and/or the concentration of said protected random coil polypeptide.

64. The method of claim 63, wherein step c’) is carried out after step c), and wherein step c) is carried out after step b).

65. The method of claim 63 or 64, wherein step c’) comprises the evaporation of solvents and acids that are in contact with said random coil polypeptide using an entrainer.

66. The method of claim 65, wherein said entrainer is selected from toluene, benzene, m-xylene, n-heptane, n-octane, tetrachloroethylene, ethyl acetate, preferably toluene.

67. The method of any one of claims 63 to 66, wherein step c’) comprises the precipitation of said random coil polypeptide using a solvent selected from diethyl ether, toluene, diisopropyl ether, cyclohexane, benzene, trichloromethane, di-n-butyl ether, o-xylene, m-xylene, p-xylene, butyl acetate, ethyl benzene, preferably diethyl ether.

68. A random coil polypeptide as defined in any one of claims 1 to 67, or a random coil polypeptide obtained/obtainable by the method of any one of claims 49 to 67, or a salt thereof.

69. Use of the random coil polypeptide of claim 68 for conjugation with a biomolecule.

70. A method for the production of a conjugate comprising a random coil polypeptide, wherein the method comprises the steps of a) the coupling of the random coil polypeptide of claim 68 to a biomolecule, and b) the purification of the conjugate.

71. The method of claim 70, wherein the biomolecule is selected from an enzyme, a protein, a peptide, a lipid, a dialkylamine, a fatty acid, a carbohydrate, a cyclodextrin, a nucleic acid, DNA, RNA or a peptide nucleic acid (PNA), preferably an enzyme.

72. The method of claim 70 or 71, wherein step b) comprises performing purification via LC of the conjugate.

73. A conjugate as defined in any one of claims 70 to 72, or a conjugate obtained/obtainable by the method of any one of claims 70 to 72, or a salt thereof.

74. A composition comprising the conjugate of claim 73, or a salt thereof.

75. The conjugate of claim 73 or a salt thereof, and/or the composition of claim 74 for use as a medicament.

76. A kit comprising the composition of claim 74, the conjugate of claim 73 or a salt thereof, the random coil polypeptide of claim 68 or a salt thereof, the concatenated polypeptide of claim 37, the nucleic acid molecule of any one of claims 40 to 45, the nucleic acid vector of claim 47, and/or the host or host cell of claim 48.
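
The release of random coil polypeptides recited in claims 45 and 49 to 54 can be illustrated in silico. The following Python sketch is only a simplified model under stated assumptions, not the applicant's procedure: it assumes a single lysine residue as the cleavable linker, models trypsin by the common simplified rule (cleavage C-terminal of lysine or arginine unless the next residue is proline), and models carboxypeptidase B as removal of C-terminal basic residues. The sequence `unit` is a hypothetical placeholder and is not taken from the sequence listing; all function names are illustrative.

```python
# Simplified in-silico model of linker cleavage (assumptions, not the patent's protocol).

BASIC = "KR"  # basic residues recognised by trypsin and removed by carboxypeptidase B


def trypsin_cleave(seq):
    """Cut after each K or R unless the following residue is P (simplified rule)."""
    fragments, start = [], 0
    for i, residue in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in BASIC and nxt != "P":
            fragments.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        fragments.append(seq[start:])  # trailing piece without a basic C-terminus
    return fragments


def carboxypeptidase_b_trim(fragment):
    """Remove C-terminal basic residues, modelling exopeptidase trimming."""
    return fragment.rstrip(BASIC)


def release_units(concatenated):
    """Endopeptidase cleavage followed by exopeptidase trimming."""
    trimmed = (carboxypeptidase_b_trim(f) for f in trypsin_cleave(concatenated))
    return [u for u in trimmed if u]


# Hypothetical repeat unit with an N-terminal glutamine (compare claim 42).
unit = "QPASPAAPSA"
# Three units, each followed by a C-terminal lysine linker.
concatenated = "".join(unit + "K" for _ in range(3))
released = release_units(concatenated)
```

In this model every released fragment equals `unit`, because each lysine linker is first exposed at the C-terminus by the endopeptidase step and then removed by the exopeptidase step. A linker lysine directly followed by proline would not be cut under the simplified rule, which is one reason the first residue of the next unit matters in such a design.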