HK1247216B

HK1247216B - Microbial transglutaminases, substrates therefor and methods for the use thereof

Info

Publication number: HK1247216B
Application number: HK18106710.6A
Authority: HK
Inventors: Thomas Albert; Frank Bergmann; Victor Lyamichev; Jigar Patel; Michael Schraeml; Wojtek STEFFEN; Thomas STREIDL
Original assignee: F. Hoffmann-La Roche Ag
Priority date: 2014-12-19
Filing date: 2015-12-17
Publication date: 2022-07-15

Description

Microbial transglutaminases, substrates and methods of use thereof

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is based upon, claims the benefit of, and incorporates by reference U.S. Provisional Patent Application Serial No. 62/094,495, filed on December 19, 2014, and entitled “Identification of Transglutaminase Substrates and Uses Therefor,” and U.S. Provisional Patent Application Serial No. 62/260,162, filed on November 25, 2015, and entitled “System and Method for Identification and Characterization of Transglutaminase Species.”

STATEMENT REGARDING FEDERALLY FUNDED RESEARCH

Not applicable.

BACKGROUND ART

The present invention relates generally to the identification of transglutaminases and substrates therefor, and more particularly to the discovery and characterization of a microbial transglutaminase from Kutzneria albida.

Elucidating the details of enzyme activity and specificity is important both for understanding the physiological functions of enzymes and for biotechnological applications of the reactions they catalyze. For example, transglutaminases belong to a large family of related enzymes that includes microbial and mammalian transglutaminases. Transglutaminases catalyze the cross-linking of two polypeptide or peptide chains by forming an isopeptide bond between the γ-carboxamide group of a glutamine residue and the ε-amino group of a lysine residue. Elucidating the details of transglutaminase activity and specificity is important for biotechnological applications of the cross-linking reactions catalyzed by transglutaminases (e.g., modification of proteins for labeling, tagging, multiprotein complex formation, and the like).

Because of its small size, robust performance, stability, and the calcium independence of its activity, microbial transglutaminase is the most studied transglutaminase to date. Several studies have demonstrated that a variety of long alkylamines can replace the lysine substrate of transglutaminase and that the simple dipeptide glutamine-glycine can serve as a glutamine substrate. These findings on the lysine and glutamine substrates of transglutaminases have facilitated the development of various tests for transglutaminase activity and of practical assays for modifying proteins using transglutaminases. However, several challenges may still arise in the identification and characterization of known and novel transglutaminases. One challenge is that the specificity of a particular transglutaminase for isopeptide bond formation may be too broad or too narrow for a particular application. Another challenge is that transglutaminases with the same or similar substrate specificity cannot be used for orthogonal labeling strategies and the like. Another challenge is the identification of substrates for uncharacterized or poorly characterized transglutaminases. Additional challenges may arise from factors related to a given transglutaminase, such as origin, specificity, activity, stability, and combinations thereof.

SUMMARY OF THE INVENTION

The present invention overcomes the aforementioned drawbacks by providing systems and methods for identifying and characterizing transglutaminases, as well as substrates and uses therefor.

According to one aspect of the present invention, a substrate tag for a microbial transglutaminase comprises one of an acyl-donor tag having at least 80% sequence identity to the peptide sequence YRYRQ (SEQ ID NO: 1) and an amine-donor tag having at least 80% sequence identity to the peptide sequence RYESK (SEQ ID NO: 2).
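As a rough illustration of what the 80% threshold means for a five-residue tag, the sketch below counts position-wise matches between two ungapped peptides of equal length. The disclosure does not prescribe how identity is to be computed for such short tags, so the ungapped comparison and the names `percent_identity` and `meets_identity_threshold` are assumptions made for this example only. Under that assumption, a 5-mer meets the threshold when it differs from the reference tag at no more than one position.

```python
# Reference tags taken from the text (SEQ ID NO: 1 and SEQ ID NO: 2).
ACYL_DONOR_TAG = "YRYRQ"
AMINE_DONOR_TAG = "RYESK"


def percent_identity(candidate: str, reference: str) -> float:
    """Position-wise identity, in percent, of two equal-length ungapped peptides.

    Assumption: no alignment or gaps are considered; this is only one
    possible way to evaluate identity for very short tags.
    """
    if not reference or len(candidate) != len(reference):
        raise ValueError("this sketch only compares non-empty peptides of equal length")
    matches = sum(a == b for a, b in zip(candidate.upper(), reference.upper()))
    return 100.0 * matches / len(reference)


def meets_identity_threshold(candidate: str, reference: str, threshold: float = 80.0) -> bool:
    """True if the candidate reaches the identity threshold against the reference."""
    return percent_identity(candidate, reference) >= threshold
```

For a 5-mer, one substitution gives 4 of 5 matching positions (80%), while two substitutions give 3 of 5 (60%) and fall below the threshold.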

In one aspect, the microbial transglutaminase has at least 80% sequence identity to the Kutzneria albida microbial transglutaminase (SEQ ID NO: 6).

In another aspect, the substrate tag further comprises a detectable label.

In another aspect, the detectable label is selected from a biotin moiety, a fluorescent dye, a ruthenium label, a radioactive label, and a chemiluminescent label.

In another aspect, the acyl-donor tag has the peptide sequence APRYRQRAA (SEQ ID NO: 24).

According to another aspect of the invention, a method for forming an isopeptide bond in the presence of a microbial transglutaminase comprises exposing the microbial transglutaminase to a first substrate and a second substrate, wherein the first substrate comprises an acyl-donor tag having at least 80% sequence identity to the peptide sequence YRYRQ (SEQ ID NO: 1) and the second substrate comprises an amine-donor tag having at least 80% sequence identity to the peptide sequence RYESK (SEQ ID NO: 2), and cross-linking the first substrate and the second substrate, thereby forming an isopeptide bond between the acyl-donor tag and the amine-donor tag.

In another aspect, the step of cross-linking the first substrate and the second substrate forms an isopeptide bond between a γ-carboxamide group of the acyl-donor tag and an ε-amino group of the amine-donor tag.

In another aspect, at least one of the first substrate and the second substrate comprises a detectable label.

In another aspect, cross-linking of the first substrate to the second substrate is achieved with a yield of at least about 70%.

In another aspect, the yield is achieved within about 30 minutes.

According to another aspect of the invention, a kit for forming an isopeptide bond in the presence of a microbial transglutaminase comprises a purified microbial transglutaminase having at least 80% sequence identity to the Kutzneria albida microbial transglutaminase (SEQ ID NO: 6).

In one aspect, the kit further comprises one of a first substrate comprising an acyl-donor tag having at least 80% sequence identity to the peptide sequence YRYRQ (SEQ ID NO: 1) and a second substrate comprising an amine-donor tag having at least 80% sequence identity to the peptide sequence RYESK (SEQ ID NO: 2).

In another aspect, the kit further comprises the other of the first substrate and the second substrate.

According to another aspect of the present invention, an enzyme for forming an isopeptide bond comprises a purified microbial transglutaminase having at least 80% sequence identity to the Kutzneria albida microbial transglutaminase (SEQ ID NO: 6).

In one aspect, the isolated microbial transglutaminase is expressed and isolated in the presence of ammonium.

In another aspect, the ammonium is present at a concentration of at least about 10 μM.

According to another aspect of the invention, an acyl-donor substrate for a transglutaminase comprises an amino acid sequence having the formula Xaa₁-Xaa₂-Xaa₃-Xaa₄-Xaa₅, wherein Xaa is any amino acid, wherein at least one of Xaa₃, Xaa₄, and Xaa₅ is glutamine, wherein one of Xaa₄ and Xaa₅ is arginine, wherein the amino acid sequence comprises at least one arginine directly adjacent to a glutamine, and wherein the total number of amino acids in the amino acid sequence selected from arginine, glutamine, phenylalanine, tryptophan, and tyrosine is at least 4.

In one aspect, Xaa₅ and at least one of Xaa₁, Xaa₂, and Xaa₃ are arginine.
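The acyl-donor criteria above can be sketched as a simple predicate over a five-residue peptide written in one-letter code. This is an illustrative reading rather than a definitive one: "one of Xaa₄ and Xaa₅ is arginine" is interpreted here as "at least one", and the function names `is_acyl_donor_motif` and `has_flanking_arginines` are hypothetical.

```python
# Residues counted toward the "at least 4" criterion: Arg, Gln, Phe, Trp, Tyr.
ACYL_COUNTED_RESIDUES = set("RQFWY")


def is_acyl_donor_motif(peptide: str) -> bool:
    """Check a 5-mer (Xaa1..Xaa5) against the acyl-donor criteria in the text."""
    p = peptide.upper()
    if len(p) != 5:
        return False
    has_gln = "Q" in p[2:5]                      # Xaa3, Xaa4 or Xaa5 is glutamine
    has_arg_4_or_5 = "R" in p[3:5]               # assumption: "one of" read as "at least one"
    arg_adjacent_to_gln = "RQ" in p or "QR" in p  # an arginine directly next to a glutamine
    enriched = sum(aa in ACYL_COUNTED_RESIDUES for aa in p) >= 4
    return has_gln and has_arg_4_or_5 and arg_adjacent_to_gln and enriched


def has_flanking_arginines(peptide: str) -> bool:
    """Narrower aspect: Xaa5 and at least one of Xaa1..Xaa3 are arginine."""
    p = peptide.upper()
    return len(p) == 5 and p[4] == "R" and "R" in p[0:3]
```

Under this reading, YRYRQ (SEQ ID NO: 1) satisfies the general criteria, and RYRQR (the five residues at positions 3 to 7 of APRYRQRAA) also satisfies the narrower aspect. DYALQ does not satisfy the general criteria because neither Xaa₄ nor Xaa₅ is arginine.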

According to another aspect of the invention, an amine-donor substrate for a transglutaminase comprises an amino acid sequence having the formula Xaa₁-Xaa₂-Xaa₃-Xaa₄-Xaa₅, wherein Xaa is any amino acid, wherein the amino acid sequence comprises at least one lysine, wherein one of Xaa₁ and Xaa₂ is selected from tyrosine and arginine, and wherein the total number of amino acids in the amino acid sequence selected from arginine, serine, tyrosine, and lysine is at least 3.

In one aspect, one of Xaa₄ and Xaa₅ is lysine.

In another aspect, the amino acid sequence comprises no more than two lysine residues.
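The amine-donor criteria can be sketched in the same way. Again this is only an illustrative reading: "one of" is interpreted as "at least one", and the function names `is_amine_donor_motif` and `meets_narrower_amine_aspects` are hypothetical.

```python
# Residues counted toward the "at least 3" criterion: Arg, Ser, Tyr, Lys.
AMINE_COUNTED_RESIDUES = set("RSYK")


def is_amine_donor_motif(peptide: str) -> bool:
    """Check a 5-mer (Xaa1..Xaa5) against the amine-donor criteria in the text."""
    p = peptide.upper()
    if len(p) != 5:
        return False
    has_lys = "K" in p                                   # at least one lysine
    tyr_or_arg_at_1_or_2 = any(aa in "YR" for aa in p[0:2])  # assumption: "at least one"
    enriched = sum(aa in AMINE_COUNTED_RESIDUES for aa in p) >= 3
    return has_lys and tyr_or_arg_at_1_or_2 and enriched


def meets_narrower_amine_aspects(peptide: str) -> bool:
    """Narrower aspects: Xaa4 or Xaa5 is lysine, and at most two lysines overall."""
    p = peptide.upper()
    return len(p) == 5 and "K" in p[3:5] and p.count("K") <= 2
```

RYESK (SEQ ID NO: 2) passes both checks: it has a lysine at Xaa₅, arginine and tyrosine at Xaa₁ and Xaa₂, and four residues from the counted set.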

The foregoing and other aspects and advantages of the invention will appear from the following description. In the description, reference is made to the accompanying drawings, which form a part hereof and in which preferred embodiments of the invention are shown by way of illustration. Such embodiments do not necessarily represent the full scope of the invention, and reference is therefore made to the claims and to this description for interpreting the scope of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram illustrating one embodiment of a method for identifying and characterizing transglutaminase species according to the present invention.

FIG. 2A is a multiple sequence alignment of the Kutzneria albida KALB_7456 hypothetical protein (upper row) and the Streptomyces mobaraensis microbial transglutaminase (lower row), generated with Clustal Omega version 1.2.1 (Sievers et al., 2011, Molecular Systems Biology 7:539). Identical amino acid residues are marked with an asterisk (*), and similar residues are marked with a colon (:). Conserved residues of the catalytic triad (Cys, Asp, His) of the S. mobaraensis microbial transglutaminase are highlighted in gray.

FIG. 2B is the amino acid sequence of the putative transglutaminase KALB_7456 from K. albida, including the general cleavage site prediction determined with ProP 1.0 (Duckert et al., 2004, Protein Engineering, Design and Selection 17:107-112), with the predicted signal peptide sequence ('s') and propeptide cleavage site ('P') indicated.

FIG. 2C is a bar graph showing the propeptide cleavage potential of the putative transglutaminase KALB_7456 from K. albida as a function of amino acid sequence position, as determined with ProP 1.0. The dashed line indicates the propeptide cleavage potential threshold, and the dotted line indicates the predicted amino acid position of the signal peptide.

FIG. 3A is an optical image of an SDS-PAGE gel showing the expression profile of Kutzneria albida transglutaminase (KalbTG) fusion proteins. The amino-terminal fusion partners are grouped in adjacent lanes, as indicated by the lane pairs numbered 1-7 (1: 8X-His tag; 2: dsbA signal peptide; 3: ompT signal peptide; 4: Escherichia coli (E. coli) SlyD (EcSlyD); 5: 2xEcSlyD; 6: FkpA; 7: maltose-binding protein). Lanes labeled 'L' were loaded with a standard molecular weight ladder (values shown in kDa). Lanes labeled 'P' and 'S' represent the insoluble (pellet) and soluble (supernatant) fractions of the E. coli cell lysate, respectively. The Coomassie blue-stained protein band representing His-KalbTG is marked with an asterisk (*) in lane pair 1.

FIG. 3B is a schematic diagram of the expression and purification strategy for the 2xSlyD fusion protein. N-terminal fusion to two moieties of the sensitive-to-lysis D (SlyD) protein confers solubility and can be cleaved by Factor Xa. Cleavage of the propeptide sequence by trypsin further matures the enzyme.

FIG. 3C is an image of an SDS-PAGE gel showing the modular purification of KalbTG. Lanes labeled 'L' were loaded with a standard molecular weight ladder (values shown in kDa). Lane 1: fractions containing 2xSlyD-KalbTG from the first Ni²⁺-IMAC gradient elution (0-250 mM imidazole). Lane 2: KalbTG proenzyme purified by Ni²⁺-IMAC gradient elution, on-column Factor Xa digestion, and size exclusion chromatography. Lane 3: fractions from the second Ni²⁺-IMAC gradient elution (0-250 mM imidazole) after sequential on-column digestion with Factor Xa and trypsin. Lane 4: concentrate of lane 3, filtered through a 50,000 molecular weight cutoff (MWCO) membrane.

FIG. 4A is a log-log scatter plot showing the correlation between fluorescence signal data generated by KalbTG for replicate features on a 5-mer peptide array in the presence of a biotinylated amine-donor substrate. Each data point represents a pair of replicate peptides from a library of 1.4 million unique peptides synthesized in duplicate. The 22 data points showing the highest fluorescence signals are labeled with their respective 5-mer peptide sequences.

FIG. 4B is a log-log scatter plot showing the correlation between fluorescence signal data generated by KalbTG for replicate features on a 5-mer peptide array in the presence of a biotinylated glutamine-donor substrate (Z-APRYRQRAAGGG-PEG-biotin). Each data point represents a pair of replicate peptides from a library of 1.4 million unique peptides synthesized in duplicate. The 17 data points showing the highest fluorescence signals are labeled with their respective 5-mer peptide sequences.

FIG. 5A is a log-log scatter plot showing the correlation between fluorescence signal data generated by Streptomyces mobaraensis MTG for replicate features on a 5-mer peptide array in the presence of a biotinylated amine-donor substrate. Each data point represents a pair of replicate peptides from a library of 1.4 million unique peptides synthesized in duplicate. The 17 data points corresponding to the highest fluorescence signals generated by KalbTG in FIG. 4A are labeled with their respective 5-mer peptide sequences.

FIG. 5B is a plot of the fluorescence signal data generated by S. mobaraensis MTG from FIG. 5A, with the labeled data points from FIG. 4A omitted. The 16 data points corresponding to the highest fluorescence signals generated by MTG are labeled with their respective 5-mer peptide sequences.

FIG. 5C is a plot of the fluorescence signal data generated by KalbTG from FIG. 4A, with the labeled data points from FIG. 4A omitted. The 16 data points corresponding to the highest fluorescence signals generated by S. mobaraensis MTG in FIG. 5B are labeled with their respective 5-mer peptide sequences.

FIG. 5D is a graph of S. mobaraensis MTG and KalbTG activity, obtained by measuring the NADH oxidation rate at 340 nm and 37 °C in a GLDH-coupled assay for varying concentrations of the glutamine-donor substrates Z-GGGDYALQGGGG (0-1 mM) and Z-GGGYRYRQGGGG (0-1 mM) in the presence of the amine-donor substrate cadaverine (1 mM).

FIG. 6A is a series of SDS-PAGE gel images showing an experimental time course for Cy3 labeling of a Q-tagged Thermus thermophilus SlyD moiety (left: bright field; right: Cy3 fluorescence) and control data (left: bright field; right: Cy3 fluorescence). Successful single labeling was observed as a shift in electrophoretic mobility corresponding to the 6 kDa molecular weight of the label and as a fluorescence signal from the labeled species. Data are shown for a 60-minute time course performed with a 10-fold label excess and for an 18-hour incubation performed with a 50-fold label excess. Lanes labeled 'L' were loaded with a standard molecular weight ladder (values shown in kDa). Lanes labeled '-' and '+' represent control reactions with SlyD containing the S. mobaraensis MTG Q-tag (DYALQ (SEQ ID NO: 22)) and either KalbTG (-) or S. mobaraensis MTG (+).

FIG. 6B is a series of SDS-PAGE gel images showing the pH profile of KalbTG labeling efficiency for pH values between 6.2 and 9.0 (left: bright field; right: Cy3 fluorescence). The highest labeling yield after a 15-minute reaction time was observed at pH 7.4. Lanes labeled 'L' were loaded with a standard molecular weight ladder (values shown in kDa).

FIG. 6C is a series of SDS-PAGE gel images showing dual site-specific functionalization of the construct YRYRQ-PEG27-(Factor Xa cleavage site)-PEG27-PEG27-DYALQ with Cy3 and Cy5 fluorescent labels. Each group (labeled 1, 2, and 3) includes three images of the same gel lane arranged from left to right in the following order: i) bright field; ii) Cy3 fluorescence; and iii) Cy5 fluorescence. Group 1: mixture of the peptide construct and a 10-fold excess of Cy3 label. Group 2: mixture of the peptide construct and a 10-fold excess of Cy3 label after incubation with KalbTG enzyme for 30 minutes. Group 3: S. mobaraensis MTG enzyme and Cy5 label were added to the mixture of Group 2 and incubated for 15 minutes without any intermediate blocking or purification step; the dual-labeled construct was obtained in near-quantitative yield.

FIG. 7A is a three-dimensional alignment of the active enzyme structures of KalbTG (dark gray) and S. mobaraensis MTG (light gray), revealing high conservation in the core and active-site regions and high variability in the surrounding loop regions.

FIG. 7B is a three-dimensional surface overlay of the active enzyme structures of KalbTG (dark gray) and S. mobaraensis MTG (light gray), illustrating that the binding grooves of S. mobaraensis MTG and of the more compact KalbTG can be occupied similarly by the propeptide (ribbon structure).

FIG. 7C is a three-dimensional ribbon structure illustrating factors that contribute to the formation of the KalbTG active cleft, including two strongly charged hydrophilic loops that are thought to mediate substrate recruitment, act as substrate mimics, or a combination thereof. The two hydrophilic loops are labeled with their corresponding sequences (i.e., NHEEPR (SEQ ID NO: 3) and YRYRAR (SEQ ID NO: 4)).

DETAILED DESCRIPTION OF THE INVENTION

As discussed above, in various situations it may be useful to elucidate the details of enzyme activity and specificity, both to provide a basic understanding of those enzymes and to develop biotechnological applications that include them. For example, conventional chemical strategies for modifying therapeutic and diagnostic proteins often lack site specificity, linkage stability, stoichiometric control, or a combination thereof, yielding heterogeneous conjugates that may cause interference (e.g., with the immunoreactivity or stability of a therapeutic agent). In one aspect, it is anticipated that future industrial development of therapeutic and diagnostic reagents will see a substantial increase in complex formats that require stable and truly site-specific conjugation. Therefore, there is a need for new enzymatic methods that provide attractive and cost-effective alternatives to established chemical strategies.

Microbial transglutaminase (MTG) is a protein-glutamine γ-glutamyltransferase (EC 2.3.2.13) first described in 1989 by researchers at Ajinomoto Co., Inc., and is one of the most widely used enzymes for cross-linking proteins and peptides in many food and biotechnology applications. MTG was first discovered in, and later extracted from, the organism Streptomyces mobaraensis. MTG catalyzes the formation of a stable isopeptide bond between an acyl group (e.g., a glutamine side chain; acyl-donor) and an alkylamine (e.g., a lysine side chain; amine-donor). In the absence of a reactive amine group, the enzymatic reaction with water results in deamination of the glutamine side chain. This bacterial enzyme functions without added cofactors (such as Ca²⁺ or GTP) and over a wide range of pH, buffer, and temperature conditions.

In contrast to sortase A, whose natural substrate specificity is very stringent, known MTGs (e.g., from S. mobaraensis) are generally indiscriminate with respect to their substrate molecules, and the specificities of transglutaminase variants remain largely unknown. Although significant scientific effort has gone into establishing MTG as a candidate enzyme for the development of therapeutic antibody-drug conjugates, large-scale production of such MTG-mediated immunoconjugates is hampered by the low specificity of the enzyme.

Known MTG species are primarily representatives of the Streptomyces or Bacillus families. These MTG species exhibit very similar primary amino acid structures and substrate specificities. All known active MTG species have a molecular weight of at least about 38 kDa. Being cross-linking enzymes by nature, known MTGs generally exhibit broad substrate specificity for amine-donor substrates and relatively low specificity for acyl-donor substrates. Approaches for high-throughput screening of improved MTG substrates have previously been limited to phage panning or mRNA display. Although a recently pioneered array-based high-throughput screening approach has successfully identified substrates of S. mobaraensis MTG (U.S. Provisional Patent Application Serial No. 62/094,495 to Albert et al., filed December 19, 2014), only the substrate specificities of this enzyme and its homologs are known, which precludes any bioorthogonal conjugation approach (e.g., simultaneous labeling of a biomolecule using two or more different label-substrates and two or more transglutaminase species). Therefore, there is a need for high-throughput approaches for identifying and characterizing new transglutaminase species. There is also a need for improved transglutaminases with greater activity, specificity, or a combination thereof. In addition, there is a need for acyl-donor tags (e.g., glutamine- or Q-tags) and amine-donor tags (e.g., lysine- or K-tags) that are specific and unique substrates for a transglutaminase of interest.

These and other challenges may be overcome with systems and methods for identifying and characterizing transglutaminase species. To this end, the present invention provides structural and biochemical characterization of known and unknown candidate transglutaminase species. The present invention further provides characterization of the recombinant production of candidate transglutaminases, high-throughput screening of potential transglutaminase substrates via high-density peptide arrays, and semi-orthogonal conjugation of biomolecules using newly characterized transglutaminase species. In another aspect, the present invention provides acyl-donor tags (e.g., glutamine- or Q-tags) and amine-donor tags (e.g., lysine- or K-tags) that are specific and unique substrates for a transglutaminase of interest. As used herein, the term 'tag' refers to a sequence that includes one or more amino acids or other similar molecules and that can be grafted, fused, conjugated, or otherwise attached to another structure, such as a protein, peptide, small molecule, detectable label (e.g., a fluorescent dye), oligonucleotide, polymer other than an amino acid or nucleic acid polymer (e.g., polyethylene glycol), or the like. In one aspect, the nature of the attachment should allow the enzyme to access the tag, where the tag is a substrate for the enzyme.

In one embodiment of the present invention, a previously unknown transglutaminase species from the organism Kutzneria albida was identified, recombinantly expressed, purified, and characterized using a high-throughput array-based screening approach. The K. albida transglutaminase was found to exhibit high selectivity and substrate specificity for its array-determined substrate sequences, but to react only poorly or not at all with the substrate sequences of the S. mobaraensis enzyme. The K. albida transglutaminase can therefore be said to be bioorthogonal to the S. mobaraensis enzyme. In another aspect, the K. albida transglutaminase has a surprisingly lower molecular weight (about 30 kDa) than all previously described MTG species (e.g., S. mobaraensis MTG is about 38 kDa), suggesting an advantage for production and enzymatic labeling purposes. Furthermore, the K. albida transglutaminase has a distinctly different primary amino acid structure compared with all currently known proteins. Taken together, these properties make the K. albida transglutaminase very attractive for a wide range of applications, including, but not limited to, versatile, cost-effective, and site-specific conjugation of biomolecules to a variety of label molecules. Additional or alternative applications in which the K. albida transglutaminase may be effective include the production of therapeutic antibody-drug conjugates or of chemiluminescent antibodies for in vitro diagnostic use.

In one aspect, many of the challenges arising from the recombinant production of transglutaminases can be overcome by implementing embodiments of the present invention. Referring to FIG. 1, a method 100 of identifying and characterizing a transglutaminase includes a step 102 of identifying a candidate transglutaminase. Step 102 can include searching for homologs of known or suspected transglutaminase species to identify candidate transglutaminases for further study. In an exemplary embodiment, the transglutaminase can be a microbial transglutaminase (e.g., a Streptoverticillium sp. transglutaminase, a Kutzneria sp. transglutaminase, a Streptomyces sp. transglutaminase, etc.) or a mammalian transglutaminase. In embodiments where the enzyme is a mammalian transglutaminase, the mammalian transglutaminase can, for example, be selected from: human factor XIII A transglutaminase, human factor XIII B transglutaminase, factor XIII transglutaminase, keratinocyte transglutaminase, tissue-type transglutaminase, epidermal transglutaminase, prostate transglutaminase, neuronal transglutaminase, human transglutaminase 5, and human transglutaminase 7.

The search for homologs of known or suspected transglutaminases can include the use of one or more search tools or databases. One suitable tool is the Protein Basic Local Alignment Search Tool (Protein BLAST) from the National Center for Biotechnology Information (NCBI). The Protein BLAST tool can be provided with the sequence of a known or suspected transglutaminase species, which can be obtained from various databases. One example database is the Universal Protein Resource (UniProt). However, other databases can be used in addition to, or as an alternative to, UniProt. In one aspect, when Protein BLAST is used, a threshold Expect value (E-value) can be selected to narrow the search results. In one example, it can be useful to select an E-value of less than about 10^-8. In another example, it can be useful to select an E-value of less than about 10^-10. In yet another example, it can be useful to select an E-value of less than about 10^-12.
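The effect of the E-value threshold can be illustrated with a minimal sketch. It assumes the hits have been exported in the 12-column tabular BLAST format (subject identifier in column 2, E-value in column 11); the hit names and values below are invented for illustration and are not results from the search described here.

```python
# Sketch: filter tabular BLAST hits by E-value threshold.
# All hit lines are hypothetical placeholders, not real search results.

def parse_hits(text):
    """Parse tab-separated BLAST tabular output into (subject_id, evalue) pairs."""
    hits = []
    for line in text.strip().splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        # Column 2 holds the subject ID, column 11 holds the E-value.
        hits.append((fields[1], float(fields[10])))
    return hits


def filter_by_evalue(hits, threshold):
    """Keep the IDs of hits whose E-value is strictly below the threshold."""
    return [subject for subject, evalue in hits if evalue < threshold]


EXAMPLE = "\n".join([
    "query\thypothetical_hit_A\t32.0\t250\t150\t6\t80\t320\t30\t270\t3e-25\t105",
    "query\thypothetical_hit_B\t28.5\t200\t130\t5\t90\t280\t15\t210\t4e-11\t62.0",
    "query\thypothetical_hit_C\t25.1\t180\t125\t4\t100\t270\t20\t195\t2e-09\t55.1",
    "query\thypothetical_hit_D\t24.0\t120\t88\t3\t110\t225\t40\t155\t5e-04\t38.9",
])

hits = parse_hits(EXAMPLE)
for threshold in (1e-8, 1e-10, 1e-12):
    # Stricter thresholds retain fewer, more significant hits.
    print(threshold, filter_by_evalue(hits, threshold))
```

In this toy data set, tightening the threshold from 10^-8 to 10^-12 reduces the retained hits from three to one, which is the narrowing effect described above.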

Step 102 can also include performing a sequence alignment of transglutaminase sequences with an alignment tool. One example alignment tool is Clustal Omega 1.2.1 (Sievers et al. 2011, Molecular Systems Biology 7: 539). The alignment tool can provide a percent identity matrix value, identify potential conservation of catalytically active residues (if known), or a combination thereof. Other tools that can be used in step 102 include the ProP 1.0 Server from the Technical University of Denmark (Duckert et al. 2004, Protein Engineering, Design & Selection 17(1): 107-112), which can be used to predict the propeptide and signal sequences of a transglutaminase. In one aspect, a candidate transglutaminase can have at least about 20%, at least about 25%, at least about 30%, at least about 35%, or more similarity to a known transglutaminase. Furthermore, a candidate transglutaminase can be characterized by the conservation of one or more active site residues relative to a known or suspected transglutaminase, indicating that enzymatic structure and function may be retained.
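The two alignment readouts named above, percent identity and conservation of known catalytic residues, can be sketched for a pair of pre-aligned sequences as follows. This is an illustration under stated assumptions: identity is computed over columns in which neither sequence has a gap, which may differ from the definition used by a given alignment tool, and the aligned strings are toy data rather than transglutaminase sequences.

```python
# Sketch: percent identity and residue conservation from a pairwise alignment.
# Gap character is "-"; the toy sequences below are invented.

def percent_identity(aln_a, aln_b):
    """Percent identity over alignment columns where both sequences have a residue."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aln_a, aln_b):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def conserved_positions(aln_ref, aln_other, ref_positions):
    """For 1-based positions in the ungapped reference, report
    (reference residue, aligned residue, is_conserved)."""
    result = {}
    ref_index = 0
    for a, b in zip(aln_ref, aln_other):
        if a == "-":
            continue  # gap in reference does not advance its numbering
        ref_index += 1
        if ref_index in ref_positions:
            result[ref_index] = (a, b, a == b)
    return result


REF = "MKC-TRAHLL"    # hypothetical reference with known "active site" positions
OTHER = "MRCGT-AHIL"  # hypothetical candidate

print(percent_identity(REF, OTHER))
print(conserved_positions(REF, OTHER, {3, 5, 7}))
```

Numbering the residues on the ungapped reference mirrors the way active-site positions are normally quoted against a reference database entry.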

The information collected in step 102 using one or more of the aforementioned tools can be used to select candidate transglutaminases from predicted or known transglutaminase sequences for expression and purification in step 104. In one aspect, step 104 can include rapidly screening expression conditions for a candidate transglutaminase species. One suitable screening method includes using a fragment exchange system to insert a genetic insert encoding the candidate transglutaminase into one or more expression vectors designed for soluble cytosolic or periplasmic expression in a host organism (Geertsma et al. 2011, Biochemistry 50(15): 3272-3278). Other screening methods can also be used.

Step 104 can also include a preliminary screen to identify evidence of expression of a full-length protein with the expected electrophoretic mobility. Where poor or no expression of the full-length, active candidate transglutaminase protein is observed, the candidate transglutaminase sequence can be fused to a chaperone protein (e.g., SlyD) to increase the likelihood that a functional enzyme is expressed. Expression of the candidate transglutaminase can be further optimized by screening different incubation times, incubation temperatures, inducer concentrations, induction times, media types, media volumes, and the like, and combinations thereof.

In general, it should be understood that many viable purification and expression strategies can be employed in step 104 of method 100. In one embodiment, the candidate transglutaminase sequence can be incorporated into a modular expression construct. Example expression constructs for use in step 104 can include chaperone modules, protease cleavage site modules, purification tag modules, detection modules, and the like, and combinations thereof. For example, an expression construct can include one or more SlyD chaperone modules (arranged to produce a transglutaminase-chaperone fusion protein), one or more protease cleavage site modules (flanking the transglutaminase sequence so that the individual modules can be separated after expression), and one or more 8X-histidine tags or other purification modules (for recovering one or more segments of the expressed protein). For expression constructs that include protease cleavage site modules, the expressed protein can be treated with one or more proteases to produce an activated protein. For example, the expressed protein can be treated with Factor Xa protease, trypsin protease, thrombin protease, or another similar protease to cleave any chaperone protein, propeptide sequence, purification tag, and the like from the expressed candidate transglutaminase.

To produce a selected transglutaminase protein, the gene sequence encoding the candidate transglutaminase (including some, none, or all of the predicted signal and propeptide sequences) can be codon-optimized and chemically synthesized for expression in a particular host organism (e.g., E. coli). Expression can be performed according to standard molecular biology protocols as described, for example, in Green et al. 2012, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press.

In the next step 106 of method 100, the candidate transglutaminase expressed and purified in step 104 is screened against a substrate library to identify potential substrates, including acyl-donor sequences, amine-donor sequences, or both. In one aspect, the substrate library can include a plurality of peptide features synthesized in an array by maskless array synthesis (U.S. Patent Publication No. 2015/0185216 to Albert et al., filed December 19, 2014). The peptide features can be prepared from natural amino acids, non-natural amino acids, other molecular building blocks, and the like, and combinations thereof. In addition, the peptides can be in linear, cyclic, or constrained (macrocyclic) form.

As used herein, the terms "peptide," "oligopeptide," or "peptide binder" refer to organic compounds composed of amino acids, which can be arranged in a linear chain (joined together by peptide bonds between the carboxyl and amino groups of adjacent amino acid residues), in a cyclic form, or in a constrained (e.g., "macrocycle") form. The terms "peptide" or "oligopeptide" also refer to shorter polypeptides, i.e., organic compounds composed of fewer than 50 amino acid residues. Macrocycle (or constrained peptide) is used herein in its conventional sense to describe cyclic small molecules, such as peptides of about 500 daltons to about 2,000 daltons.

The term "natural amino acid" refers to one of the 20 amino acids commonly found in proteins and used for protein biosynthesis, as well as to other amino acids that can be incorporated into proteins during translation (including pyrrolysine and selenocysteine). The 20 natural amino acids are histidine, alanine, valine, glycine, leucine, isoleucine, aspartic acid, glutamic acid, serine, glutamine, asparagine, threonine, arginine, proline, phenylalanine, tyrosine, tryptophan, cysteine, methionine, and lysine.

The term "non-natural amino acid" refers to an organic compound that is not among those encoded by the standard genetic code or that is not incorporated into proteins during translation. Non-natural amino acids therefore include amino acids or analogs of amino acids such as, but not limited to: the D-stereoisomers of amino acids, the β-amino analogs of amino acids, homocitrulline, homoarginine, hydroxyproline, homoproline, ornithine, 4-amino-phenylalanine, cyclohexylalanine, α-aminoisobutyric acid, N-methyl-alanine, N-methyl-glycine, norleucine, N-methyl-glutamic acid, tert-butylglycine, α-aminobutyric acid, tert-butylalanine, 2-aminoisobutyric acid, α-aminoisobutyric acid, 2-aminoindane-2-carboxylic acid, selenomethionine, dehydroalanine, lanthionine, γ-aminobutyric acid, and derivatives thereof in which the amine nitrogen has been mono- or di-alkylated.

With continued reference to FIG. 1, step 108 of method 100 includes identifying the best amine-donor substrate sequences, acyl-donor substrate sequences, or both from the substrate library. In one aspect, the activity of a candidate transglutaminase on the library substrates can be measured using one or more direct or coupled assays. In general, a direct assay involves measuring the reactants (e.g., substrates, cofactors, etc.) and products (e.g., isopeptide bonds, deamidated substrates, etc.) of the enzymatic reaction. For example, the progress of an enzymatic reaction can be followed spectrophotometrically by measuring the change over time in the absorbance associated with a reactant or product. Where an enzymatic reaction does not lend itself to measurement by one or more direct assays, or in addition to or as an alternative to a direct assay, a coupled assay can be employed. In a coupled assay, the product of the enzymatic reaction of interest serves as the substrate for another, more easily measured, secondary reaction. Examples of coupled assays include the measurement of redox reactions involving cofactors such as NADP(H) and NAD(H), which participate as products or reactants in the enzymatic reaction of interest. For reactions catalyzed by transglutaminases, a coupled assay that relies on oxidation by glutamate dehydrogenase (GLDH) can be implemented. One example of a GLDH assay uses β-casein as the cross-linking substrate and detects deamidation through the GLDH-mediated oxidation of NADPH. Notably, it can be useful to select an assay that is compatible with a high-throughput screening format so that large numbers of substrates (e.g., more than 1 million) can be studied in parallel.
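As an illustration of how a coupled spectrophotometric readout can be converted into a reaction rate, the sketch below fits a straight line to an absorbance time course at 340 nm and applies the Beer-Lambert law. The assumptions are not taken from this description: a molar absorption coefficient of 6220 M^-1 cm^-1 for NADPH at 340 nm (a commonly cited value), a 1 cm path length, one NADPH oxidized per ammonia released, a coupling reaction that is not rate-limiting, and invented absorbance readings.

```python
# Sketch: NADPH consumption rate from an A340 time course (Beer-Lambert law).
# The readings are hypothetical; EPSILON_NADPH_340 is an assumed literature value.

EPSILON_NADPH_340 = 6220.0  # M^-1 cm^-1, assumed


def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def nadph_oxidation_rate(times_min, a340, path_cm=1.0):
    """Return the NADPH consumption rate in micromolar per minute.

    Absorbance falls as NADPH is oxidized, so the sign of the slope is flipped.
    """
    molar_per_min = -slope(times_min, a340) / (EPSILON_NADPH_340 * path_cm)
    return molar_per_min * 1e6


times = [0, 1, 2, 3, 4]                      # minutes
a340 = [0.800, 0.769, 0.738, 0.707, 0.676]   # hypothetical readings

print(round(nadph_oxidation_rate(times, a340), 2))
```

Under the stated stoichiometric assumption, the NADPH consumption rate approximates the rate of ammonia release by the transglutaminase reaction.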

Using one of the aforementioned direct or indirect assays, step 108 can include using a peptide substrate array to identify specific sequences or motifs recognized by the candidate transglutaminase. For example, the transamidation reaction between millions of unique peptides and a biotinylated amine donor can be quantified in parallel on one or more arrays, and the sequences of the peptides with the highest signal output (i.e., the best substrates) can be determined. Thereafter, in the next step 110 of method 100, the best substrates can be resynthesized, scanned, and tested for transglutaminase activity in separate assays (on an array or in solution). Step 110 thus includes characterizing the candidate transglutaminase in the presence of the best substrates identified in step 108.

Characterization of a candidate transglutaminase can include determining parameters such as specificity, selectivity, affinity, activity, and the like. In addition, two or more candidate transglutaminases can be characterized with respect to orthogonality, that is, with respect to their ability to act on the same substrates. Where two different transglutaminases cannot act on the same substrate, the two transglutaminases can be said to be orthogonal. Where two different transglutaminases can act on one or more of the same substrates, but with different degrees of activity, the two transglutaminases can be said to be semi-orthogonal. In one aspect, a peptide substrate array can deliver a readout for all possible 5-mer peptide sequences at once. The data collected from the substrate library can therefore be used to identify differences in the substrate specificity of each candidate transglutaminase and to identify orthogonal, semi-orthogonal, and non-orthogonal transglutaminases.
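The orthogonality categories can be made concrete with a small sketch. The numeric cutoffs are assumptions introduced only for illustration (a substrate counts as "acted on" at or above `active_cutoff`, and activities within `similar_fold` of each other count as the same degree of activity); the peptide names and signals are invented. The sketch also computes the size of a complete 5-mer library, assuming all 20 natural amino acids are used at every position, which a given array design may not do.

```python
# Sketch: classify a pair of enzymes as orthogonal, semi-orthogonal, or
# non-orthogonal from per-substrate activity. Cutoffs and data are hypothetical.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_5MERS = len(AMINO_ACIDS) ** 5  # number of distinct 5-mers from 20 residues


def classify_pair(activity_a, activity_b, active_cutoff=0.1, similar_fold=2.0):
    """Classify two enzymes given dicts mapping substrate -> normalized signal."""
    if active_cutoff <= 0:
        raise ValueError("active_cutoff must be positive")
    # Substrates on which both enzymes show activity at or above the cutoff.
    shared = [s for s in activity_a
              if s in activity_b
              and activity_a[s] >= active_cutoff
              and activity_b[s] >= active_cutoff]
    if not shared:
        return "orthogonal"
    for s in shared:
        high = max(activity_a[s], activity_b[s])
        low = min(activity_a[s], activity_b[s])
        if high / low < similar_fold:
            return "non-orthogonal"  # comparable activity on a shared substrate
    return "semi-orthogonal"         # shared substrates, clearly different activity


enzyme_1 = {"pep1": 1.0, "pep2": 0.02}
enzyme_2 = {"pep1": 0.01, "pep2": 0.9}
enzyme_3 = {"pep1": 0.15, "pep2": 0.9}
enzyme_4 = {"pep1": 0.8, "pep2": 0.03}

print(N_5MERS)
print(classify_pair(enzyme_1, enzyme_2))
print(classify_pair(enzyme_1, enzyme_3))
print(classify_pair(enzyme_1, enzyme_4))
```

Other decision rules are equally defensible; the point is only that the three categories follow from comparing activity profiles measured on the same substrate set.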

Step 110 of method 100 can also include characterizing the ability of the candidate transglutaminase to perform site-specific labeling on a protein substrate. In one aspect, the best substrate sequences identified in step 108 can be further analyzed in step 110 both on an array and in solution. Experiments can also be performed to quantify the cross-reactivity of the candidate transglutaminase with different substrates. In one aspect, a protein scaffold can be useful in a labeling scheme, because a loop containing an epitope (i.e., a substrate sequence) can be grafted onto the scaffold for presentation to a binder or an enzyme. One scheme that includes the use of a scaffold is described in PCT Application Publication No. WO 2012/150321 to Andres et al., filed May 4, 2012. An example scaffold can advantageously include one or more FK506-binding protein (FKBP) domains as sites for grafting epitope-containing loops.

Labeling of a scaffold protein by a candidate transglutaminase can be carried out under a variety of conditions. Factors that can be varied in labeling experiments include the ratio of substrate to transglutaminase, the ratio of one substrate to another, the labeling time, the pH, and the like. Notably, a substrate refers to any peptide, protein, or other structure that includes one or more amine-donor or acyl-donor substrate sequences. Example substrates include a scaffold protein having an acyl-donor or amine-donor substrate sequence grafted onto it, a detectable label conjugated or otherwise bound to an acyl-donor or amine-donor substrate sequence, an isolated acyl-donor or amine-donor substrate sequence, and the like, and combinations thereof. Labeling yield can be measured over time using standard techniques, such as sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) combined with optical (e.g., bright-field, fluorescence) detection. For example, a first substrate can include one or more detectable labels. Cross-linking of the first substrate to a second substrate (e.g., a protein scaffold) can then be analyzed by identifying a molecular weight shift on an SDS-PAGE gel and subsequently detecting the label within the gel.
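One simple way to express labeling yield from gel densitometry is the fraction of the scaffold signal found in the shifted (labeled) band. The sketch below assumes that the band intensities have already been quantified and are proportional to protein amount; the time points and intensities are invented.

```python
# Sketch: labeling yield over time from SDS-PAGE band intensities.
# Intensities are hypothetical densitometry values in arbitrary units.

def labeling_yield(shifted, unshifted):
    """Fraction of substrate found in the shifted (labeled) band."""
    total = shifted + unshifted
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return shifted / total


# (time in minutes, shifted band intensity, unshifted band intensity)
time_course = [(0, 0.0, 100.0), (5, 40.0, 60.0), (15, 75.0, 25.0), (30, 90.0, 10.0)]

yields = {t: labeling_yield(s, u) for t, s, u in time_course}
for t in sorted(yields):
    print(f"{t:>3} min  {yields[t]:.0%}")
```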

Labels for use with embodiments of the present invention include any suitable label that can be combined with a transglutaminase substrate. Examples of suitable labels include fluorescent labels, chemiluminescent labels, radioactive labels, chemical labels (e.g., labels incorporating "click" chemistry), haptens, toxins, and the like, and combinations thereof. More generally, a suitable label is compatible with at least one substrate of the transglutaminase (e.g., an acyl-donor substrate or an amine-donor substrate) in that the label does not abolish the ability of the transglutaminase to act on the labeled substrate. In addition, a suitable label can produce a signal that is detectable relative to the unlabeled transglutaminase substrate. Specific examples of detectable labels for use in any suitable embodiment herein can include fluorescein, rhodamine, Texas Red, phycoerythrin, Oregon Green (e.g., Oregon Green 488, Oregon Green 514, etc.), Alexa Fluor 488, Alexa Fluor 647 (Molecular Probes, Eugene, Oregon), Cy3, Cy5, Cy7, biotin, ruthenium, DyLight fluorophores (including, but not limited to, DyLight 680), CW 800, trans-cyclooctene, tetrazine, methyltetrazine, and the like. Examples of haptens include biotin, digoxigenin, dinitrophenyl, and the like. Examples of toxins include amatoxins (e.g., amanitin), maytansinoids, and the like.

In certain embodiments, step 110 can also include determining and characterizing the three-dimensional (3D) crystal structure of the transglutaminase to provide further insight into its properties. The crystal structure of the transglutaminase can be determined in the presence or absence of one or more substrates, cofactors, and the like. Analysis of the crystal structure can provide insight into possible positions for site-directed mutagenesis to improve the performance of the transglutaminase. Crystallization of a candidate transglutaminase with a specific substrate sequence can further reveal interactions between the transglutaminase and the substrate sequence, informing modifications to either or both of them in order to tailor the performance of the transglutaminase to a specific application. In addition, analysis of the crystal structure can serve as independent confirmation of the reliability of array-based substrate discovery (see Example 5).

In step 112 of method 100, the substrate sequences identified in step 108 and characterized in step 110 are selected for use in downstream applications of the selected candidate transglutaminase. In general, a particular acyl-donor or amine-donor substrate sequence can be unique to a given transglutaminase. For a given application, it can therefore be useful to first select the transglutaminase and then select one or more substrate sequences. For applications in which specificity and selectivity are important (e.g., orthogonal labeling using two or more transglutaminases), step 112 can include selecting substrate sequences that are specifically and selectively labeled by the selected transglutaminase. Other applications, however, may benefit from selecting substrate sequences that can be acted on by more than one transglutaminase. Substrates can also be selected to achieve a particular degree of transglutaminase activity, where it is useful to achieve faster or slower reaction times. To further tailor the selected substrate sequences, method 100 can return to step 106 after step 112 for additional rounds of screening. In that case, the selected substrates can be extended, matured, and the like in subsequent rounds of screening on peptide arrays. Example methods for the extension and maturation of peptide sequences are described in U.S. Patent Publication No. 2015/0185216 to Albert et al., filed December 19, 2014.

In other embodiments, it can be useful to provide site-specific labeling with a transglutaminase that has non-selective activity. A non-selective transglutaminase can be useful if the substrate is not produced recombinantly, if the labeling site and labeling ratio can be controlled, if the labeling site and labeling ratio are not critical for the application at hand, and so on. One example of labeling with a non-selective transglutaminase is the conjugation of a payload to deglycosylated or glycosylated IgG. However, a transglutaminase with non-specific activity can be limited to a narrow range of possible applications. In other cases, it can therefore be useful to provide a transglutaminase that has specific activity for only one particular substrate or for a set of similar substrates.

In summary, method 100 can be used to identify and characterize one or more candidate transglutaminases together with one or more corresponding substrates. A putative or known transglutaminase can be expressed and screened on a substrate library to identify preliminary substrate sequences that give rise to the desired transglutaminase activity. The best substrates can then be selected and optionally refined in an iterative manner, thereby producing transglutaminase-substrate combinations that can be implemented for a variety of applications.

Examples

Example 1: Identification of the Kutzneria albida microbial transglutaminase

Establishing viable, robust, enzymatic, industrial-scale processes for site-specific conjugation schemes (such as antibody-drug conjugates) places high demands on the coupling enzyme. Among other factors, it can be useful for such a scheme to have high reaction rates, conjugation efficiency, and substrate specificity. In addition, it can be useful for such a scheme to be economical in production, which includes enzymes with low molecular weight, enzymes that are independent of cofactors, and the like, and combinations thereof. For the discovery of a new microbial transglutaminase, a search was performed for homologs that might meet all of the criteria mentioned, using the amino acid sequence of the Streptomyces mobaraensis protein-glutamine γ-glutamyltransferase as the query. This search yielded the putative gene product KALB_7456 from the bacterium Kutzneria albida DSM 43870, a spore-forming Gram-positive bacterium that was sequenced in 2014 (Rebets et al. 2014, BMC Genomics 15: 885).

The web interface of the NCBI Protein BLAST tool was used to search for sequences similar to the MTG of Streptomyces mobaraensis. The amino acid sequence of the Streptomyces mobaraensis protein-glutamine γ-glutamyltransferase (UniProt accession number P81453) was entered as the query. The amino acid sequence of the Streptomyces mobaraensis protein-glutamine γ-glutamyltransferase is as follows:

Manual screening of the results for E-values of less than 10^-10 and for polypeptide sequences shorter than Streptomyces mobaraensis MTG yielded the putative gene product KALB_7456 (GenBank accession number AHI00814.1; UniProt accession number W5WHY8) from the bacterial strain Kutzneria albida DSM 43870. The amino acid sequence of the gene product KALB_7456 is as follows:

A sequence alignment of the Streptomyces mobaraensis and Kutzneria albida sequences with Clustal Omega 1.2.1 yielded a value of 32% in the percent identity matrix and identified conservation of the catalytically active residues of Streptomyces mobaraensis MTG (C140, R331, and H350, numbered according to P81453 (SEQ ID NO: 5)). The propeptide and signal sequences of the putative Kutzneria albida microbial transglutaminase were predicted using the ProP 1.0 Server from the Technical University of Denmark. The only predicted propeptide cleavage site was VAAPTPR/AP, which had an above-threshold score (0.513); the slash between the amino acids R and A marks the predicted cleavage site.

A comparison of the primary structures of the Streptomyces mobaraensis (SEQ ID NO: 5) and Kutzneria albida (SEQ ID NO: 6) gene products shows 30% similarity with distinct conservation of the active site residues (FIG. 2A), indicating that enzymatic structure and function may be retained. The entire Kutzneria albida gene product is significantly smaller than Streptomyces mobaraensis MTG, with a calculated total molecular weight of 30.1 kDa. Because Streptomyces mobaraensis MTG is produced as an inactive zymogen and is processed by extracellular proteases to yield the 38 kDa active form, a similar activation mechanism was predicted for the putative Kutzneria albida MTG, and the ProP 1.0 server was used to analyze the probability of signal and propeptide sequences in the N-terminal region of the protein (FIGS. 2B and 2C). The sequence VAAPTPR/AP is the only predicted propeptide cleavage site, with cleavage occurring between the arginine and alanine residues, as indicated by the slash. The sequence VAAPTPR/AP corresponds to the dispase site SAGPSFR/AP in Streptomyces mobaraensis MTG, but is presumably not reactive to dispase, because phenylalanine is an essential residue in the recognition motif of that enzyme. In addition, the ProP 1.0 server predicted a signal peptide with a high-probability cleavage site at GLPTLIA/TT. However, the predicted signal peptide cleavage site bears no sequence similarity to the significantly longer Streptomyces mobaraensis MTG pre-sequence or to other known signal peptides. Based on the predicted signal peptide and propeptide cleavage sites, the molecular weights of the mature Kutzneria albida transglutaminase, the proenzyme, and the preproenzyme were calculated to be 26.4 kDa, 27.7 kDa, and 30.1 kDa, respectively.
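The kind of calculation behind such molecular weight estimates can be sketched as follows. The sketch splits a precursor at sites written in the same LEFT/RIGHT slash notation used above and sums average residue masses plus one water per chain. The precursor sequence is a short invented toy that merely embeds the two motifs; it is not the KALB_7456 sequence, so the printed masses do not correspond to the values reported here. The residue masses are standard average values.

```python
# Sketch: split a toy precursor at predicted cleavage sites and estimate
# average molecular weights. The toy sequence is hypothetical.

AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528


def molecular_weight(seq):
    """Average molecular weight in daltons of an unmodified linear peptide."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in seq) + WATER


def split_at(seq, site):
    """Split seq at a site written as 'LEFT/RIGHT'; the cut falls at the slash."""
    left, right = site.split("/")
    start = seq.find(left + right)
    if start < 0:
        raise ValueError(f"site {site!r} not found")
    cut = start + len(left)
    return seq[:cut], seq[cut:]


TOY_PREPROENZYME = "MKKGLPTLIATTSEVAAPTPRAPDGW"  # invented for illustration

signal_peptide, proenzyme = split_at(TOY_PREPROENZYME, "GLPTLIA/TT")
propeptide, mature = split_at(proenzyme, "VAAPTPR/AP")

for name, seq in [("preproenzyme", TOY_PREPROENZYME),
                  ("proenzyme", proenzyme),
                  ("mature", mature)]:
    print(f"{name:<13}{seq:<28}{molecular_weight(seq):10.1f} Da")
```

Applied to a full-length sequence, the same three slices would correspond to the preproenzyme, proenzyme, and mature enzyme forms discussed above.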

Example 2: Parallel evaluation of constructs for recombinant production of KalbTG

To rapidly screen expression conditions for the putative Kutzneria albida transglutaminase (KalbTG), we used a fragment exchange system to insert the synthetic genetic insert into a variety of expression vectors designed for soluble cytosolic or periplasmic expression in Escherichia coli (Geertsma et al. 2011, Biochemistry 50(15): 3272-3278).

Preliminary screening at the 5 ml scale clearly demonstrated that a protein with the electrophoretic mobility expected for the full-length KalbTG fusion was expressed, and that fusion to a tandem SlyD chaperone (Scholz et al. 2005, Journal of Molecular Biology 345(5): 1229-1241) yielded the highest amount of soluble protein among all constructs tested (FIG. 3A). Expression of this construct was further optimized by screening different incubation times and temperatures, isopropyl β-D-1-thiogalactopyranoside (IPTG) inducer concentrations and induction times, and media types and volumes. Referring to FIGS. 3B and 3C, the modular nature of the selected fusion construct provides the SlyD fusion protein, the proenzyme, and the activated enzyme through a combination of sequential purification and proteolytic cleavage steps. Starting from the N-terminus, the expression construct 200 includes two consecutive SlyD chaperones 202, a Factor Xa protease cleavage site (C₁), the KalbTG propeptide 204, a trypsin protease cleavage site (C₂), the KalbTG enzyme 206, and an 8X-histidine tag 208. The SlyD chaperones 202 are cleaved from the expression construct 200 by Factor Xa protease 210. The propeptide 204 is then cleaved from the KalbTG enzyme 206 by trypsin protease 212, yielding the activated form of the KalbTG construct 200. The purified and activated enzyme remains stable at 4°C and over multiple freeze-thaw cycles. The melting point of the purified and activated enzyme was determined by differential scanning calorimetry (DSC) to be 48.9°C. It should be understood that the method described herein represents one of many viable purification strategies. In addition, the parallel cloning scheme described makes it possible to re-evaluate different constructs and laboratory-scale production processes in an efficient and economical manner.

For the production of KalbTG, the gene sequence encoding the putative Kutzneria albida microbial transglutaminase (including the predicted signal and propeptide (KalbTGpp), including only the predicted propeptide (kalbTGt3), excluding the predicted signal and propeptide (kalbTGt1), or excluding the predicted signal and with an additional Factor Xa cleavage site inserted after the propeptide (kalbTGt2)) was codon-optimized for E. coli expression (Roche Sequence Analysis Web interface), chemically synthesized (GeneArt, ThermoFisher, Regensburg), and cloned via fragment exchange (Fx) cloning (Geertsma, et al. 2011, Biochemistry 50(15): 3272-3278) into a vector providing two N-terminal moieties of the sensitive-to-lysis D chaperone (SlyD, UniProt entry P0A9K9, truncated after Asp165 (Scholz et al. 2005, Journal of Molecular Biology 345(5): 1229-1241)), followed by a Factor Xa protease cleavage site, and providing a C-terminal 8X-His tag. The amino acid sequence of SlyD truncated after Asp165 is as follows:

The vector is based on the pQE-80 series from Qiagen, provides IPTG-inducible protein expression from the T5 promoter, and confers resistance to ampicillin. The expression construct used for all experiments described in this work is called EcSlyD2-Xa-KalbTGt3-8xHis. In addition, for the initial expression screen, fragment exchange cloning was performed into vectors conferring N-terminal fusions to an 8X-His tag, the dsbA and ompT signal peptides, a single SlyD or FkpA chaperone moiety, and maltose binding protein (MBP). Plasmid preparation and transformation of the expression plasmids into chemically competent E. coli BL21 Tuner cells were performed according to standard molecular biology protocols (Green, et al., 2012. "Molecular Cloning: A Laboratory Manual", Cold Spring Harbor Laboratory Press).

To prepare active KalbTG enzyme, 0.4 to 1 liter of Terrific Broth (TB) medium was inoculated at a ratio of 1:50 with an overnight culture of E. coli BL21 Tuner carrying EcSlyD2-Xa-KalbTGt3-8xHis-tag. The amino acid sequence of the EcSlyD2-Xa-KalbTGt3-8xHis-tag expression construct is as follows:

Cells were incubated in baffled shake flasks at 37°C and 180 rpm, and protein expression was induced with 1 mM IPTG after the cell density had reached an optical density at 600 nm (OD₆₀₀) of 0.8-1.2. Cells were harvested by centrifugation (7878 x g for 30 min at 4°C). The supernatant was discarded, and the cell pellet was either stored at -80°C or immediately processed for nickel immobilized metal affinity chromatography (Ni²⁺-IMAC).
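The culture set-up described above reduces to two small checks: how much overnight culture a 1:50 inoculation needs, and whether a measured OD₆₀₀ falls in the stated induction window. The sketch below is illustrative only; the function names are hypothetical, and it assumes that "1:50" means one part overnight culture per 50 parts medium.

```python
def inoculum_volume_ml(medium_volume_ml: float, ratio: float = 50.0) -> float:
    """Overnight-culture volume for a 1:ratio inoculation.

    Assumption: 1:50 is read as 1 part inoculum per 50 parts medium.
    """
    if medium_volume_ml <= 0 or ratio <= 0:
        raise ValueError("volume and ratio must be positive")
    return medium_volume_ml / ratio


def ready_to_induce(od600: float, low: float = 0.8, high: float = 1.2) -> bool:
    """True when OD600 lies inside the induction window given in the text."""
    return low <= od600 <= high


if __name__ == "__main__":
    # The text uses 0.4 to 1 liter of TB medium.
    for medium_ml in (400.0, 1000.0):
        print(medium_ml, "ml medium ->", inoculum_volume_ml(medium_ml), "ml inoculum")
    print("induce at OD600 1.0:", ready_to_induce(1.0))
```

Under that reading, the 0.4 to 1 liter cultures correspond to 8 to 20 ml of overnight culture.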

For the subsequent Ni²⁺-IMAC purification of EcSlyD2-Xa-KalbTGt3-8xHis, the cell pellet was resuspended in 30-50 ml of phosphate-buffered saline (PBS) in the presence of lysozyme and DNase I. The cells were disrupted by high-pressure homogenization at 2 kbar. To remove cell debris, the suspension was centrifuged (17,210 x g for 30 min at 4°C).

The supernatant, cleared of cell debris, was filtered through a 0.45 µm polyethersulfone (PES) membrane and loaded onto a 5 ml HisTrap column. The column was washed with at least 5 column volumes of PBS, and the His-tagged protein was eluted with a linear gradient of 0-250 mM imidazole in PBS (30 ml, 5 ml min⁻¹). Protein-containing 3 ml fractions, identified by absorbance at 280 nm, were collected, diluted in PBS, and concentrated using an Amicon Ultra concentrator (10,000 MWCO; 5,000 x g for 15-30 min). The protein concentration of the fractions was determined by Bradford assay (BioRad, according to the manufacturer's instructions). For each sample, 5-10 µg of protein was analyzed by SDS-PAGE (ThermoFisher Novex, according to the manufacturer's instructions). The purified protein was divided into 200 µl aliquots, frozen by brief incubation in liquid nitrogen, and stored at -80°C.
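As a rough aid for relating collected fractions to the elution gradient, the nominal imidazole concentration at a given elution volume can be computed from the stated parameters (0-250 mM over 30 ml at 5 ml min⁻¹). This is a minimal sketch with hypothetical function names; it gives the programmed concentration only and ignores system dead volume and mixing delay, which are instrument-specific.

```python
def imidazole_mM(elution_volume_ml: float,
                 start_mM: float = 0.0,
                 end_mM: float = 250.0,
                 gradient_volume_ml: float = 30.0) -> float:
    """Nominal imidazole concentration at a point in a linear gradient."""
    # Clamp to the gradient boundaries so values before or after it are flat.
    fraction = min(max(elution_volume_ml / gradient_volume_ml, 0.0), 1.0)
    return start_mM + fraction * (end_mM - start_mM)


def gradient_duration_min(gradient_volume_ml: float = 30.0,
                          flow_ml_per_min: float = 5.0) -> float:
    """Time needed to run the gradient at a constant flow rate."""
    return gradient_volume_ml / flow_ml_per_min


if __name__ == "__main__":
    print("gradient duration (min):", gradient_duration_min())
    # Nominal concentration at the start of each 3 ml fraction.
    for start_ml in range(0, 30, 3):
        print(f"fraction starting at {start_ml:2d} ml: {imidazole_mM(start_ml):6.1f} mM")
```

With the stated values, the whole gradient runs in 6 minutes and spans ten 3 ml fractions.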

To cleave the SlyD chaperones and the propeptide from EcSlyD2-Xa-KalbTGt3-8xHis-tag, the protein was immobilized on a 5 ml HisTrap column and digested on-column, first with Factor Xa and then with trypsin. Factor Xa was applied at 1 µg per 50 µg of total protein and incubated on the column for 1.5 hours. The protease and the cleaved SlyD were washed from the column with PBS.

The amino acid sequence of the EcSlyD2-Xa-KalbTGt3-8xHis-tag expression construct after Factor Xa digestion is as follows:

After trypsin digestion, the amino acid sequence of the KalbTG enzyme (KalbTGt3) is as follows:

For crystal structure analysis of KalbTG, the His-tagged protein was eluted after Factor Xa digestion with a linear 0-250 mM imidazole gradient, and a polishing step was performed by size exclusion chromatography (GE Superdex 200 pg 16/60, PBS). Alternatively, to obtain an active and pure enzyme preparation, activation was performed as follows: 200 µg ml⁻¹ trypsin was applied to the HisTrap column and incubated for 15-30 min. The protease and the cleaved propeptide were washed from the column with PBS, and the digested KalbTG was collected in the same manner as described above. To remove high-molecular-weight impurities from the active KalbTG, the enzyme preparation was filtered through an Amicon Ultra concentrator (50,000 MWCO). The filtrate was tested for activity with the GLDH-coupled assay and analyzed for purity by SDS-PAGE, as shown in FIG. 3C. The remaining filtrate was divided into 200 µl aliquots, frozen in liquid nitrogen, and stored at -80°C.

In another embodiment of the protocol for producing active KalbTG enzyme, E. coli BL21 Tuner carrying the plasmid pQE-EcSlyD2-Xa-KalbTGt1-8H (ColE1 origin; IPTG-inducible T5 promoter) was inoculated into 10 L of a standard E. coli fermentation medium similar to Terrific Broth (yeast extract, K₂HPO₄, NH₄Cl, glycerol, antifoam, MgSO₄·7H₂O, H₃PO₄, NaOH) and containing an additional 1 g of NH₄Cl per liter. The sequence of the EcSlyD2-Xa-KalbTGt1-8xHis construct is as follows:

The EcSlyD2-Xa-KalbTGt1-8xHis construct has a modular composition similar (but not identical) to that of the construct shown in FIG. 3B, one notable difference being that the EcSlyD2-Xa-KalbTGt1-8xHis construct omits the KalbTG propeptide 204 and the trypsin protease cleavage site (C₂).

Fermentation was carried out at 35°C for 26 h until an OD₆₀₀ of 44 was reached. The cells were harvested and resuspended in a buffer containing 50 mM Tris-HCl pH 8.0, 1 mM EDTA, 1 mM DTT, and 10 mM (NH₄)₂SO₄. The cells were disrupted with a high-pressure homogenizer at 800 bar. The resulting cell extract was pretreated with 1-3% Polymin-G20 and then loaded onto a Q-Sepharose XL column (strong anion exchange matrix; GE Healthcare Life Sciences) at a protein concentration of approximately 30 mg/ml. The bound protein was washed with 20 mM Tris-HCl pH 8.0, 1 mM EDTA, 1 mM DTT, 10 mM (NH₄)₂SO₄, and 150 mM NaCl, and then eluted with a 150-500 mM NaCl gradient over 30 column volumes. The eluate was dialyzed (10 kDa molecular weight cutoff) against 20 mM Tris-HCl pH 8.0, 0.1 mM EDTA, 0.1 mM DTT, 10 mM (NH₄)₂SO₄, and 500 mM NaCl, concentrated, and loaded onto a Ni-NTA column. The bound His-tagged protein was washed with 20 mM Tris-HCl pH 8.0, 0.1 mM EDTA, 0.1 mM DTT, 10 mM (NH₄)₂SO₄, 500 mM NaCl, and 25 mM imidazole, and eluted with a 25-200 mM imidazole gradient over 20 column volumes. The purified protein was dialyzed (10 kDa molecular weight cutoff) against 20 mM Tris-HCl, 1 mM EDTA, 1 mM DTT, and 10 mM (NH₄)₂SO₄ (pH 8.0), concentrated to 1.77 mg/ml, analyzed by SDS-PAGE and GLDH activity assay, and frozen at -80°C in 10 mg aliquots. Prior to use, (NH₄)₂SO₄ was removed by dialysis using a 10 kDa molecular weight cutoff filter. Factor Xa digestion was performed as described herein to remove the 2xSlyD moiety from the KalbTG construct, thereby generating the KalbTG enzyme (KalbTGt1) having the following sequence:

One aspect of the protocol for expressing the KalbTG enzyme derived from the EcSlyD2-Xa-KalbTGt1-8xHis construct is the addition of a source of ammonium ions (i.e., (NH₄)₂SO₄ or NH₄Cl), a natural inhibitor of the KalbTG enzyme, to the fermentation broth and the purification buffers. The use of ammonium (or ammonia) enables production of the KalbTG fusion protein without the need for expression or downstream cleavage of the propeptide sequence. The autocatalytic activity of the KalbTG enzyme is reversibly inhibited by the presence of ammonium ions (from ammonium chloride in solution) until the final dialysis, which removes the ammonium ions before the KalbTG enzyme is used. Surprisingly, a purification process that includes the use of ammonium chloride results in an increase in KalbTG enzyme activity of up to about 9-fold as measured by the GLDH assay, making the KalbTG enzyme very competitive with currently commercially available MTG (see Example 3, Table 4). Therefore, it can be useful to express, purify, or otherwise isolate transglutaminases in the presence of ammonium chloride, another ammonium salt, or another source of ammonia.

In certain embodiments, it can be useful to select a source of ammonium whose counterion has a neutral effect on the overall process. For example, in experiments where the ammonium counterion was sulfate, a negative effect on expression was observed compared with using chloride as the counterion. However, the use of sulfate as the counterion may have little to no negative effect on the purification steps. Therefore, it can be useful to first determine the effect of the counterion on the expression and purification of a given transglutaminase. Additionally, the concentration of ammonium ions present during the expression or purification of a given transglutaminase can be varied. In one embodiment, the ammonium may be present at a concentration of at least about 10 μM. In another embodiment, the ammonium may be present at a concentration of at least about 100 μM. In another embodiment, the ammonium may be present at a concentration of at least about 1 mM. In another embodiment, the ammonium may be present at a concentration of at least about 10 mM.

Example 3: Use of peptide arrays for transglutaminase substrate discovery

To identify potential acyl-donor and amine-donor substrates for the Kutzneria albida transglutaminase, a high-throughput screen was performed on 5-mer peptide arrays (FIGS. 4A and 4B). An activity of at least 1.65 U/mg for both the SlyD-fused and the mature KalbTG enzyme was confirmed with a commercial assay (Zedira MTG-ANiTA-KIT; compared with 4.3 U mg⁻¹ for the MTG supplied with the kit and a blank value of 0.07 U/mg using BSA). The assay uses β-casein as the cross-linking substrate and detects deamidation through glutamate dehydrogenase-dependent NADPH oxidation. Specific recognition motifs were identified by assaying KalbTG on peptide arrays prepared by maskless array synthesis (U.S. Patent Publication No. 2015/0185216 to Albert et al., filed December 19, 2014). The efficiency of the transamidation reaction between 1.4 million unique 5-mer peptides and a biotinylated amine donor was quantified in parallel on two arrays, and the sequences of the peptides with the highest turnover were determined (FIG. 4A). The nine best peptides were resynthesized and tested for KalbTG activity in a separate GLDH-coupled assay. The results of the GLDH-coupled assay are shown in Table 1 along with the corresponding array data. It should be noted that the array signal for the sequence MLAQG (SEQ ID NO: 13) is marked as not applicable (n.a.) because this sequence was not included on the array. Similarly, the measurement of background signal applies only to experiments performed on the array.

KalbTG activity was obtained by measuring the incorporation of biotinylated amine donor on the peptide array and, in the GLDH-coupled assay, the rate of NADH oxidation at 340 nm and 37°C in the presence of amine-donor substrate (500 µM), using 100 µM of each of the nine best-performing array-selected glutamine-containing substrates and of two Streptomyces mobaraensis MTG glutamine-containing substrates. A strong correlation was observed between the best array-selected substrates and their performance in the GLDH-coupled assay, whereas KalbTG showed no activity toward the preferred S. mobaraensis MTG substrates DYALQ (SEQ ID NO: 22) and MLAQG (SEQ ID NO: 13). Testing of the top sequences in Table 1 confirmed YRYRQ (SEQ ID NO: 1) and RYRQR (SEQ ID NO: 14) as the best-performing 5-mer substrates, with turnover rates of 3.52 ± 0.08 pmol NADH s⁻¹ and 3.60 ± 0.12 pmol NADH s⁻¹, respectively. The lysine-containing substrate YKYRQ (SEQ ID NO: 20) exhibited the highest turnover rate in the GLDH assay (4.00 ± 0.18 pmol s⁻¹). Without being bound by theory, this may be an artifact of lysine cross-reactivity, and the sequence was therefore omitted from further analysis. Surprisingly, no activity was detected with the well-known S. mobaraensis MTG recognition motif MLAQGS (SEQ ID NO: 23), represented by the 5-mer sequence MLAQG (SEQ ID NO: 13), or with the S. mobaraensis MTG substrate DYALQ (SEQ ID NO: 22).

A second round of maturation on the peptide array yielded APRYRQRAA (SEQ ID NO: 24) as the best-performing 9-mer substrate. This sequence was then resynthesized as a biotinylated peptide to serve as the acyl donor for discovering an optimized lysine recognition motif on the 5-mer peptide array (FIG. 4B). Again, the six best lysine-containing peptides were resynthesized and tested in the KalbTG in-solution activity assay, using a peptide containing the optimized glutamine recognition sequence YRYRQ (SEQ ID NO: 1) as the acyl donor (Table 2). It should be understood that, because cadaverine was omitted from the array, no array signal can be calculated for cadaverine.

Referring to Table 2, KalbTG activity was obtained by measuring the incorporation of biotinylated glutamine donor and, in the GLDH-coupled assay, the rate of NADH oxidation at 340 nm and 37°C in the presence of glutamine-donor substrate (200 µM), using 100 µM of each of the following: i) the six best-performing array-selected lysine substrates, ii) cadaverine, and iii) the preferred MTG lysine substrate ARSKL (SEQ ID NO: 30). The highest turnover in the GLDH assay was observed with the sequence RYESK (SEQ ID NO: 2) (4.47 ± 0.16 pmol NADH s⁻¹). This is a small but significant increase relative to cadaverine (3.51 ± 0.12 pmol s⁻¹) or ARSKL (SEQ ID NO: 30) (3.87 ± 0.31 pmol s⁻¹), a peptide previously established on peptide arrays as a preferred MTG lysine donor motif. Additional details regarding the S. mobaraensis MTG substrate peptide ARSKL (SEQ ID NO: 30) can be found in U.S. Provisional Patent Application Serial No. 62/094,495 to Albert et al., filed December 19, 2014.

For the construction of the peptide array, a library of 1,360,732 unique 5-mer peptides was designed using all combinations of 18 natural amino acids (all except cysteine and methionine), excluding any peptide containing a dimer or longer repeat of the same amino acid and any peptide containing a dipeptide selected from the sequences HR, RH, HK, KH, RK, KR, HP, and PQ. The library was synthesized in duplicate on the same array using maskless light-directed peptide array synthesis. Each 5-mer peptide was flanked at both the N-terminus and the C-terminus by a 3-amino-acid linker synthesized from a 3:1 mixture of glycine and serine.
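The library size follows directly from the design rules when they are read as exclusions: 18 amino acids, no two identical residues in a row, and none of the eight listed dipeptides. The sketch below counts the permitted 5-mers with a small dynamic program over the last residue; the function and constant names are illustrative and do not come from the source.

```python
# 20 natural amino acids minus cysteine (C) and methionine (M).
AMINO_ACIDS = "ADEFGHIKLNPQRSTVWY"
EXCLUDED_DIPEPTIDES = {"HR", "RH", "HK", "KH", "RK", "KR", "HP", "PQ"}


def is_allowed(peptide: str) -> bool:
    """True if no adjacent pair is a same-residue repeat or an excluded dipeptide."""
    for first, second in zip(peptide, peptide[1:]):
        if first == second or first + second in EXCLUDED_DIPEPTIDES:
            return False
    return True


def count_library(length: int = 5) -> int:
    """Count permitted peptides of the given length.

    counts[x] holds the number of permitted peptides of the current
    length that end in residue x; each step extends them by one residue.
    """
    counts = {aa: 1 for aa in AMINO_ACIDS}
    for _ in range(length - 1):
        counts = {
            nxt: sum(n for prev, n in counts.items() if is_allowed(prev + nxt))
            for nxt in AMINO_ACIDS
        }
    return sum(counts.values())


if __name__ == "__main__":
    print(count_library())  # 1360732
```

The result reproduces the 1,360,732 peptides stated above, which supports reading the repeat and dipeptide rules as exclusions applied to adjacent residues.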

To test the specificity of KalbTG for glutamine-containing substrates, N-(biotinyl)cadaverine was used as a surrogate for the lysine substrate in order to biotinylate glutamine peptides on the peptide array. The KalbTG labeling reaction was performed in a SecureSeal™ chamber (Grace Bio-Labs) at 37°C for 45 minutes in 1200 μL of 100 mM Tris-HCl (pH 8), 1 mM DTT, 50 μM N-(biotinyl)cadaverine, and 0.2 ng μl⁻¹ KalbTG. After incubation, the chamber was removed and the array was washed for 1 minute in 20 mM Tris-HCl (pH 7.8), 0.2 M NaCl, and 1% SDS, followed by a 1-minute wash in 20 mM Tris-HCl. Biotin attached to the array was stained with 0.3 μg ml⁻¹ Cy5-streptavidin in 10 mM Tris-HCl (pH 7.4), 1% alkali-soluble casein, and 0.05% Tween-20 for 1 hour at room temperature. Cy5 fluorescence intensity was measured with a fluorescence scanner at a resolution of 2 μm and a wavelength of 635 nm.

To test the specificity of KalbTG for lysine substrates, a chemically synthesized Z-APRYRQRAAGGG-PEG-biotin peptide (which includes the sequence APRYRQRAAGGG (SEQ ID NO: 31)) was used as the glutamine-containing substrate to biotinylate lysine-containing peptides. Array biotinylation was performed as described above, using 0.1 ng μl⁻¹ KalbTG and 0.8 μM peptide at 37°C for 15 minutes. It should be noted that "Z-" preceding a peptide or other similar construct is used herein to denote a carboxybenzyl group unless otherwise indicated.

The S. mobaraensis MTG reaction on the peptide array was performed at 37°C for 15 minutes in 100 mM Tris-HCl (pH 8), 1 mM DTT containing 10 μM N-(biotinyl)cadaverine and 0.1 ng μl⁻¹ S. mobaraensis MTG.

As shown in FIG. 4A, the best 22 glutamine-donor substrates identified with KalbTG on the 5-mer peptide array are (in no particular order) FRQRG (SEQ ID NO: 18), YRYRQ (SEQ ID NO: 1), QRQRQ (SEQ ID NO: 19), FRQRQ (SEQ ID NO: 16), RYRQR (SEQ ID NO: 14), RQRQR (SEQ ID NO: 17), YRQSR (SEQ ID NO: 32), YKYRQ (SEQ ID NO: 20), LRYRQ (SEQ ID NO: 33), YRQRA (SEQ ID NO: 34), VRYRQ (SEQ ID NO: 35), QRQTR (SEQ ID NO: 36), YRQTR (SEQ ID NO: 37), PRYRQ (SEQ ID NO: 38), RFSQR (SEQ ID NO: 39), WQRQR (SEQ ID NO: 40), QYRQR (SEQ ID NO: 21), VRQRQ (SEQ ID NO: 41), RYTQR (SEQ ID NO: 42), AYRQR (SEQ ID NO: 43), YQRQR (SEQ ID NO: 44), and RYSQR (SEQ ID NO: 15).

As shown in FIG. 4B, the best 17 lysine-donor substrates identified with KalbTG on the 5-mer peptide array are (in no particular order) NYRFK (SEQ ID NO: 45), YQKWK (SEQ ID NO: 46), YKYKY (SEQ ID NO: 47), RWKFK (SEQ ID NO: 48), RFYSK (SEQ ID NO: 49), YKYAK (SEQ ID NO: 50), YRYAK (SEQ ID NO: 51), RYSYK (SEQ ID NO: 52), YKSFK (SEQ ID NO: 53), YKSWK (SEQ ID NO: 54), KYRYK (SEQ ID NO: 55), YKYNK (SEQ ID NO: 56), RYSKY (SEQ ID NO: 25), RYESK (SEQ ID NO: 2), PYKYK (SEQ ID NO: 57), FYKYK (SEQ ID NO: 58), and FYESK (SEQ ID NO: 59).

The 16 glutamine-donor substrates identified with MTG and shown in FIGS. 5B and 5C are (in no particular order) EWVAQ (SEQ ID NO: 60), EWALQ (SEQ ID NO: 61), DYFLQ (SEQ ID NO: 62), DYALQ (SEQ ID NO: 22), EYWLQ (SEQ ID NO: 63), DWALQ (SEQ ID NO: 64), DWYLQ (SEQ ID NO: 65), DYWLQ (SEQ ID NO: 66), EYVAQ (SEQ ID NO: 67), DYVAQ (SEQ ID NO: 68), DWVAQ (SEQ ID NO: 69), EYVLQ (SEQ ID NO: 70), EWIAQ (SEQ ID NO: 71), WYALQ (SEQ ID NO: 72), EYALQ (SEQ ID NO: 73), and EYFLQ (SEQ ID NO: 74).

Example 4: Specificity of KalbTG for matured glutamine substrates and application in semi-orthogonal conjugation

Because a peptide array can deliver a readout for all possible 5-mer peptides at once, a single data set per enzyme is sufficient to evaluate how the enzymes differ in substrate specificity. The best KalbTG glutamine substrates (FIG. 4A) were found in the middle of the signal distribution on the array processed with MTG (FIG. 5A). By contrast, the best-performing S. mobaraensis MTG glutamine-containing substrates (FIG. 5B) exhibited relatively low signal on the KalbTG array (FIG. 5C). To confirm that the two transglutaminases have orthogonal preferences for glutamine-containing substrates and to quantify the amount of cross-reactivity, the kinetics of both enzymes were determined in the presence of different concentrations of the substrate peptides Z-GGGYRYRQGGGG and Z-GGGDYALQGGGG (FIG. 5D). Notably, the Z-conjugated substrates include the peptide sequences GGGYRYRQGGGG (SEQ ID NO: 75) and GGGDYALQGGGG (SEQ ID NO: 76). S. mobaraensis MTG exhibited similar K_M values, in the range of 0.6-0.9 mM, for both substrates, whereas the turnover k_cat was significantly higher for the Z-GGGDYALQGGGG substrate, which contains the preferred DYALQ (SEQ ID NO: 22) sequence (1.39 s⁻¹ versus 0.93 s⁻¹ for YRYRQ (SEQ ID NO: 1)), resulting in catalytic efficiencies (k_cat/K_M) of 1.64 x 10³ M⁻¹ s⁻¹ and 1.44 x 10³ M⁻¹ s⁻¹, respectively (Table 3). Relative to the engineered S. mobaraensis MTG enzyme, KalbTG appears to bind substrate less efficiently (K_M of 2 mM) but to have a higher turnover (k_cat of 1.92 s⁻¹), resulting in a k_cat/K_M of 0.89 x 10³ M⁻¹ s⁻¹. KalbTG appears to be completely unreactive toward the S. mobaraensis MTG substrate Z-GGGDYALQGGGG, so no kinetic parameters could be determined, as indicated by 'n.d.' in Table 3.
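The catalytic efficiencies quoted above are the ratio k_cat/K_M, so the K_M implied by each reported pair of values can be back-calculated as a consistency check. The sketch below does this with the numbers given in the text; the function names are illustrative, and the unrounded K_M values it prints are derived here rather than taken from Table 3.

```python
def catalytic_efficiency(kcat_per_s: float, km_mM: float) -> float:
    """k_cat / K_M in M^-1 s^-1, with K_M supplied in mM."""
    return kcat_per_s / (km_mM * 1e-3)


def implied_km_mM(kcat_per_s: float, efficiency_per_M_per_s: float) -> float:
    """K_M (mM) implied by a reported k_cat and k_cat/K_M."""
    return kcat_per_s / efficiency_per_M_per_s * 1e3


# (enzyme, substrate core, k_cat in s^-1, k_cat/K_M in M^-1 s^-1) as quoted in the text.
REPORTED = [
    ("S. mobaraensis MTG", "DYALQ", 1.39, 1.64e3),
    ("S. mobaraensis MTG", "YRYRQ", 0.93, 1.44e3),
    ("KalbTG", "YRYRQ", 1.92, 0.89e3),
]

if __name__ == "__main__":
    for enzyme, core, kcat, efficiency in REPORTED:
        km = implied_km_mM(kcat, efficiency)
        print(f"{enzyme:20s} {core}: implied K_M = {km:.2f} mM")
```

Both implied MTG values fall inside the stated 0.6-0.9 mM range, and the implied KalbTG value of roughly 2.2 mM agrees with the rounded K_M of 2 mM.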

Next, the array and in-solution data were applied to perform site-specific labeling of a protein substrate. The molecular chaperone SlyD is a useful scaffold for labeling schemes because epitope-containing loops can be grafted onto the FKBP domain for presentation to binders or enzymes (PCT Application Publication No. WO 2012/150321A1 by Andres et al.). A chimeric protein consisting of the Thermus thermophilus FKBP domain and the KalbTG recognition sequence RYRQR (SEQ ID NO: 14) was produced. Labeling with a 10-fold excess of the KalbTG K-tag-Cy3 label and a substrate:enzyme ratio of 72:1 gave an approximately 70% yield of labeled protein species after 15 minutes (FIG. 6A). This yield remained constant over a 60-minute time course. Incubation with a 50-fold excess of label increased the yield of labeled material only slightly. A molecular weight shift from 13 kDa to 19 kDa was observed on the SDS-PAGE gel, which corresponds exactly to the incorporation of a single 6 kDa label molecule. An identically constructed FKBP domain containing the S. mobaraensis MTG sequence DYALQ (SEQ ID NO: 22) instead of RYRQR (SEQ ID NO: 14) showed no label incorporation when incubated with KalbTG (FIG. 6A), indicating that the reaction is restricted to the site of the KalbTG recognition motif and that none of the five other glutamines inherent to the FKBP domain is recognized. Furthermore, we determined the pH dependence of the labeling reaction at pH 6.2, 6.8, 7.4, 8.0, 8.5, and 9 (FIG. 6B). The highest labeling efficiency after 15 minutes was found at pH 7.4, with activity trailing off at pH 8.5 and above. These findings correspond well with the published pH preferences of S. mobaraensis MTG.

Turning to FIG. 6C, the high sequence specificity of KalbTG was used to conjugate a 6 kDa Cy3 label to the YRYRQ (SEQ ID NO: 1) site of a 7 kDa substrate peptide containing the glutamine-containing motifs of both KalbTG and S. mobaraensis MTG. The reaction was allowed to proceed for 30 minutes to saturate the YRYRQ (SEQ ID NO: 1) site. SDS-PAGE analysis confirmed that the label was incorporated at a single site. The substrate peptide was then incubated with S. mobaraensis MTG and a 6 kDa Cy5 label for 15 minutes. This resulted in the formation of a site-specifically dual-labeled conjugate, with all singly labeled species visibly converted to the dual-labeled species. These results demonstrate that KalbTG and S. mobaraensis MTG constitute a semi-orthogonal labeling system with unprecedented ease of use, yield, and efficiency. KalbTG can therefore be used for the industrial-scale synthesis of complex protein conjugates of interest in therapeutic or diagnostic applications.

Regarding the GLDH-coupled assay: to determine whether the KalbTG peptides selected in the array assay are also preferred substrates in solution, and to quantify the cross-reactivity of KalbTG and S. mobaraensis MTG with different substrates, a continuous glutamate dehydrogenase (GLDH)-coupled assay for S. mobaraensis MTG activity was applied (see Oteng-Pabi, et al., 2013. Analytical Biochemistry 441(2): 169-173).

For glutamine substrate evaluation, assays were performed in 200 mM MOPS, 1 mM EDTA, pH 7.2, in clear 96-well microtiter plates (total volume 200 µl per well) in the presence of 500 µM α-ketoglutarate, 500 µM or 1 mM cadaverine (as the amine donor in place of a lysine-containing peptide), 2 U ml⁻¹ glutamate dehydrogenase (GLDH), 500 µM NADH, and a glutamine-containing substrate peptide at concentrations between 0 and 1 mM (Z-GGGQRWRQGGGG, Z-GGGWRYRQGGGG, Z-GGGYRYRQGGGG, Z-GGGRYRQRGGGG, Z-GGGRYSQRGGGG, Z-GGGFRQRQGGGG, Z-GGGRQRQRGGGG, Z-GGGFRQRGGGGG, Z-GGGQRQRQGGGG, Z-GGGYKYRQGGGG, Z-GGGQYRQRGGGG, Z-GGGDYALQGGGG, or Z-GGGMLAQGSGGG). Notably, the Z-conjugated peptides include the sequences GGGQRWRQGGGG (SEQ ID NO:77), GGGWRYRQGGGG (SEQ ID NO:78), GGGYRYRQGGGG (SEQ ID NO:75), GGGRYRQRGGGG (SEQ ID NO:79), GGGRYSQRGGGG (SEQ ID NO:80), GGGFRQRQGGGG (SEQ ID NO:81), GGGRQRQRGGGG (SEQ ID NO:82), GGGFRQRGGGGG (SEQ ID NO:83), GGGQRQRQGGGG (SEQ ID NO:84), GGGYKYRQGGGG (SEQ ID NO:85), GGGQYRQRGGGG (SEQ ID NO:86), GGGDYALQGGGG (SEQ ID NO:76), and GGGMLAQGSGGG (SEQ ID NO:87).

For amine substrate evaluation, the assay conditions were the same as for glutamine substrate evaluation, except that 100 µM of each amine substrate (Z-GGGRYSKYGGGG, Z-GGGAYRTKGGGG, Z-GGGRYRSKGGGG, Z-GGGYKGRGGGGG, Z-GGGRYGKSGGGG, Z-GGGRYESKGGGG, Z-GGGPGRYKGGGG, Z-GGGARSKLGGGG, or cadaverine) and 200 µM of a glutamine donor (Z-GGGYRYRQGGGG or Z-GGGDYALQGGGG) were used. Notably, the amine substrates include the sequences GGGRYSKYGGGG (SEQ ID NO:88), GGGAYRTKGGGG (SEQ ID NO:89), GGGRYRSKGGGG (SEQ ID NO:90), GGGYKGRGGGGG (SEQ ID NO:91), GGGRYGKSGGGG (SEQ ID NO:92), GGGRYESKGGGG (SEQ ID NO:93), GGGPGRYKGGGG (SEQ ID NO:94), and GGGARSKLGGGG (SEQ ID NO:95).

Reactions were started by adding 5 µg ml⁻¹ of S. mobaraensis MTG or KalbTG, and the oxidation of NADH was recorded continuously at 340 nm for 60 minutes using a Biotek Synergy H4 microplate reader with the temperature controlled at 37°C, with a brief shaking interval before each measurement. After a short lag phase, during which GLDH became saturated with the ammonia released by the transglutaminase, a linear rate of absorbance versus time corresponding to transglutaminase turnover was observed and subjected to Michaelis-Menten kinetic analysis. Absorbance rates in milli-optical-density units per minute (mOD min⁻¹) were converted to molar rates of NADH turnover (pmol s⁻¹) using Equation 1 (previously determined from an NADH standard curve):

(Equation 1)
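The rate conversion and kinetic analysis described for this assay can be sketched in Python. This is an illustration only and is not Equation 1: the calibration slope `mod_per_pmol` stands in for the coefficient obtained from the NADH standard curve, whose value is not given in this text, and the Hanes-Woolf linearization is a simple stand-in for whichever fitting procedure was actually used. All function names are hypothetical.

```python
def mod_per_min_to_pmol_per_s(rate_mod_per_min, mod_per_pmol):
    """Convert an absorbance slope (mOD/min) into NADH turnover (pmol/s).

    mod_per_pmol is an assumed calibration slope (mOD per pmol NADH in the well)
    taken from a standard curve; it is a placeholder, not a value from the text.
    NADH oxidation lowers A340, so the magnitude of the slope is used.
    """
    if mod_per_pmol <= 0:
        raise ValueError("calibration slope must be positive")
    return abs(rate_mod_per_min) / mod_per_pmol / 60.0


def michaelis_menten(s, vmax, km):
    """Michaelis-Menten rate law: v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


def hanes_woolf_fit(conc, rates):
    """Estimate (Vmax, Km) from a least-squares line through ([S], [S]/v).

    The Hanes-Woolf form is [S]/v = [S]/Vmax + Km/Vmax, so the slope is
    1/Vmax and the intercept is Km/Vmax.
    """
    pts = [(s, s / v) for s, v in zip(conc, rates) if s > 0 and v > 0]
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two points with non-zero [S] and v")
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


# Synthetic example: substrate concentrations in µM, made-up kinetic constants.
concentrations = [50.0, 100.0, 250.0, 500.0, 1000.0]
synthetic_rates = [michaelis_menten(s, vmax=10.0, km=200.0) for s in concentrations]
fitted_vmax, fitted_km = hanes_woolf_fit(concentrations, synthetic_rates)
print(f"Vmax = {fitted_vmax:.2f} pmol/s, Km = {fitted_km:.1f} µM")
```

With real plate-reader data, the slope would be taken only from the linear phase after the initial lag, and a nonlinear fit would usually be preferred over a linearization because it weights measurement error more evenly.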

For the labeling assay, the chaperone protein SlyD from Thermus thermophilus (Universal Protein Resource (UniProt) Number Q5SLE7) was used as the labeling scaffold for KalbTG. The SlyD sequence is:

The KalbTG glutamine donor sequence (Q-tag) was recombinantly grafted onto the FKBP domain of SlyD, resulting in the following polypeptide sequence:

The 8X-histidine-tagged protein was produced in E. coli BL21 Tuner and purified by standard nickel-Sepharose-based immobilized metal ion affinity chromatography and size exclusion chromatography (HisTrap, Superdex 200; GE Healthcare).

The labeling peptides were chemically synthesized to have (in order from N-terminus to C-terminus) a "Z-" group (i.e., carboxybenzyl), a transglutaminase lysine donor sequence (K-tag), 8-amino-3,6-dioxaoctanoic acid (O2Oc), a peptide, and a Cy3 or Cy5 fluorescent dye. The basic chemical structures of the labeling peptides are:

KalbTG K-tag-Cy3:

MTG K-tag-Cy5:

where E is glutamate, U is β-alanine, and C(sCy3-MH) and C(Cy5-MH) denote a C-terminal cysteine modified after synthesis with sulfo-Cy3 maleimide or Cy5 maleimide, respectively.

For the orthogonal labeling experiments, a molecule containing both the KalbTG and MTG Q-tags was chemically synthesized with the basic chemical structure:


All peptides were synthesized on a 0.25 mmol scale from commercially available building blocks by standard fluorenylmethoxycarbonyl (Fmoc)-based solid phase peptide synthesis. After solid phase synthesis, the peptides were cleaved with a solution of 95% TFA, 2.5% triisopropylsilane, and 2.5% water. The peptides were then precipitated with diisopropyl ether and purified by reversed-phase C18 high performance liquid chromatography (RP18-HPLC) using a water/TFA-acetonitrile gradient. Dye labeling was achieved by reacting the peptides with sulfo-Cy3 maleimide (Lumiprobe) or Cy5 maleimide (GE Healthcare), respectively. The dye-labeled peptides were purified by RP18-HPLC using a water/TFA-acetonitrile gradient. Peptide identity was confirmed by liquid chromatography-mass spectrometry (LC-MS) (Thermo Scientific RSLC-MSQplus system) using a Kinetex C18 2.6 µm, 50 x 3 mm column (Phenomenex).

Unless otherwise indicated, labeling reactions were performed for 15 minutes at 37°C in 200 mM MOPS, pH 7.2, and 1 mM EDTA in the presence of 72 µM substrate protein, 720 µM labeling peptide, and 1 µM transglutaminase. For the pH-dependent labeling profile, experiments were performed in 200 mM MOPS (for each of pH 6.2, 6.8, and 7.4) or in 200 mM Tris (for each of pH 8.0, 8.5, and 9.0). For the orthogonal labeling experiment, 1.5 µM EcSlyD2-Xa-KalbTGt3-8xHis was added to a 20 µl volume containing 100 µM substrate peptide and 1 mM KalbTG K-tag-Cy3. After incubation at 37°C for 30 minutes, 1 mM MTG K-tag-Cy5 and 1.5 µM S. mobaraensis MTG were added, and the mixture was incubated at 37°C for an additional 15 minutes. The reaction was stopped by adding 50 mM TCA. Samples were taken between incubation steps and analyzed by in-gel fluorescence after SDS-PAGE (BioRad ChemiDoc gel documentation system, Cy3 and Cy5 LED and filter sets). In another orthogonal labeling experiment, 4.65 µM EcSlyD2-Xa-KalbTGt3-8xHis was added to a 50 µl volume containing 302 µM substrate peptide and 3.02 mM KalbTG K-tag-Cy3 (i.e., the Cy3-labeled lysine substrate for KalbTG). After incubation at 37°C for 30 minutes, the mixture was diluted by adding 400 µl of buffer and then concentrated to 50 µl (10 kDa molecular weight cutoff spin filter). Next, 3.02 mM MTG K-tag-Cy5 (i.e., the Cy5-labeled lysine substrate for S. mobaraensis MTG) and 4.65 µM S. mobaraensis MTG were added, and the mixture was incubated at 37°C for an additional 15 minutes. The reaction was stopped by adding 50 µM (NH₄)₂SO₄. Then, 2 µg of Factor Xa was added, and the mixture was incubated at room temperature for 2 hours. The reaction was stopped by adding 50 mM TCA. The mixture was filtered (0.2 µm spin filter) and analyzed by LC-MS with UV detection at 214 nm and 305 nm.

A further experiment was performed to determine the activity of KalbTG enzymes prepared from two different constructs and stored at two different temperatures. The KalbTGt3 (SEQ ID NO: 10) and KalbTGt1 (SEQ ID NO: 12) enzymes tested were obtained from the EcSlyD2-Xa-KalbTGt3-8xHis-tag construct (SEQ ID NO: 8) and the EcSlyD2-Xa-KalbTGt1-8xHis-tag construct (SEQ ID NO: 11) described in Example 3, respectively. The activity of both KalbTG enzymes was tested after storage at 4°C or -20°C and compared with the activity of commercially available S. mobaraensis MTG according to a published protocol (Oteng-Pabi, et al., 2013. Analytical Biochemistry 441(2): 169-173). Specifically, the assay was performed at 37°C in a total volume of 200 µl. The assay mixture consisted of 200 mM MOPS, 1 mM EDTA, and 1 mM DTT at pH 7.4. In addition, α-ketoglutarate and NADH were each added at a concentration of 500 µM, together with 2 U/ml GLDH and between 0.1 µg and 1.0 µg of one of the transglutaminases. The glutamine substrate was 200 µM Z-GGGYRYRQGGGG-OH, and the amine donor was 10 mM cadaverine, where Z represents a carboxybenzyl group. All data were collected in triplicate (average of three wells in a microtiter plate), and the baseline activity (buffer without enzyme) was subtracted. The resulting data are shown in Table 4.

Example 5: Identification and characterization of the 3D crystal structure of KalbTG

As shown in Figures 7A-7C, alignment of the complete structure of the MTG from Kutzneria albida with that of the MTG from Streptomyces mobaraensis provides insight into the properties of KalbTG. In one aspect, the first 19 N-terminal amino acids and the artificial C-terminal GGGSG-8X-His tag are disordered and therefore not part of the structure. The overall structure of KalbTG resembles the previously described S. mobaraensis MTG structure (Kashiwagi, et al. 2002. J Biol Chem 277(46): 44252-44260), adopting a disc-like shape of the α+β fold class in which two protrusions made up of several loops form the active site cleft. However, the structures differ in two α-helices (α₄ and α₅ in Kashiwagi's numbering), which are absent from the KalbTG structure, and in two small β-strands (β₂ and β₄), which contain less hydrophobic residues in KalbTG (SF instead of AF and QV instead of LV, respectively), bringing the total number of elements in KalbTG to only nine α-helices and six β-strands. The catalytic triad of S. mobaraensis MTG (Cys64, Asp255, His274) is structurally conserved in KalbTG (Cys82, Asp211, His224, numbered from the start of the KalbTG open reading frame). However, the sulfhydryl side chain of KalbTG Cys82 is buried 2.6 Å deeper in the active cleft than its S. mobaraensis MTG counterpart. The crystal structure of the S. mobaraensis MTG zymogen (Yang, et al. 2011. J Biol Chem 286(9): 7301-7307) shows that the active cleft is tightly occupied by the L-shaped propeptide. The binding groove of KalbTG can be aligned with the propeptide of S. mobaraensis MTG without steric hindrance, indicating that a similar zymogen mechanism may exist in KalbTG (Figure 7B). Surprisingly, one of the loops forming the active cleft near Cys82 has the amino acid sequence YRYRAR (SEQ ID NO: 4), which, apart from the glutamine side chain, is identical to the preferred KalbTG substrates found on the peptide array (i.e., the two best 5-mer peptides are YRYRQ (SEQ ID NO: 1) and RYRQR (SEQ ID NO: 14); Figure 7C). The crystal structure analysis therefore serves as an independent validation of the reliability of the peptide array for identifying substrate sequences.

For crystallization and structural characterization, KalbTG in PBS was crystallized at 22°C using the sitting drop (200 nL) vapor diffusion method as follows: 8 mg/mL protein was mixed 1:1 with an unbuffered reservoir solution consisting of 0.2 M ammonium tartrate and 20% PEG3350. Crystals were cryoprotected in reservoir solution containing 20% ethylene glycol and then flash-cooled in liquid nitrogen. Data were collected at 100 K using a Pilatus 6M detector at SLS beamline PX-II, and were integrated and scaled in space group P3 using XDS (PMID 20124692). The l = 3n reflections had an I/σ > 9, making the presence of a screw axis unlikely. Self-Patterson and pairwise analyses revealed no suspicious data pathologies. The unit cell volume is consistent with two or three KalbTG molecules in the asymmetric unit, with Matthews coefficients of 3.5 Å³/Da and 2.3 Å³/Da, respectively. Data collection statistics are summarized in Table 5.
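The two Matthews coefficients quoted above can be cross-checked with a short Python sketch, since the coefficient is the unit cell volume divided by the total protein mass in the cell and therefore scales inversely with the number of molecules per asymmetric unit. The function names are hypothetical, the cell volume and molecular mass in the usage lines are made-up placeholders (the real values belong to Table 5), and the solvent estimate uses the common approximation of 1 - 1.23/V_M rather than anything stated in this text.

```python
SOLVENT_CONSTANT = 1.23  # Å^3/Da, common approximation for a typical protein


def matthews_coefficient(cell_volume_a3, n_symops, n_mol_asu, mass_da):
    """V_M = V_cell / (symmetry operators * molecules per ASU * mass in Da)."""
    copies = n_symops * n_mol_asu
    if copies <= 0 or mass_da <= 0:
        raise ValueError("copy number and mass must be positive")
    return cell_volume_a3 / (copies * mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction of the crystal for a given V_M."""
    return 1.0 - SOLVENT_CONSTANT / vm


# Rescale the quoted two-molecule value to three molecules per asymmetric unit.
vm_two = 3.5
vm_three = vm_two * 2 / 3
for n_mol, vm in ((2, vm_two), (3, vm_three)):
    print(f"{n_mol} molecules/ASU: V_M = {vm:.1f} Å^3/Da, "
          f"solvent fraction ~ {solvent_fraction(vm):.2f}")

# Placeholder inputs only: space group P3 has three symmetry operators.
example_vm = matthews_coefficient(600000.0, n_symops=3, n_mol_asu=2, mass_da=25000.0)
```

The rescaled three-molecule value rounds to the 2.3 Å³/Da quoted in the text, so the two figures are mutually consistent.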

The structure of KalbTG (226 residues) was determined by molecular replacement using S. mobaraensis transglutaminase (354 residues, RCSB Protein Data Bank (PDB) ID No. 3iu0) as the search model. Initial attempts using the complete S. mobaraensis TG were generally unsuccessful, without being bound by theory, because the enzymes differ greatly in size. The two transglutaminases share 28.2% sequence identity and 38.9% sequence similarity over the entire length of KalbTG. A variant of S. mobaraensis TG lacking the loop regions and trimmed to the hydrophobic core produced a potential solution with the Phaser crystallographic software (McCoy et al., 2007. J Appl Crystallogr. Aug 1; 40(Pt 4): 658-674) when searching for two molecules in the asymmetric unit in space group P3, with a log-likelihood gain (LLG) of 213. The trigonal space groups P3₁ and P3₂ did not produce a solution, consistent with the high intensity of the l = 3n reflections. The model was refined to an R_free of 46% with the BUSTER crystallographic software (Blanc et al., 2004. Acta Cryst. D60, 2210-2221). Some secondary structure elements were visible in the electron density map and were included in the model, which was then subjected to 10 rounds of automatic model building and refinement in CBUCCANEER and REFMAC5 (Winn et al., 2011. Acta Cryst. D67, 235-242). The resulting model contained all protein residues and had an R_free of 30%. The structure was completed in COOT (Emsley et al., 2010. Acta Cryst. D66, 486-501) and refined with PHENIX (Adams et al., 2010. Acta Cryst. D66, 213-221). Model refinement statistics are summarized in Table 5.

In Table 5, statistics for the highest-resolution shell are shown in parentheses.

Differential scanning calorimetry (DSC) measurements were performed on a VP-Capillary DSC instrument (MicroCal/GE Healthcare) over a temperature range of 20°C to 90°C at a scan rate of 90°C h⁻¹, using PBS as the reference.

Example 6: Analysis of the features of the identified KalbTG peptide substrates

With reference to Tables 6 and 7, analysis of the KalbTG peptide substrates identified herein revealed a set of features shared by those substrates.

Turning first to Table 6, the 22 5-mer peptide sequences identified as acyl-donor substrates for KalbTG are listed using the single-letter amino acid code, together with the amino acid positions (numbered from N-terminus to C-terminus) and the amino acid counts for individual amino acids (R and Q) and amino acid groups (R+Q, F+W+Y, and R+Q+F+W+Y). In one aspect, the data reveal that acyl-donor substrates for KalbTG (including 5-mer amino acid sequences having the formula Xaa₁-Xaa₂-Xaa₃-Xaa₄-Xaa₅, where Xaa is any amino acid) generally follow several design principles. First, each 5-mer sequence includes at least one glutamine (Q). More specifically, at least one of the third, fourth, and fifth positions (i.e., Xaa₃, Xaa₄, and Xaa₅) of the 5-mer sequence is a glutamine. Notably, sequences with two or more adjacent glutamines were generally not observed. However, several sequences with a glutamine at each of the third and fifth positions were observed. In another aspect, each 5-mer sequence was observed to include at least one arginine (R). More specifically, at least one of the fourth and fifth positions (i.e., Xaa₄ and Xaa₅) of the 5-mer sequence is an arginine. For example, in each of the 22 sequences analyzed, an arginine was found at either the fourth or the fifth position, but not at both. Furthermore, when the fifth position is an arginine, at least one additional arginine is located at one of the first, second, and third positions (i.e., Xaa₁, Xaa₂, and Xaa₃) of the 5-mer sequence.

In another aspect, the 5-mer sequences each include at least one arginine directly adjacent to a glutamine. For example, the 5-mer sequence FRQRG (SEQ ID NO: 8) includes a glutamine at the third position flanked by arginines at both the second and fourth positions, while the 5-mer sequence YRYRQ (SEQ ID NO: 1) includes an arginine at the fourth position immediately followed by a glutamine at the fifth position. In addition to the presence of at least one arginine and at least one glutamine in each 5-mer sequence, amino acids with aromatic side chains (i.e., phenylalanine, tryptophan, and tyrosine) were found in many of the 5-mer sequences. Specifically, the total number of positions occupied by arginine, glutamine, phenylalanine, tryptophan, or tyrosine in each 5-mer sequence was observed to be at least 4 (see the last column of Table 6, the amino acid count for R+Q+F+W+Y). For example, the 5-mer sequence FRQRG (SEQ ID NO: 8) includes one glutamine, two arginines, and one phenylalanine, for a total of four positions occupied by amino acids selected from arginine, glutamine, phenylalanine, tryptophan, and tyrosine. In another example, the 5-mer sequence YRYRQ (SEQ ID NO: 1) includes two arginines, two tyrosines, and one glutamine, for a total of five positions occupied by amino acids selected from arginine, glutamine, phenylalanine, tryptophan, and tyrosine. Notably, 5-mer sequences lacking aromatic amino acids include a total of at least four amino acids selected from arginine and glutamine (e.g., QRQRQ (SEQ ID NO: 19), QRQTR (SEQ ID NO: 36)).
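The acyl-donor observations summarized above can be expressed as a small rule checker. The following Python sketch is an illustration under stated assumptions: the function names are hypothetical, the rules are a literal reading of the observations reported for the 22 sequences of Table 6, and passing them does not imply that a sequence is actually a KalbTG substrate.

```python
def acyl_donor_rule_report(seq):
    """Return a dict mapping each observed Table 6 pattern to True/False for a 5-mer."""
    seq = seq.upper()
    if len(seq) != 5:
        raise ValueError("expected a 5-mer sequence")
    r_at_4 = seq[3] == "R"
    r_at_5 = seq[4] == "R"
    return {
        "Q at position 3, 4 or 5": "Q" in seq[2:],
        "no adjacent Q-Q": "QQ" not in seq,
        "R at position 4 or 5, not both": r_at_4 != r_at_5,
        "extra R in positions 1-3 when position 5 is R": (not r_at_5) or "R" in seq[:3],
        "R directly next to Q": "RQ" in seq or "QR" in seq,
        "R+Q+F+W+Y count >= 4": sum(aa in "RQFWY" for aa in seq) >= 4,
    }


def matches_acyl_donor_rules(seq):
    """True if the 5-mer satisfies every pattern in the report."""
    return all(acyl_donor_rule_report(seq).values())


# Sequences quoted in the text, plus the S. mobaraensis MTG motif DYALQ for contrast.
for candidate in ("YRYRQ", "RYRQR", "FRQRG", "QRQRQ", "QRQTR", "DYALQ"):
    print(candidate, matches_acyl_donor_rules(candidate))
```

Returning the per-rule report rather than a single boolean makes it easy to see which observation a candidate sequence violates.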

With reference to Table 7, the 22 5-mer peptide sequences identified as amine-donor substrates for KalbTG are listed using the single-letter amino acid code, together with the amino acid positions (numbered from N-terminus to C-terminus) and the amino acid counts for individual amino acids (K, Y, R, S) and the amino acid group (K+Y+R+S). As with the data obtained for the acyl-donor sequences in Table 6, amine-donor substrates for KalbTG (including 5-mer amino acid sequences having the formula Xaa₁-Xaa₂-Xaa₃-Xaa₄-Xaa₅, where Xaa is any amino acid) generally follow several design principles. First, each 5-mer sequence includes at least one lysine (K). More specifically, at least one of the fourth and fifth positions of each 5-mer sequence is a lysine, the sole exception being the 5-mer sequence YKGRG (SEQ ID NO: 29). Notably, sequences with two or more adjacent lysines were generally not observed. Although several 5-mer sequences with a total of two lysines were observed, no sequence with more than two lysines was found among the 5-mer sequences analyzed. In another aspect, most of the 5-mer sequences were observed to include at least one tyrosine (Y), the only two exceptions among the 22 sequences in Table 7 being RWKFK (SEQ ID NO: 48) and ARSKL (SEQ ID NO: 30).

After lysine and tyrosine, the next most common amino acids in the 5-mer sequences of Table 7 are arginine and serine, with at least one arginine or serine present in many of the 5-mer sequences. Specifically, the total number of positions occupied by lysine, tyrosine, arginine, or serine in each 5-mer sequence was observed to be at least 3 (see the last column of Table 7, the amino acid count for K+Y+R+S). For example, the 5-mer sequence NYRFK (SEQ ID NO: 45) includes a tyrosine, an arginine, and a lysine, for a total of 3 positions occupied by amino acids selected from lysine, tyrosine, arginine, and serine. In another example, the 5-mer sequence RYRSK (SEQ ID NO: 27) includes 2 arginines, 1 tyrosine, 1 serine, and 1 lysine, for a total of 5 positions occupied by amino acids selected from lysine, tyrosine, arginine, and serine.

Notably, all of the 5-mer sequences include a total of at least two amino acids selected from lysine, tyrosine, and either arginine or serine. For example, ARSKL (SEQ ID NO:30) includes one lysine, one arginine, one serine, and no tyrosine. Thus, the total number of amino acids selected from lysine, tyrosine, and arginine is two, and the total number of amino acids selected from lysine, tyrosine, and serine is also two. As discussed above, the total number of positions occupied by lysine, tyrosine, arginine, or serine in the 5-mer sequence ARSKL (SEQ ID NO:30) is at least three. In a related aspect, 5-mer sequences having one lysine and at least one of the amino acids tyrosine and arginine include a total of at least two amino acids selected from lysine, tyrosine, and arginine (e.g., ARSKL (SEQ ID NO:30), FYESK (SEQ ID NO:59)). In addition, each 5-mer sequence includes at least one tyrosine or arginine at one of positions 1 and 2 (i.e., Xaa₁ and Xaa₂).
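The amine-donor observations can be checked in the same way. This Python sketch is again an illustration under stated assumptions: the function names are hypothetical, only the unambiguous patterns reported for Table 7 are encoded, and the documented exception YKGRG is expected to fail the lysine-position rule.

```python
def amine_donor_rule_report(seq):
    """Return a dict mapping each observed Table 7 pattern to True/False for a 5-mer."""
    seq = seq.upper()
    if len(seq) != 5:
        raise ValueError("expected a 5-mer sequence")
    return {
        "K at position 4 or 5": "K" in seq[3:],
        "no adjacent K-K": "KK" not in seq,
        "at most two K": seq.count("K") <= 2,
        "K+Y+R+S count >= 3": sum(aa in "KYRS" for aa in seq) >= 3,
        "Y or R at position 1 or 2": any(aa in "YR" for aa in seq[:2]),
    }


def failed_amine_donor_rules(seq):
    """List the names of the patterns that the 5-mer does not satisfy."""
    return [name for name, ok in amine_donor_rule_report(seq).items() if not ok]


# Sequences quoted in the text; YKGRG is the stated exception to the position rule.
for candidate in ("RYESK", "ARSKL", "NYRFK", "RYRSK", "RWKFK", "FYESK", "YKGRG"):
    print(candidate, failed_amine_donor_rules(candidate) or "all patterns satisfied")
```

Listing the failed patterns keeps known exceptions visible instead of silently rejecting them.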

The schematic flow charts shown in the figures are generally set forth as logical flow chart diagrams. As such, the depicted order and labeled steps are indicative of one embodiment of the presented method. Other steps and methods may be conceived that are equivalent in function, logic, or effect to one or more steps, or portions thereof, of the illustrated method. Additionally, the format and symbols employed in the figures are provided to explain the logical steps of the method and are understood not to limit the scope of the method. Although various arrow types and line types may be employed, they are understood not to limit the scope of the corresponding method. Indeed, some arrows or other connectors may be used to indicate only the logical flow of the method. For example, an arrow may indicate a waiting or monitoring period of unspecified duration between enumerated steps of the depicted method. Additionally, the order in which a particular method occurs may or may not strictly adhere to the order of the corresponding steps shown.

The present invention is presented in several varying embodiments in the description with reference to the figures, in which like reference numerals represent the same or similar elements. Reference throughout this specification to "one embodiment," "an embodiment," or similar language means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases "in one embodiment," "in an embodiment," and similar language throughout this specification may, but do not necessarily, all refer to the same embodiment.

The described features, structures, or characteristics of the invention may be combined in any suitable manner in one or more embodiments. In the description, numerous specific details are recited to provide a thorough understanding of embodiments of the system. One skilled in the relevant art will recognize, however, that the system and method may be practiced without one or more of the specific details, or with other methods, components, materials, and so forth. In other instances, well-known structures, materials, or operations are not shown or described in detail to avoid obscuring aspects of the invention. Accordingly, the foregoing description is intended to be exemplary and does not limit the scope of the inventive concept.

Each reference cited in this application is incorporated herein by reference in its entirety.

Claims

1. A substrate tag for the microbial transglutaminase of *Kutzneria albida* as shown in SEQ ID NO:6, said substrate tag comprising one of an acyl-donor tag having the peptide sequence YRYRQ (SEQ ID NO:1) and an amine-donor tag having the peptide sequence RYESK (SEQ ID NO:2).

2. The substrate tag of claim 1, further comprising a detectable label.

3. The substrate tag of claim 2, wherein the detectable label is selected from biotin, radioactive labels, and chemiluminescent labels.

4. The substrate tag of claim 3, wherein the radioactive label is a ruthenium label.

5. The substrate tag of claim 3, wherein the chemiluminescent label is a fluorescent dye.

6. The substrate tag of claim 1, wherein the acyl-donor tag has the peptide sequence APRYRQRAA (SEQ ID NO: 24).

7. A method for forming an isopeptide bond in the presence of a *Kutzneria albida* microbial transglutaminase as shown in SEQ ID NO:6, the method comprising:

exposing the microbial transglutaminase to a first substrate and a second substrate, the first substrate comprising an acyl-donor tag having the peptide sequence YRYRQ (SEQ ID NO:1) and the second substrate comprising an amine-donor tag having the peptide sequence RYESK (SEQ ID NO:2); and

cross-linking the first substrate and the second substrate to form an isopeptide bond between the acyl-donor tag and the amine-donor tag.

8. The method of claim 7, wherein the step of cross-linking the first substrate and the second substrate forms an isopeptide bond between the γ-carboxamide group of the acyl-donor tag and the ε-amino group of the amine-donor tag.

9. The method of claim 7, wherein at least one of the first substrate and the second substrate comprises a detectable label.

10. The method of claim 9, wherein the detectable label is selected from a biotin label, a radioactive label, and a chemiluminescent label.

11. The method of claim 10, wherein the radioactive label is a ruthenium label.

12. The method of claim 10, wherein the chemiluminescent label is a fluorescent dye.

13. The method of claim 7, wherein the acyl-donor tag has the peptide sequence APRYRQRAA (SEQ ID NO:24).

14. The method of claim 7, wherein the crosslinking of the first substrate with the second substrate is achieved in a yield of at least 70%.

15. The method of claim 14, wherein the yield is achieved within approximately 30 minutes.

16. A kit for forming an isopeptide bond in the presence of a microbial transglutaminase, the kit comprising a purified *Kutzneria albida* microbial transglutaminase as shown in SEQ ID NO:6, the kit further comprising one of a first substrate and a second substrate, the first substrate comprising an acyl-donor tag having the peptide sequence YRYRQ (SEQ ID NO:1), and the second substrate comprising an amine-donor tag having the peptide sequence RYESK (SEQ ID NO:2).

17. The kit of claim 16, wherein at least one of the first substrate and the second substrate comprises a detectable label.

18. The kit of claim 17, wherein the detectable label is selected from a biotin label, a radioactive label, and a chemiluminescent label.

19. The kit of claim 18, wherein the radioactive label is a ruthenium label.

20. The kit of claim 18, wherein the chemiluminescent label is a fluorescent dye.

21. The kit of claim 16, wherein the acyl-donor tag has the peptide sequence APRYRQRAA (SEQ ID NO:24).

22. The kit of claim 16, wherein the kit further comprises another of the first substrate and the second substrate.

23. An enzyme product for forming isopeptide bonds, said enzyme product comprising an isolated *Kutzneria albida* microbial transglutaminase as shown in SEQ ID NO:6 and ammonium.

24. The enzyme product according to claim 23, wherein the ammonium is present at a concentration of at least about 10 μM.

25. An acyl-donor substrate for a transglutaminase, said acyl-donor substrate being selected from YRYRQ (SEQ ID NO:1), FRQRG (SEQ ID NO:8), RYRQR (SEQ ID NO:14), RYSQR (SEQ ID NO:15), FRQRQ (SEQ ID NO:16), RQRQR (SEQ ID NO:17), QRQRQ (SEQ ID NO:19), YKYRQ (SEQ ID NO:20), QYRQR (SEQ ID NO:21), YRQSR (SEQ ID NO:32), LRYRQ (SEQ ID NO:33), YRQRA (SEQ ID NO:34), VRYRQ (SEQ ID NO:35), QRQTR (SEQ ID NO:36), YRQTR (SEQ ID NO:37), PRYRQ (SEQ ID NO:38), RFSQR (SEQ ID NO:39), WQRQR (SEQ ID NO:40), VRQRQ (SEQ ID NO:41), RYTQR (SEQ ID NO:42), AYRQR (SEQ ID NO:43) or YQRQR (SEQ ID NO:44).

26. An amine-donor substrate for a transglutaminase, said amine-donor substrate being selected from RYESK (SEQ ID NO:2), RYSKY (SEQ ID NO:25), AYRTK (SEQ ID NO:26), RYRSK (SEQ ID NO:27), RYGKS (SEQ ID NO:28), YKGRG (SEQ ID NO:29), ARSKL (SEQ ID NO:30), NYRFK (SEQ ID NO:45), YQKWK (SEQ ID NO:46), YKYKY (SEQ ID NO:47), RWKFK (SEQ ID NO:48), RFYSK (SEQ ID NO:49), YKYAK (SEQ ID NO:50), YRYAK (SEQ ID NO:51), RYSYK (SEQ ID NO:52), YKSFK (SEQ ID NO:53), YKSWK (SEQ ID NO:54), KYRYK (SEQ ID NO:55), YKYNK (SEQ ID NO:56), PYKYK (SEQ ID NO:57), FYKYK (SEQ ID NO:58) or FYESK (SEQ ID NO:59).
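
By way of illustration only, and forming no part of the claims, the following minimal Python sketch shows how the acyl-donor tag YRYRQ (SEQ ID NO:1) and the amine-donor tag RYESK (SEQ ID NO:2) recited in claim 1 might be located within a polypeptide sequence written in one-letter amino acid code. The function name `find_tag_positions`, the example sequence, and the 0-based indexing convention are assumptions made for this sketch and do not appear in the specification.

```python
# Illustrative sketch only. The two tag sequences come from claim 1; the
# function name, the example sequence, and 0-based indexing are assumptions.
ACYL_DONOR_TAG = "YRYRQ"   # SEQ ID NO:1, contains a single glutamine (Q)
AMINE_DONOR_TAG = "RYESK"  # SEQ ID NO:2, contains a single lysine (K)


def find_tag_positions(sequence: str, tag: str) -> list[int]:
    """Return the 0-based start index of every occurrence of `tag` in `sequence`."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(tag)
    while start != -1:
        positions.append(start)
        # Advance by one residue so overlapping occurrences are also reported.
        start = sequence.find(tag, start + 1)
    return positions


if __name__ == "__main__":
    # Made-up sequence used purely to exercise the function.
    hypothetical_protein = "MGSYRYRQGSGSRYESKGS"
    print(find_tag_positions(hypothetical_protein, ACYL_DONOR_TAG))
    print(find_tag_positions(hypothetical_protein, AMINE_DONOR_TAG))
```

The sketch performs exact string matching only; it says nothing about whether a tag located this way would be accessible to the enzyme or would react in practice.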