HK1229003B - Pancreatic cancer biomarkers and uses thereof

Publication number: HK1229003B
Authority: HK (Hong Kong)
Application number: HK17102552.7A
Other languages: Chinese (zh)
Other versions: HK1229003A1 (en)
Prior art keywords: biomarker, individual, biomarkers, pancreatic cancer, cancer
Inventors: S.A.威廉森, M.里尔-米恩, R.M.奥斯特罗夫
Original assignee: 私募蛋白质体运营有限公司
Application filed by: 私募蛋白质体运营有限公司

Description

Pancreatic cancer biomarkers and uses thereof

This application is a divisional of Chinese patent application 201180049408.2, filed on August 12, 2011, entitled "Pancreatic Cancer Biomarkers and Uses Thereof".

Related Applications

This application claims the benefit of U.S. Provisional Application Serial No. 61/373,687, filed August 13, 2010; U.S. Provisional Application Serial No. 61/418,689, filed December 1, 2010; U.S. Provisional Application Serial No. 61/482,347, filed May 4, 2011; and U.S. Provisional Application Serial No. 61/482,480, filed May 4, 2011, each of which is incorporated herein by reference in its entirety.

Field of the Invention

The present application relates generally to the detection of biomarkers and the diagnosis of cancer in an individual and, more specifically, to one or more biomarkers, methods, devices, reagents, systems, and kits for diagnosing cancer, more particularly pancreatic cancer, in an individual.

Background Art

The following description provides a summary of information relevant to the present application and is not an admission that any of the information provided or publications cited herein is prior art to the present application.

Pancreatic cancer is the fourth leading cause of cancer-related death in the United States. The overall 5-year survival rate is only 5%, but it has been shown to increase with early surgical intervention: in the 20% of individuals who are eligible for "radical" resection, survival increases to 15-20%. At the time of diagnosis, more than half of patients have distant disease and another 25% have regional spread, because the disease is notoriously difficult to diagnose in its early stages. Approximately 20% of patients have "operable" disease (stage IIb or lower) and undergo "radical" resection, and their 5-year survival rate increases from less than 5% to 15-20%.

Pancreatic cancer can arise from both the exocrine and endocrine portions of the pancreas. Of pancreatic tumors, 95% develop from the exocrine portion of the pancreas, which includes the ductal epithelium, acinar cells, connective tissue, and lymphatic tissue. Approximately 75% of all pancreatic cancers occur within the head or neck of the pancreas, 15-20% occur in the body, and 5-10% occur in the tail.

Recurrence can be local (in or near the place where the cancer started) or distant (spread to organs such as the liver, lungs, or bone). When exocrine pancreatic cancer recurs, it is treated essentially the same way as metastatic cancer, and treatment may include chemotherapy if the patient can tolerate it. Typically, pancreatic cancer first metastasizes to regional lymph nodes, then to the liver, and, less commonly, to the lungs. It can also directly invade surrounding visceral organs such as the duodenum, stomach, and colon, or metastasize to any surface in the abdominal cavity via peritoneal spread. Ascites may result, and this carries an ominous prognosis. Pancreatic cancer may spread to the skin as painful nodular metastases. Pancreatic cancer rarely metastasizes to bone.

Two clinical applications of a blood-based pancreatic cancer test are preclinical diagnosis in an asymptomatic high-risk population and differential diagnosis in a symptomatic population. The clinical utility for these two indications is set out below.

Screening in an asymptomatic high-risk population: In 2010, there were an estimated 43,140 new cases of pancreatic cancer and 36,800 deaths in the United States. Heredity, family history, chronic pancreatitis, smoking, and heavy alcohol consumption increase the risk of pancreatic cancer, as does cystic fibrosis. The increased risks are reported as:

Smoking: <25/day is 2x risk, >25/day is 3x risk

Alcohol: more than 3 drinks/day yields a 1.6x increase in risk

Family history: a first-degree relative with the disease yields a 5x increase

Adults with cystic fibrosis: 31x risk

BRCA2 gene mutation: 10x risk

In the absence of an effective screening paradigm, cancer in the asymptomatic but at-risk population is simply detected when symptoms appear, which may be too late. The existence of an early detection test would increase the proportion of patients who are eligible for radical surgery. Currently, a 20% cure rate among the 20% of individuals detected early amounts to only 4% of the total population. If early detection in the asymptomatic population raised eligibility for radical surgery above the current 20%, the total number of patients who could be cured would increase, as would the number of lives saved each year. Because pancreatic cancer is a low-prevalence disease, even in this high-risk population, high specificity is an important attribute of a screening test. A low false-positive rate is critical for reducing the cost of unnecessary follow-up procedures and for reducing patient anxiety.
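By way of illustration only, the arithmetic in the preceding paragraph, and the reason specificity matters so much at low prevalence, can be sketched as follows. The 20% figures come from the text; the prevalence, sensitivity, and specificity values are hypothetical and are not taken from this application.

```python
def curable_fraction(eligible_fraction: float, cure_rate_if_eligible: float) -> float:
    """Share of all patients cured: share eligible for surgery times cure rate among them."""
    return eligible_fraction * cure_rate_if_eligible


def positive_predictive_value(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Bayes' rule: probability of disease given a positive test result."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    return true_positives / (true_positives + false_positives)


# Figures from the text: 20% detected early and eligible, 20% cured among them.
print(f"cured overall: {curable_fraction(0.20, 0.20):.2%}")  # cured overall: 4.00%

# Hypothetical screening scenario (assumed values): 1% prevalence, 90% sensitivity.
for specificity in (0.90, 0.99):
    ppv = positive_predictive_value(0.01, 0.90, specificity)
    print(f"specificity {specificity:.0%}: PPV {ppv:.1%}")
# specificity 90%: PPV 8.3%
# specificity 99%: PPV 47.6%
```

Under these assumed numbers, raising specificity from 90% to 99% moves the share of positive results that are true cancers from roughly one in twelve to roughly one in two, which is the sense in which a low false-positive rate limits unnecessary follow-up.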

Differential diagnosis in symptomatic patients: Pancreatic cancer can be difficult to distinguish from benign conditions such as pancreatitis or gastrointestinal disorders. The differential diagnosis of primary exocrine pancreatic cancer includes chronic pancreatitis, pancreatic endocrine tumors, autoimmune pancreatitis, lymphoma, and a variety of other rare conditions. Common but nonspecific symptoms associated with pancreatic cancer include:

Abdominal pain, especially when radiating to the back

Obstructive jaundice

Sudden unexplained diabetes

Weight loss

Anorexia, fatigue

Nausea, vomiting

Acute or chronic pancreatitis

The table below shows the number of patients presenting to emergency rooms and hospitals with at least two of these associated symptoms; the first symptom is any one of the symptoms listed above, and the second symptom is the one listed in the table. The emergency room data are from (http://hcupnet.ahrq.gov/), and the outpatient data are from the CDC 2008 National Ambulatory Medical Care Survey 2006 (number 8).

Sensitive detection of resectable disease is very important to the clinical utility of this indication. Prompt detection of pancreatic cancer increases the chance of diagnosing disease that is still curable. Pancreatic cancer is usually diagnosed radiographically, by finding a mass within the pancreas that often obstructs the pancreatic duct or biliary tract. However, imaging can be invasive and expensive. A blood test that determines which patients need follow-up, including diagnostic imaging, would benefit patients and streamline diagnosis.

Biomarker selection for a specific disease state involves first identifying markers that show a measurable and statistically significant difference in a disease population compared to a control population for a specific medical application. Biomarkers can include secreted or shed molecules that parallel disease development or progression and that readily diffuse into the bloodstream from pancreatic cancer tissue, or from surrounding tissues and circulating cells, in response to a tumor. The biomarker or set of biomarkers identified is generally clinically validated, or shown to be a reliable indicator for the original intended use for which it was selected. Biomarkers can include small molecules, peptides, proteins, and nucleic acids. Some of the key issues that affect the identification of biomarkers include over-fitting of the available data and bias in the data.

A variety of methods have been used in attempts to identify biomarkers and diagnose disease. For protein-based markers, these include two-dimensional electrophoresis, mass spectrometry, and immunoassay methods. For nucleic acid markers, these include mRNA expression profiling, microRNA profiling, FISH, serial analysis of gene expression (SAGE), and large-scale gene expression arrays.

The utility of two-dimensional electrophoresis is limited by low detection sensitivity; issues with protein solubility, charge, and hydrophobicity; gel reproducibility; and the possibility that a single spot represents multiple proteins. For mass spectrometry, depending on the format used, limitations revolve around sample processing and separation, sensitivity to low-abundance proteins, signal-to-noise considerations, and the inability to immediately identify the detected protein. The limitations of immunoassay approaches to biomarker discovery center on the inability of antibody-based multiplex assays to measure a large number of analytes. One could simply print an array of high-quality antibodies and, without sandwiches, measure the analytes bound to those antibodies. (This would be the formal equivalent of using a whole genome of nucleic acid sequences to measure, by hybridization, all DNA or RNA sequences in an organism or a cell. The hybridization experiment works because hybridization can be a stringent test of identity. Even very good antibodies are not stringent enough in selecting their binding partners to work in the context of blood or even cell extracts, because the protein ensemble in those matrices spans extremely different abundances.) Thus, a different immunoassay-based approach to biomarker discovery is required: multiplexed ELISA assays (that is, sandwiches) would be needed to obtain sufficient stringency to measure many analytes simultaneously and so decide which analytes are indeed biomarkers. Sandwich immunoassays do not scale to high content, and thus biomarker discovery using stringent sandwich immunoassays is not possible with standard array formats. Finally, antibody reagents are subject to substantial lot-to-lot variability and reagent instability. The protein biomarker discovery platform of the present invention overcomes this problem.

Many of these methods rely on or require some type of sample fractionation prior to analysis. Thus the sample preparation required to run a sufficiently powered study, designed to identify and discover statistically relevant biomarkers in a series of well-defined sample populations, is extremely difficult, costly, and time-consuming. During fractionation, a wide range of variability can be introduced into the various samples. For example, a potential marker could be unstable to the process, the concentration of the marker could change, inappropriate aggregation or disaggregation could occur, and inadvertent sample contamination could occur, thereby obscuring the subtle changes anticipated in early disease.

It is widely accepted that biomarker discovery and detection methods using these technologies have serious limitations for the identification of diagnostic biomarkers. These limitations include an inability to detect low-abundance biomarkers, an inability to consistently cover the entire dynamic range of the proteome, irreproducibility in sample processing and fractionation, and overall irreproducibility and lack of robustness of the method. Further, these studies have introduced biases into the data and have not adequately addressed the complexity of the sample populations, including appropriate controls, in terms of the distribution and randomization required to identify and validate biomarkers within a target disease population.

Although efforts aimed at the discovery of new and effective biomarkers have gone on for several decades, those efforts have been largely unsuccessful. Biomarkers for various diseases have typically been identified in academic laboratories, usually through an accidental discovery made during basic research on some disease process. Based on the discovery and a small amount of clinical data, papers were published suggesting that a new biomarker had been identified. Most of these proposed biomarkers, however, have not been confirmed as real or useful biomarkers, primarily because the small number of clinical samples tested provides only weak statistical proof that an effective biomarker has in fact been found. That is, the initial identification was not rigorous with respect to the basic elements of statistics. In each of the years 1994 through 2003, a search of the scientific literature shows that thousands of references directed to biomarkers were published. During that same period, however, the FDA approved for diagnostic use at most three new protein biomarkers a year, and in several of those years no new protein biomarkers were approved.
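The point about weak statistical evidence from small sample sets can be made concrete with a confidence interval. The following is a minimal sketch, not a method described in this application: it uses the standard Wilson score interval for a proportion, and the sample counts (18 of 20 versus 180 of 200 cases correctly detected) are hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives about 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)) / denom
    return centre - half_width, centre + half_width


# Same observed sensitivity (90%), very different certainty (hypothetical counts).
for detected, cases in ((18, 20), (180, 200)):
    low, high = wilson_interval(detected, cases)
    print(f"{detected}/{cases}: 95% interval {low:.2f} to {high:.2f}")
```

With twenty cases, the interval around an observed 90% sensitivity is several times wider than with two hundred, so a marker that looks strong in a small pilot set may perform far worse on validation.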

Based on the history of failed biomarker discovery efforts, mathematical theories have been proposed that further promote the general understanding that biomarkers for disease are rare and difficult to find. Biomarker research based on 2D gels or mass spectrometry supports these notions: very few useful biomarkers have been identified through these approaches. However, it is usually overlooked that 2D gels and mass spectrometry measure proteins that are present in blood at concentrations of approximately 1 nM and higher, and that this ensemble of proteins may well be the least likely to change with disease. Other than the biomarker discovery platform of the present invention, no proteomic biomarker discovery platform exists that can accurately measure protein expression levels at much lower concentrations.

Much is known about the biochemical pathways of complex human biology. Many biochemical pathways culminate in, or are started by, secreted proteins that act locally within the pathology; for example, growth factors are secreted to stimulate the replication of other cells in the pathology, and other factors are secreted to ward off the immune system. While many of these secreted proteins act in a paracrine fashion, some operate distally in the body. One skilled in the art with a basic understanding of biochemical pathways would understand that many pathology-specific proteins ought to exist in blood at concentrations below (even far below) the detection limits of 2D gels and mass spectrometry. What must precede the identification of this relatively abundant number of disease biomarkers is a proteomic platform that can analyze proteins at concentrations below those detectable by 2D gels or mass spectrometry.

Accordingly, a need exists for biomarkers, methods, devices, reagents, systems, and kits that enable (a) the differentiation of pancreatic cancer from benign conditions; (b) the screening of asymptomatic, high-risk individuals for pancreatic cancer; (c) the detection of pancreatic cancer biomarkers; and (d) the diagnosis of pancreatic cancer.

Summary of the Invention

The present application includes biomarkers, methods, reagents, devices, systems, and kits for the detection and diagnosis of cancer, more particularly pancreatic cancer. The biomarkers of the present application were identified using a multiplex aptamer-based assay, which is described in detail in Example 1. By using the aptamer-based biomarker identification method described herein, this application describes a surprisingly large number of pancreatic cancer biomarkers that are useful for the detection and diagnosis of pancreatic cancer, as well as a large number of cancer biomarkers that are useful for the detection and diagnosis of cancer more generally. In identifying these biomarkers, over 800 proteins from hundreds of individual samples were measured, some of which were at concentrations in the low femtomolar range. This is about four orders of magnitude lower than the concentrations reached by biomarker discovery experiments performed with 2D gels and/or mass spectrometry.
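The "four orders of magnitude" comparison can be checked with a short calculation. This sketch takes the approximately 1 nM floor for 2D gels and mass spectrometry from the background discussion; reading "low femtomolar" as 100 fM is an assumption made here for illustration, since the text does not give an exact figure.

```python
import math

NANOMOLAR = 1e-9    # mol/L
FEMTOMOLAR = 1e-15  # mol/L


def orders_of_magnitude_below(reference_molar: float, measured_molar: float) -> float:
    """How many powers of ten the measured concentration lies below the reference."""
    return math.log10(reference_molar / measured_molar)


gel_and_ms_floor = 1 * NANOMOLAR         # approximate floor stated in the text
assumed_low_femtomolar = 100 * FEMTOMOLAR  # assumption: "low femtomolar" read as 100 fM

gap = orders_of_magnitude_below(gel_and_ms_floor, assumed_low_femtomolar)
print(round(gap))  # 4
```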

While certain of the described pancreatic cancer biomarkers are useful alone for detecting and diagnosing pancreatic cancer, methods are described herein for grouping multiple subsets of the pancreatic cancer biomarkers that are useful as a panel of biomarkers. Once an individual biomarker or subset of biomarkers has been identified, the detection or diagnosis of pancreatic cancer in an individual can be accomplished using any assay platform or format that is capable of measuring differences in the levels of the selected biomarker or biomarkers in a biological sample.

However, it was only by using the aptamer-based biomarker identification method described herein, in which over 800 separate potential biomarker values were individually screened from a large number of individuals previously diagnosed as either having or not having pancreatic cancer, that it was possible to identify the pancreatic cancer biomarkers disclosed herein. This discovery approach stands in stark contrast to biomarker discovery from conditioned media or lysed cells, because it interrogates a more patient-relevant system that requires no translation to human pathology.

Thus, in one aspect, the present application provides one or more biomarkers for use, alone or in various combinations, to diagnose pancreatic cancer or to permit the differential diagnosis of pancreatic cancer from benign gastrointestinal (GI) conditions such as acute or chronic pancreatitis (or both), pancreatic obstruction, GERD, gallstones, or abnormal imaging later found to be benign. Exemplary embodiments include the biomarkers provided in Table 1, column 2, which, as noted above, were identified using the multiplex aptamer-based assay described generally in Example 1 and more specifically in Example 2. The markers provided in Table 1 are useful for diagnosing pancreatic cancer in a high-risk, asymptomatic population and for distinguishing acute or chronic pancreatitis (or both), pancreatic obstruction, GERD, gallstones, or abnormal imaging later found to be benign from pancreatic cancer.

While certain of the described pancreatic cancer biomarkers are useful alone for detecting and diagnosing pancreatic cancer, methods are also described herein for grouping multiple subsets of the pancreatic cancer biomarkers, each of which is useful as a panel of two or more biomarkers. Thus, various embodiments of the present application provide combinations comprising N biomarkers, wherein N is at least two biomarkers. In other embodiments, N is selected to be any number from 2-65 biomarkers.

In yet other embodiments, N is selected to be any number from 2-7, 2-10, 2-15, 2-20, 2-25, 2-30, 2-35, 2-40, 2-45, 2-50, 2-55, or 2-65. In other embodiments, N is selected to be any number from 3-7, 3-10, 3-15, 3-20, 3-25, 3-30, 3-35, 3-40, 3-45, 3-50, 3-55, or 3-65. In other embodiments, N is selected to be any number from 4-7, 4-10, 4-15, 4-20, 4-25, 4-30, 4-35, 4-40, 4-45, 4-50, 4-55, or 4-65. In other embodiments, N is selected to be any number from 5-7, 5-10, 5-15, 5-20, 5-25, 5-30, 5-35, 5-40, 5-45, 5-50, 5-55, or 5-65. In other embodiments, N is selected to be any number from 6-10, 6-15, 6-20, 6-25, 6-30, 6-35, 6-40, 6-45, 6-50, 6-55, or 6-65. In other embodiments, N is selected to be any number from 7-10, 7-15, 7-20, 7-25, 7-30, 7-35, 7-40, 7-45, 7-50, 7-55, or 7-65. In other embodiments, N is selected to be any number from 8-10, 8-15, 8-20, 8-25, 8-30, 8-35, 8-40, 8-45, 8-50, 8-55, or 8-65. In other embodiments, N is selected to be any number from 9-15, 9-20, 9-25, 9-30, 9-35, 9-40, 9-45, 9-50, 9-55, or 9-65. In other embodiments, N is selected to be any number from 10-15, 10-20, 10-25, 10-30, 10-35, 10-40, 10-45, 10-50, 10-55, or 10-65. It will be appreciated that N can be selected to encompass similar, but higher order, ranges.
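For orientation, the ranges enumerated above and the number of distinct panels they imply can be sketched as follows. Two points are assumptions made for this illustration: that the pool of candidate biomarkers has 65 members (the largest N mentioned), and that the listed ranges are exactly those whose upper bound exceeds the lower bound by at least 2, a rule that happens to reproduce the list but is not stated in the text.

```python
from math import comb

ASSUMED_POOL_SIZE = 65       # assumption: one candidate per unit of the largest N
LOWER_BOUNDS = range(2, 11)  # 2 through 10, as enumerated in the text
UPPER_BOUNDS = (7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 65)


def listed_ranges():
    """Reproduce the enumerated (lower, upper) ranges for the panel size N."""
    return [(low, high) for low in LOWER_BOUNDS for high in UPPER_BOUNDS if high >= low + 2]


def panels_of_size(n: int, pool: int = ASSUMED_POOL_SIZE) -> int:
    """Number of distinct unordered panels of n biomarkers drawn from the pool."""
    return comb(pool, n)


ranges = listed_ranges()
print(len(ranges))        # 101
print(panels_of_size(2))  # 2080
print(panels_of_size(3))  # 43680
```

The rapid growth in the number of possible panels with N is one reason panel selection is usually treated as a search problem rather than an exhaustive comparison.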

In another aspect, the present invention provides a method for diagnosing pancreatic cancer in an individual, the method comprising detecting, in a biological sample from the individual, at least one biomarker value corresponding to at least one biomarker selected from the group of biomarkers provided in Table 1, column 2, wherein the individual is classified as having pancreatic cancer based on the at least one biomarker value.

In another aspect, the present invention provides a method for diagnosing pancreatic cancer in an individual, the method comprising detecting, in a biological sample from the individual, biomarker values that each correspond to one of at least N biomarkers selected from the group of biomarkers listed in Table 1, column 2, wherein the likelihood that the individual has pancreatic cancer is determined based on the biomarker values.

In another aspect, the present invention provides a method for diagnosing pancreatic cancer in an individual, the method comprising detecting, in a biological sample from the individual, biomarker values that each correspond to one of at least N biomarkers selected from the group of biomarkers listed in Table 1, column 2, wherein the individual is classified as having pancreatic cancer based on the biomarker values, and wherein N = 2-10.

In another aspect, the present invention provides a method for diagnosing pancreatic cancer in an individual, the method comprising detecting, in a biological sample from the individual, biomarker values that each correspond to one of at least N biomarkers selected from the group of biomarkers listed in Table 1, column 2, wherein the likelihood that the individual has pancreatic cancer is determined based on the biomarker values, and wherein N = 2-10.

In another aspect, the present invention provides a method for diagnosing that an individual does not have pancreatic cancer, the method comprising detecting, in a biological sample from the individual, at least one biomarker value corresponding to at least one biomarker selected from the group of biomarkers listed in Table 1, column 2, wherein the individual is classified as not having pancreatic cancer based on the at least one biomarker value.

In another aspect, the present invention provides a method for diagnosing that an individual does not have pancreatic cancer, the method comprising detecting, in a biological sample from the individual, biomarker values that each correspond to one of at least N biomarkers selected from the group of biomarkers listed in Table 1, column 2, wherein the individual is classified as not having pancreatic cancer based on the biomarker values, and wherein N = 2-10.

In another aspect, the present invention provides a method for diagnosing pancreatic cancer, the method comprising detecting, in a biological sample from an individual, biomarker values that each correspond to a biomarker on a panel of N biomarkers, wherein the biomarkers are selected from the group of biomarkers listed in Table 1, column 2, wherein a classification of the biomarker values indicates that the individual has pancreatic cancer, and wherein N = 3-10.

In another aspect, the present invention provides a method for diagnosing pancreatic cancer, the method comprising detecting, in a biological sample from an individual, biomarker values that each correspond to a biomarker on a panel of biomarkers selected from the group of panels listed in Tables 2-11, wherein a classification of the biomarker values indicates that the individual has pancreatic cancer.

In another aspect, the present invention provides a method for diagnosing an absence of pancreatic cancer, the method comprising detecting, in a biological sample from an individual, biomarker values that each correspond to a biomarker on a panel of N biomarkers, wherein the biomarkers are selected from the group of biomarkers listed in Table 1, column 2, wherein a classification of the biomarker values indicates an absence of pancreatic cancer in the individual, and wherein N = 3-10.

在另一方面,本发明提供了一种诊断不存在胰腺癌的方法,所述方法包括在来自个体的生物学样品中检测生物标记值,所述生物标记值每个对应于一组生物标记中的生物标记,所述一组生物标记选自表2-11提供的生物标记的组,其中所述生物标记值的分类指示所述个体中不存在胰腺癌。In another aspect, the present invention provides a method of diagnosing the absence of pancreatic cancer, the method comprising detecting, in a biological sample from an individual, biomarker values, each of the biomarker values corresponding to a biomarker in a panel of biomarkers selected from the group of biomarkers provided in Tables 2-11, wherein a classification of the biomarker values indicates the absence of pancreatic cancer in the individual.

在另一方面,本发明提供了一种诊断个体的胰腺癌的方法,所述方法包括在来自个体的生物学样品中检测生物标记值,所述生物标记值对应于选自表1第2列所列的生物标记的组的至少N个生物标记之一,其中基于偏离预定阈值的分类评分将所述个体分类为患有胰腺癌,并且其中N=2-10。In another aspect, the present invention provides a method of diagnosing pancreatic cancer in an individual, the method comprising detecting, in a biological sample from the individual, a biomarker value corresponding to one of at least N biomarkers selected from the group of biomarkers listed in Table 1, Col. 2, wherein the individual is classified as having pancreatic cancer based on a classification score that deviates from a predetermined threshold, and wherein N = 2-10.

在另一方面,本发明提供了一种诊断个体中不存在胰腺癌的方法,所述方法包括在来自个体的生物学样品中检测生物标记值,所述生物标记值对应于选自表1第2列所列的生物标记的组的至少N个生物标记之一,其中基于偏离预定阈值的分类评分将所述个体分类为不患有胰腺癌,并且其中N=2-10。In another aspect, the present invention provides a method of diagnosing the absence of pancreatic cancer in an individual, the method comprising detecting, in a biological sample from the individual, a biomarker value corresponding to one of at least N biomarkers selected from the group of biomarkers listed in Table 1, Col. 2, wherein the individual is classified as not having pancreatic cancer based on a classification score that deviates from a predetermined threshold, and wherein N = 2-10.
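The threshold-based aspects above can be pictured with a short sketch. It is illustrative only: the function name `classify_by_score`, the example threshold of 0.5, and the convention that a score above the threshold indicates disease are assumptions made here and are not taken from this application.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """Outcome of comparing one classification score with a fixed threshold."""
    score: float
    threshold: float
    has_cancer: bool


def classify_by_score(score: float, threshold: float,
                      higher_is_disease: bool = True) -> Call:
    """Classify an individual from a score that deviates from a threshold.

    Which direction of deviation counts as "disease" is an assumption; it
    depends on how the classification score was constructed.
    """
    if higher_is_disease:
        positive = score > threshold
    else:
        positive = score < threshold
    return Call(score=score, threshold=threshold, has_cancer=positive)


# Hypothetical scores for two individuals; the threshold is fixed in advance.
first = classify_by_score(0.82, threshold=0.5)
second = classify_by_score(0.31, threshold=0.5)
```

The threshold is passed in rather than computed because the text describes it as predetermined, that is, fixed before the individual's sample is scored.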

在另一方面,本发明提供了一种指示胰腺癌的似然性的计算机执行方法。所述方法包括:在计算机上检索个体的生物标记信息,其中所述生物标记信息包括生物标记值,所述生物标记值每个对应于选自表1第2列所列的生物标记的组的至少N个生物标记之一,其中N如上定义;用计算机对每个所述生物标记值进行分类;以及基于多个分类指示所述个体患有胰腺癌的似然性。In another aspect, the present invention provides a computer-implemented method for indicating a likelihood of pancreatic cancer. The method comprises: retrieving, on a computer, biomarker information for an individual, wherein the biomarker information comprises biomarker values, each of the biomarker values corresponding to one of at least N biomarkers selected from the group of biomarkers listed in Table 1, Col. 2, where N is as defined above; performing, on a computer, a classification on each of the biomarker values; and indicating a likelihood that the individual has pancreatic cancer based on a plurality of classifications.

在另一方面,本发明提供了一种将个体分类为患有或不患有胰腺癌的计算机执行方法。所述方法包括:在计算机上检索个体的生物标记信息,其中所述生物标记信息包括生物标记值,所述生物标记值每个对应于选自表1第2列提供的生物标记的组的至少N个生物标记之一;用计算机对每个所述生物标记值进行分类;以及基于多个分类指示所述个体是否患有胰腺癌。In another aspect, the present invention provides a computer-implemented method for classifying an individual as having or not having pancreatic cancer. The method comprises: retrieving, on a computer, biomarker information for the individual, wherein the biomarker information comprises biomarker values, each of the biomarker values corresponding to one of at least N biomarkers selected from the group of biomarkers provided in Table 1, Col. 2; performing, on a computer, a classification on each of the biomarker values; and indicating whether the individual has pancreatic cancer based on a plurality of classifications.
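The three steps recited above (retrieve the biomarker values, classify each value, then indicate a likelihood from the plurality of classifications) can be sketched as follows. This is a minimal illustration, not the claimed method: the marker names, the per-marker cutoffs, and the use of a simple vote fraction as the likelihood are all assumptions.

```python
from typing import Dict


def classify_each(values: Dict[str, float],
                  cutoffs: Dict[str, float]) -> Dict[str, bool]:
    """Classify every biomarker value on its own against a per-marker cutoff.

    Treating "above the cutoff" as pointing to disease is an assumption.
    """
    return {name: values[name] > cutoffs[name] for name in cutoffs}


def likelihood_from_votes(calls: Dict[str, bool]) -> float:
    """Fraction of the individual classifications that point to disease."""
    if not calls:
        raise ValueError("at least one biomarker classification is required")
    return sum(calls.values()) / len(calls)


# Hypothetical retrieved values for one individual (placeholder marker names).
retrieved = {"marker_a": 2.0, "marker_b": 0.5, "marker_c": 3.0}
cutoffs = {"marker_a": 1.0, "marker_b": 1.0, "marker_c": 1.0}

calls = classify_each(retrieved, cutoffs)
likelihood = likelihood_from_votes(calls)
```

A vote fraction is only one way to turn several classifications into a single indication; a weighted or probabilistic combination would fit the same description.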

在另一方面,本发明提供了一种指示胰腺癌的似然性的计算机程序产品。所述计算机程序产品包括包含程序代码的计算机可读取介质,所述程序代码可由计算装置或系统的处理器执行,所述程序代码包括:检索归因于来自个体的生物学样品的数据的代码,其中所述数据包括生物标记值,所述生物标记值每个对应于所述生物学样品中选自表1第2列所列的生物标记的组的至少N个生物标记之一,其中N如上定义;以及执行分类方法的代码,所述分类方法将所述个体患有胰腺癌的似然性指示为所述生物标记值的函数。In another aspect, the present invention provides a computer program product that indicates the likelihood of pancreatic cancer. The computer program product includes a computer-readable medium containing program code, the program code executable by a processor of a computing device or system, the program code comprising: code for retrieving data attributed to a biological sample from an individual, wherein the data includes biomarker values, each of the biomarker values corresponding to one of at least N biomarkers in the biological sample selected from the group of biomarkers listed in Table 1, Col. 2, where N is as defined above; and code for executing a classification method that indicates the likelihood that the individual has pancreatic cancer as a function of the biomarker values.

在另一方面,本发明提供了一种指示个体的胰腺癌状态的计算机程序产品。所述计算机程序产品包括包含程序代码的计算机可读取介质,所述程序代码可由计算装置或系统的处理器执行,所述程序代码包括:检索归因于来自个体的生物学样品的数据的代码,其中所述数据包括生物标记值,所述生物标记值每个对应于所述生物学样品中选自表1第2列提供的生物标记的组的至少N个生物标记之一;以及执行分类方法的代码,所述分类方法将所述个体的胰腺癌状态指示为所述生物标记值的函数。In another aspect, the present invention provides a computer program product for indicating pancreatic cancer status in an individual. The computer program product includes a computer-readable medium containing program code, the program code executable by a processor of a computing device or system, the program code comprising: code for retrieving data attributed to a biological sample from an individual, wherein the data includes biomarker values, each of the biomarker values corresponding to one of at least N biomarkers in the biological sample selected from the group of biomarkers provided in Table 1, Col. 2; and code for executing a classification method that indicates pancreatic cancer status in the individual as a function of the biomarker values.

在另一方面,本发明提供了一种指示胰腺癌的似然性的计算机执行方法。所述方法包括:在计算机上检索个体的生物标记信息,其中所述生物标记信息包括生物标记值,所述生物标记值对应于选自表1第2列所列的生物标记的组的生物标记;用计算机对所述生物标记值进行分类;以及基于所述分类指示所述个体患有胰腺癌的似然性。In another aspect, the present invention provides a computer-implemented method for indicating a likelihood of pancreatic cancer. The method comprises: retrieving, on a computer, biomarker information for an individual, wherein the biomarker information comprises a biomarker value corresponding to a biomarker selected from the group of biomarkers listed in Table 1, Col. 2; classifying, on a computer, the biomarker value; and indicating a likelihood that the individual has pancreatic cancer based on the classification.

在另一方面,本发明提供了一种将个体分类为患有或不患有胰腺癌的计算机执行方法。所述方法包括:从计算机检索个体的生物标记信息,其中所述生物标记信息包括生物标记值,所述生物标记值对应于选自表1第2列提供的生物标记的组的生物标记;用计算机对所述生物标记值进行分类;以及基于所述分类指示所述个体是否患有胰腺癌。In another aspect, the present invention provides a computer-implemented method for classifying an individual as having or not having pancreatic cancer. The method comprises: retrieving biomarker information for the individual from a computer, wherein the biomarker information comprises a biomarker value corresponding to a biomarker selected from the group of biomarkers provided in Table 1, Col. 2; classifying the biomarker value using the computer; and indicating whether the individual has pancreatic cancer based on the classification.

在另一方面,本发明提供了一种指示胰腺癌的似然性的计算机程序产品。所述计算机程序产品包括包含程序代码的计算机可读取介质,所述程序代码可由计算装置或系统的处理器执行,所述程序代码包括:检索归因于来自个体的生物学样品的数据的代码,其中所述数据包括生物标记值,所述生物标记值对应于所述生物学样品中选自表1第2列所列的生物标记的组的生物标记;以及执行分类方法的代码,所述分类方法将所述个体患有胰腺癌的似然性指示为所述生物标记值的函数。In another aspect, the present invention provides a computer program product that indicates the likelihood of pancreatic cancer. The computer program product includes a computer-readable medium containing program code, the program code executable by a processor of a computing device or system, the program code comprising: code for retrieving data attributed to a biological sample from an individual, wherein the data includes biomarker values corresponding to biomarkers in the biological sample selected from the group of biomarkers listed in Table 1, Col. 2; and code for executing a classification method that indicates the likelihood that the individual has pancreatic cancer as a function of the biomarker values.

在另一方面,本发明提供了一种指示个体的胰腺癌状态的计算机程序产品。所述计算机程序产品包括包含程序代码的计算机可读取介质,所述程序代码可由计算装置或系统的处理器执行,所述程序代码包括:检索归因于来自个体的生物学样品的数据的代码,其中所述数据包括生物标记值,所述生物标记值对应于所述生物学样品中选自表1第2列提供的生物标记的组的生物标记;以及执行分类方法的代码,所述分类方法将所述个体的胰腺癌状态指示为所述生物标记值的函数。In another aspect, the present invention provides a computer program product for indicating pancreatic cancer status in an individual. The computer program product includes a computer-readable medium containing program code, the program code executable by a processor of a computing device or system, the program code comprising: code for retrieving data attributed to a biological sample from an individual, wherein the data includes biomarker values corresponding to biomarkers in the biological sample selected from the group of biomarkers provided in Table 1, Col. 2; and code for executing a classification method that indicates pancreatic cancer status in the individual as a function of the biomarker values.
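A classification method that indicates likelihood "as a function of the biomarker values" could, for example, take the form of the naive Bayes classifier referred to in the drawings. The sketch below assumes normal class-conditional densities on log-transformed signal; the parameter values, the marker name, and the prior of 0.5 are hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, Tuple

# (mean, standard deviation) of the log-transformed signal for each marker.
Params = Dict[str, Tuple[float, float]]


def normal_logpdf(x: float, mean: float, sd: float) -> float:
    """Log density of a normal distribution at x."""
    return (-0.5 * math.log(2.0 * math.pi * sd * sd)
            - (x - mean) ** 2 / (2.0 * sd * sd))


def posterior_probability(log_values: Dict[str, float], disease: Params,
                          control: Params, prior: float = 0.5) -> float:
    """P(disease | values) under a naive Bayes model.

    Markers are treated as conditionally independent, so their
    log-likelihood ratios are simply added to the prior log odds.
    """
    log_odds = math.log(prior / (1.0 - prior))
    for name, x in log_values.items():
        log_odds += (normal_logpdf(x, *disease[name])
                     - normal_logpdf(x, *control[name]))
    # Numerically stable logistic transform of the log odds.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


# Hypothetical trained parameters for a single placeholder marker.
disease_params = {"marker_a": (1.0, 1.0)}
control_params = {"marker_a": (0.0, 1.0)}
p_mid = posterior_probability({"marker_a": 0.5}, disease_params, control_params)
p_high = posterior_probability({"marker_a": 1.5}, disease_params, control_params)
```

A value midway between the two class means carries no evidence either way, so the posterior stays at the prior; values closer to the disease mean raise it.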

尽管某些所述癌症生物标记可单独用于检测和诊断癌症,但是本文所述的方法用于分组用作一组生物标记的癌症生物标记的多个子集。一旦鉴定了单独的生物标记或生物标记的子集,则个体中癌症的检测或诊断可以使用能够测量生物学样品中所选生物标记或多种生物标记的水平差异的任何测定平台或形式来完成。While some of the described cancer biomarkers can be used individually to detect and diagnose cancer, the methods described herein are used to group multiple subsets of cancer biomarkers for use as a panel of biomarkers. Once an individual biomarker or subset of biomarkers has been identified, detection or diagnosis of cancer in an individual can be accomplished using any assay platform or format capable of measuring differences in the levels of the selected biomarker or biomarkers in a biological sample.

然而,仅仅通过使用本文所述的基于适配体的生物标记鉴定方法,其中超过800个单独的潜在生物标记值从先前已经诊断为患有或不患有癌症的大量个体中逐个进行了筛选,才可能鉴定本文公开的癌症生物标记。这种发现方法与从条件培养基或裂解的细胞发现生物标记截然相反,因为其询问无需翻译为人病理学的更加患者相关的系统。However, the identification of the cancer biomarkers disclosed herein was only possible by using the aptamer-based biomarker identification method described herein, in which over 800 individual potential biomarker values were individually screened from a large number of individuals who had been previously diagnosed with or without cancer. This discovery approach is in stark contrast to biomarker discovery from conditioned media or lysed cells because it interrogates a more patient-relevant system that does not require translation to human pathology.
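Screening each candidate individually can be pictured as scoring every candidate by how well it separates the two groups and then ranking the candidates. The sketch below uses the two-sample Kolmogorov-Smirnov distance as that score; this passage does not name a statistic, so the choice is an assumption, as are the candidate names and measurements.

```python
from bisect import bisect_right
from typing import Dict, List, Sequence, Tuple


def ks_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sample Kolmogorov-Smirnov statistic.

    This is the largest gap between the two empirical cumulative
    distribution functions; it is attained at one of the observed values.
    """
    sa, sb = sorted(a), sorted(b)
    best = 0.0
    for x in sa + sb:
        fa = bisect_right(sa, x) / len(sa)
        fb = bisect_right(sb, x) / len(sb)
        best = max(best, abs(fa - fb))
    return best


def rank_candidates(cases: Dict[str, List[float]],
                    controls: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Score every candidate on its own, most separated first."""
    scored = [(name, ks_distance(cases[name], controls[name])) for name in cases]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical measurements for two placeholder candidates.
cases = {"cand_x": [5.0, 6.0, 7.0], "cand_y": [1.0, 2.0, 3.0]}
controls = {"cand_x": [1.0, 2.0, 3.0], "cand_y": [1.0, 2.0, 4.0]}
ranked = rank_candidates(cases, controls)
```

With more than 800 candidates the same loop applies unchanged; only the dictionaries grow.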

因此,在本发明的一方面,提供了一种或多种生物标记用于单独或在各种组合中使用以诊断癌症。示例性实施方案包括表19提供的生物标记,这些生物标记用实施例1中一般描述并在实施例7中更具体描述的基于多重适配体的测定鉴定。表19提供的标记可用于区分患有癌症的个体与不患有癌症的个体。Thus, in one aspect of the present invention, one or more biomarkers are provided for use alone or in various combinations to diagnose cancer. Exemplary embodiments include the biomarkers provided in Table 19, which were identified using the multiplex aptamer-based assays generally described in Example 1 and more specifically described in Example 7. The markers provided in Table 19 can be used to distinguish individuals with cancer from individuals without cancer.

虽然某些所述癌症生物标记可以单独用于检测和诊断癌症,但是本文所述的方法还用于分组癌症生物标记的多个子集,其各自用作三个或更多个生物标记的组。因此,本申请的各个实施方案提供了包含N个生物标记的组合,其中N是至少三个生物标记。在其他实施方案中,N选自3-65个生物标记中的任意数。While some of the described cancer biomarkers can be used individually to detect and diagnose cancer, the methods described herein are also useful for grouping multiple subsets of cancer biomarkers, each of which can be used as a panel of three or more biomarkers. Thus, various embodiments of the present application provide a combination comprising N biomarkers, where N is at least three biomarkers. In other embodiments, N is selected from any number between 3 and 65 biomarkers.

在其他实施方案中,N选自3-7、3-10、3-15、3-20、3-25、3-30、3-35、3-40、3-45、3-50、3-55、3-60或3-65中的任意数。在其他实施方案中,N选自4-7、4-10、4-15、4-20、4-25、4-30、4-35、4-40、4-45、4-50、4-55、4-60或4-65中的任意数。在其他实施方案中,N选自5-7、5-10、5-15、5-20、5-25、5-30、5-35、5-40、5-45、5-50、5-55、5-60或5-65中的任意数。在其他实施方案中,N选自6-10、6-15、6-20、6-25、6-30、6-35、6-40、6-45、6-50、6-55、6-60或6-65中的任意数。在其他实施方案中,N选自7-10、7-15、7-20、7-25、7-30、7-35、7-40、7-45、7-50、7-55、7-60或7-65中的任意数。在其他实施方案中,N选自8-10、8-15、8-20、8-25、8-30、8-35、8-40、8-45、8-50、8-55、8-60或8-65中的任意数。在其他实施方案中,N选自9-15、9-20、9-25、9-30、9-35、9-40、9-45、9-50、9-55、9-60或9-65中的任意数。在其他实施方案中,N选自10-15、10-20、10-25、10-30、10-35、10-40、10-45、10-50、10-55、10-60或10-65中的任意数。应当理解N可以选自包含类似但更高级的范围。In other embodiments, N is selected from any number between 3 and 7, 3 and 10, 3 and 15, 3 and 20, 3 and 25, 3 and 30, 3 and 35, 3 and 40, 3 and 45, 3 and 50, 3 and 55, 3 and 60, or 3 and 65. In other embodiments, N is selected from any number between 4 and 7, 4 and 10, 4 and 15, 4 and 20, 4 and 25, 4 and 30, 4 and 35, 4 and 40, 4 and 45, 4 and 50, 4 and 55, 4 and 60, or 4 and 65. In other embodiments, N is selected from any number between 5 and 7, 5 and 10, 5 and 15, 5 and 20, 5 and 25, 5 and 30, 5 and 35, 5 and 40, 5 and 45, 5 and 50, 5 and 55, 5 and 60, or 5 and 65. In other embodiments, N is selected from any number between 6 and 10, 6 and 15, 6 and 20, 6 and 25, 6 and 30, 6 and 35, 6 and 40, 6 and 45, 6 and 50, 6 and 55, 6 and 60, or 6 and 65. In other embodiments, N is selected from any number between 7 and 10, 7 and 15, 7 and 20, 7 and 25, 7 and 30, 7 and 35, 7 and 40, 7 and 45, 7 and 50, 7 and 55, 7 and 60, or 7 and 65. In other embodiments, N is selected from any number between 8 and 10, 8 and 15, 8 and 20, 8 and 25, 8 and 30, 8 and 35, 8 and 40, 8 and 45, 8 and 50, 8 and 55, 8 and 60, or 8 and 65. In other embodiments, N is selected from any number between 9 and 15, 9 and 20, 9 and 25, 9 and 30, 9 and 35, 9 and 40, 9 and 45, 9 and 50, 9 and 55, 9 and 60, or 9 and 65. 
In other embodiments, N is selected from any number between 10 and 15, 10 and 20, 10 and 25, 10 and 30, 10 and 35, 10 and 40, 10 and 45, 10 and 50, 10 and 55, 10 and 60, or 10 and 65. It will be appreciated that N can also be selected to encompass similar, but higher-order, ranges.
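To give a sense of how many distinct panels these ranges cover, the number of N-marker subsets of a pool can be counted directly. The sketch below assumes a pool of 65 biomarkers, inferred from the upper bound on N above; the function names are illustrative.

```python
from math import comb


def panel_count(pool_size: int, n: int) -> int:
    """Number of distinct N-marker panels that can be drawn from a pool."""
    return comb(pool_size, n)


def panels_in_range(pool_size: int, n_min: int, n_max: int) -> int:
    """Total number of panels with n_min <= N <= n_max markers."""
    return sum(comb(pool_size, n) for n in range(n_min, n_max + 1))


three_marker_panels = panel_count(65, 3)  # 43680
```

Even the smallest panel size already yields tens of thousands of combinations, which is why panels are typically searched rather than enumerated by hand.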

在另一方面,本发明提供了一种诊断个体的癌症的方法,所述方法包括在来自个体的生物学样品中检测至少一个生物标记值,所述至少一个生物标记值对应于选自表19提供的生物标记的组的至少一个生物标记,其中基于所述至少一个生物标记值就所述个体分类为患有癌症。In another aspect, the present invention provides a method of diagnosing cancer in an individual, the method comprising detecting in a biological sample from the individual at least one biomarker value, the at least one biomarker value corresponding to at least one biomarker selected from the group of biomarkers provided in Table 19, wherein the individual is classified as having cancer based on the at least one biomarker value.

在另一方面,本发明提供了一种诊断个体的癌症的方法,所述方法包括在来自个体的生物学样品中检测生物标记值,所述生物标记值每个对应于选自表19所列的生物标记的组的至少N个生物标记之一,其中基于所述生物标记值确定所述个体患有癌症的似然性。In another aspect, the present invention provides a method of diagnosing cancer in an individual, the method comprising detecting, in a biological sample from the individual, biomarker values, each of the biomarker values corresponding to one of at least N biomarkers selected from the group of biomarkers listed in Table 19, wherein a likelihood that the individual has cancer is determined based on the biomarker values.

在另一方面,本发明提供了一种诊断个体的癌症的方法,所述方法包括在来自个体的生物学样品中检测生物标记值,所述生物标记值每个对应于选自表19所列的生物标记的组的至少N个生物标记之一,其中基于所述生物标记值将所述个体分类为患有癌症,并且其中N=3-10。In another aspect, the present invention provides a method of diagnosing cancer in an individual, the method comprising detecting, in a biological sample from the individual, biomarker values, each of the biomarker values corresponding to one of at least N biomarkers selected from the group of biomarkers listed in Table 19, wherein the individual is classified as having cancer based on the biomarker values, and wherein N=3-10.

在另一方面,本发明提供了一种诊断个体的癌症的方法,所述方法包括在来自个体的生物学样品中检测生物标记值,所述生物标记值每个对应于选自表19所列的生物标记的组的至少N个生物标记之一,其中所述个体患有癌症的似然性基于所述生物标记值来确定,并且其中N=3-10。In another aspect, the present invention provides a method of diagnosing cancer in an individual, the method comprising detecting, in a biological sample from the individual, biomarker values, each of the biomarker values corresponding to one of at least N biomarkers selected from the group of biomarkers listed in Table 19, wherein a likelihood that the individual has cancer is determined based on the biomarker values, and wherein N=3-10.

在另一方面,本发明提供了一种诊断个体不患有癌症的方法,所述方法包括在来自个体的生物学样品中检测至少一个生物标记值,所述至少一个生物标记值对应于选自表19所列的生物标记的组的至少一个生物标记,其中基于所述至少一个生物标记值将所述个体分类为不患有癌症。In another aspect, the present invention provides a method of diagnosing that an individual does not have cancer, the method comprising detecting in a biological sample from the individual at least one biomarker value, the at least one biomarker value corresponding to at least one biomarker selected from the group of biomarkers listed in Table 19, wherein the individual is classified as not having cancer based on the at least one biomarker value.

在另一方面,本发明提供了一种诊断个体不患有癌症的方法,所述方法包括在来自个体的生物学样品中检测生物标记值,所述生物标记值每个对应于选自表19所列的生物标记的组的至少N个生物标记之一,其中基于所述生物标记值将所述个体分类为不患有癌症,并且其中N=3-10。In another aspect, the present invention provides a method of diagnosing that an individual does not have cancer, the method comprising detecting, in a biological sample from the individual, biomarker values, each of the biomarker values corresponding to one of at least N biomarkers selected from the group of biomarkers listed in Table 19, wherein the individual is classified as not having cancer based on the biomarker values, and wherein N=3-10.

在另一方面,本发明提供了一种诊断癌症的方法,所述方法包括在来自个体的生物学样品中检测生物标记值,所述生物标记值每个对应于一组N个生物标记中的生物标记,其中所述生物标记选自表19所列的生物标记的组,其中所述生物标记值的分类指示所述个体患有癌症,并且其中N=3-10。In another aspect, the present invention provides a method of diagnosing cancer, the method comprising detecting, in a biological sample from an individual, biomarker values, each of the biomarker values corresponding to a biomarker from a set of N biomarkers, wherein the biomarkers are selected from the group of biomarkers listed in Table 19, wherein a classification of the biomarker value indicates that the individual has cancer, and wherein N=3-10.

在另一方面,本发明提供了一种诊断癌症的方法,所述方法包括在来自个体的生物学样品中检测生物标记值,所述生物标记值每个对应于一组生物标记中的生物标记,所述一组生物标记选自表20-29所列的生物标记的组,其中所述生物标记值的分类指示所述个体患有癌症。In another aspect, the present invention provides a method of diagnosing cancer, the method comprising detecting, in a biological sample from an individual, biomarker values, each of the biomarker values corresponding to a biomarker in a panel of biomarkers selected from the group of biomarkers listed in Tables 20-29, wherein a classification of the biomarker values indicates that the individual has cancer.

在另一方面,本发明提供了一种诊断癌症不存在的方法,所述方法包括在来自个体的生物学样品中检测生物标记值,所述生物标记值每个对应于一组N个生物标记中的生物标记,其中所述生物标记选自表19所列的生物标记的组,其中所述生物标记值的分类指示所述个体中不存在癌症,并且其中N=3-10。In another aspect, the present invention provides a method of diagnosing the absence of cancer, the method comprising detecting in a biological sample from an individual biomarker values, the biomarker values each corresponding to a biomarker from a set of N biomarkers, wherein the biomarkers are selected from the group of biomarkers listed in Table 19, wherein a classification of the biomarker value indicates the absence of cancer in the individual, and wherein N=3-10.

在另一方面,本发明提供了一种诊断不存在癌症的方法,所述方法包括在来自个体的生物学样品中检测生物标记值,所述生物标记值每个对应于一组生物标记中的生物标记,所述一组生物标记选自表20-29提供的生物标记的组,其中所述生物标记值的分类指示所述个体中不存在癌症。In another aspect, the present invention provides a method of diagnosing the absence of cancer, the method comprising detecting, in a biological sample from an individual, biomarker values, each of the biomarker values corresponding to a biomarker in a panel of biomarkers selected from the group of biomarkers provided in Tables 20-29, wherein a classification of the biomarker values indicates the absence of cancer in the individual.

在另一方面,本发明提供了一种诊断个体的癌症的方法,所述方法包括在来自个体的生物学样品中检测生物标记值,所述生物标记值对应于选自表19所列的生物标记的组的至少N个生物标记之一,其中基于偏离预定阈值的分类评分将所述个体分类为患有癌症,并且其中N=3-10。In another aspect, the present invention provides a method of diagnosing cancer in an individual, the method comprising detecting, in a biological sample from the individual, a biomarker value corresponding to one of at least N biomarkers selected from the group of biomarkers listed in Table 19, wherein the individual is classified as having cancer based on a classification score that deviates from a predetermined threshold, and wherein N=3-10.

在另一方面,本发明提供了一种诊断个体中不存在癌症的方法,所述方法包括在来自个体的生物学样品中检测生物标记值,所述生物标记值对应于选自表19所列的生物标记的组的至少N个生物标记之一,其中基于偏离预定阈值的分类评分将所述个体分类为不患有癌症,并且其中N=3-10。In another aspect, the present invention provides a method of diagnosing the absence of cancer in an individual, the method comprising detecting, in a biological sample from the individual, a biomarker value corresponding to one of at least N biomarkers selected from the group of biomarkers listed in Table 19, wherein the individual is classified as not having cancer based on a classification score that deviates from a predetermined threshold, and wherein N=3-10.

在另一方面,本发明提供了一种指示癌症的似然性的计算机执行方法。所述方法包括:在计算机上检索个体的生物标记信息,其中所述生物标记信息包括生物标记值,所述生物标记值每个对应于选自表19所列的生物标记的组的至少N个生物标记之一,其中N如上定义;用计算机对每个所述生物标记值进行分类;以及基于多个分类指示所述个体患有癌症的似然性。In another aspect, the present invention provides a computer-implemented method for indicating a likelihood of having cancer. The method comprises: retrieving, on a computer, biomarker information for an individual, wherein the biomarker information comprises biomarker values, each of the biomarker values corresponding to one of at least N biomarkers selected from the group of biomarkers listed in Table 19, where N is as defined above; performing, on a computer, a classification on each of the biomarker values; and indicating a likelihood that the individual has cancer based on a plurality of classifications.

在另一方面,本发明提供了一种将个体分类为患有或不患有癌症的计算机执行方法。所述方法包括:在计算机上检索个体的生物标记信息,其中所述生物标记信息包括生物标记值,所述生物标记值每个对应于选自表19提供的生物标记的组的至少N个生物标记之一;用计算机对每个所述生物标记值进行分类;以及基于多个分类指示所述个体是否患有癌症。In another aspect, the present invention provides a computer-implemented method for classifying an individual as having or not having cancer. The method comprises: retrieving, on a computer, biomarker information for the individual, wherein the biomarker information comprises biomarker values, each of the biomarker values corresponding to one of at least N biomarkers selected from the group of biomarkers provided in Table 19; performing, on a computer, a classification on each of the biomarker values; and indicating whether the individual has cancer based on a plurality of classifications.

在另一方面,本发明提供了一种指示癌症的似然性的计算机程序产品。所述计算机程序产品包括包含程序代码的计算机可读取介质,所述程序代码可由计算装置或系统的处理器执行,所述程序代码包括:检索归因于来自个体的生物学样品的数据的代码,其中所述数据包括生物标记值,所述生物标记值每个对应于所述生物学样品中选自表19所列的生物标记的组的至少N个生物标记之一,其中N如上定义;以及执行分类方法的代码,所述分类方法将所述个体患有癌症的似然性指示为所述生物标记值的函数。In another aspect, the present invention provides a computer program product that indicates a likelihood of having cancer. The computer program product includes a computer-readable medium containing program code executable by a processor of a computing device or system, the program code comprising: code for retrieving data attributed to a biological sample from an individual, wherein the data includes biomarker values, each of the biomarker values corresponding to one of at least N biomarkers in the biological sample selected from the group of biomarkers listed in Table 19, where N is as defined above; and code for executing a classification method that indicates a likelihood that the individual has cancer as a function of the biomarker values.

在另一方面,本发明提供了一种指示个体的癌症状态的计算机程序产品。所述计算机程序产品包括包含程序代码的计算机可读取介质,所述程序代码可由计算装置或系统的处理器执行,所述程序代码包括:检索归因于来自个体的生物学样品的数据的代码,其中所述数据包括生物标记值,所述生物标记值每个对应于所述生物学样品中选自表19提供的生物标记的组的至少N个生物标记之一;以及执行分类方法的代码,所述分类方法将所述个体的癌症状态指示为所述生物标记值的函数。In another aspect, the present invention provides a computer program product for indicating a cancer status in an individual. The computer program product includes a computer-readable medium containing program code executable by a processor of a computing device or system, the program code comprising: code for retrieving data attributed to a biological sample from an individual, wherein the data includes biomarker values, each of the biomarker values corresponding to one of at least N biomarkers in the biological sample selected from the group of biomarkers provided in Table 19; and code for executing a classification method that indicates the cancer status of the individual as a function of the biomarker values.

在另一方面,本发明提供了一种指示癌症的似然性的计算机执行方法。所述方法包括:在计算机上检索个体的生物标记信息,其中所述生物标记信息包括生物标记值,所述生物标记值对应于选自表19所列的生物标记的组的生物标记;用计算机对所述生物标记值进行分类;以及基于所述分类指示所述个体患有癌症的似然性。In another aspect, the present invention provides a computer-implemented method for indicating a likelihood of having cancer. The method comprises: retrieving, on a computer, biomarker information for an individual, wherein the biomarker information comprises biomarker values corresponding to a biomarker selected from the group of biomarkers listed in Table 19; classifying, on a computer, the biomarker values; and indicating a likelihood that the individual has cancer based on the classification.

在另一方面,本发明提供了一种将个体分类为患有或不患有癌症的计算机执行方法。所述方法包括:从计算机检索个体的生物标记信息,其中所述生物标记信息包括生物标记值,所述生物标记值对应于选自表19提供的生物标记的组的生物标记;用计算机对所述生物标记值进行分类;以及基于所述分类指示所述个体是否患有癌症。In another aspect, the present invention provides a computer-implemented method for classifying an individual as having or not having cancer. The method comprises: retrieving biomarker information for the individual from a computer, wherein the biomarker information comprises biomarker values corresponding to biomarkers selected from the group of biomarkers provided in Table 19; classifying the biomarker values using a computer; and indicating whether the individual has cancer based on the classification.

在另一方面,本发明提供了一种指示癌症的似然性的计算机程序产品。所述计算机程序产品包括包含程序代码的计算机可读取介质,所述程序代码可由计算装置或系统的处理器执行,所述程序代码包括:对归因于来自个体的生物学样品的数据进行检索的代码,其中所述数据包括生物标记值,所述生物标记值对应于所述生物学样品中选自表19所列的生物标记的组的生物标记;以及执行分类方法的代码,所述分类方法将所述个体患有癌症的似然性指示为所述生物标记值的函数。In another aspect, the present invention provides a computer program product that indicates a likelihood of having cancer. The computer program product includes a computer-readable medium containing program code executable by a processor of a computing device or system, the program code comprising: code for retrieving data attributed to a biological sample from an individual, wherein the data includes biomarker values corresponding to biomarkers in the biological sample selected from the group of biomarkers listed in Table 19; and code for executing a classification method that indicates a likelihood that the individual has cancer as a function of the biomarker values.

在另一方面,本发明提供了一种指示个体的癌症状态的计算机程序产品。所述计算机程序产品包括包含程序代码的计算机可读取介质,所述程序代码可由计算装置或系统的处理器执行,所述程序代码包括:检索归因于来自个体的生物学样品的数据的代码,其中所述数据包括生物标记值,所述生物标记值对应于所述生物学样品中选自表19提供的生物标记的组的生物标记;以及执行分类方法的代码,所述分类方法将所述个体的癌症状态指示为所述生物标记值的函数。In another aspect, the present invention provides a computer program product for indicating a cancer status in an individual. The computer program product includes a computer-readable medium containing program code executable by a processor of a computing device or system, the program code comprising: code for retrieving data attributed to a biological sample from an individual, wherein the data includes biomarker values corresponding to biomarkers in the biological sample selected from the group of biomarkers provided in Table 19; and code for executing a classification method that indicates the cancer status of the individual as a function of the biomarker values.

在另一方面,本发明提供了一种诊断胰腺癌的方法,所述方法包括在来自个体的生物学样品中除了生物标记值,检测肿瘤标记CA 19-9,所述生物标记值每个对应于一组生物标记中的生物标记,所述一组生物标记选自表1所列的生物标记的组,其中组合的CA 19-9和生物标记值的分类指示所述个体患有胰腺癌。In another aspect, the present invention provides a method of diagnosing pancreatic cancer, the method comprising detecting the tumor marker CA 19-9 in a biological sample from an individual in addition to biomarker values, the biomarker values each corresponding to a biomarker in a panel of biomarkers selected from the group of biomarkers listed in Table 1, wherein a classification of the combined CA 19-9 and biomarker values indicates that the individual has pancreatic cancer.
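One way to picture the combined classification is through the additive form of a naive Bayes model. The expression below is a sketch under that assumption; this paragraph does not state which classifier combines CA 19-9 with the panel. The symbols are introduced here for illustration: $d$ and $\bar{d}$ denote pancreatic cancer present and absent, $x_1, \dots, x_N$ are the panel biomarker values, and $c$ is the CA 19-9 value.

```latex
\ln\frac{P(d \mid x_1,\dots,x_N, c)}{P(\bar{d} \mid x_1,\dots,x_N, c)}
  = \ln\frac{P(d)}{P(\bar{d})}
  + \sum_{i=1}^{N} \ln\frac{p(x_i \mid d)}{p(x_i \mid \bar{d})}
  + \ln\frac{p(c \mid d)}{p(c \mid \bar{d})}
```

Under the conditional-independence assumption, CA 19-9 enters as one more log-likelihood-ratio term alongside the $N$ panel terms.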

附图说明BRIEF DESCRIPTION OF THE DRAWINGS

图1A是检测生物学样品中的胰腺癌的示例方法的流程图。FIG. 1A is a flow chart of an example method for detecting pancreatic cancer in a biological sample.

图1B是用朴素贝叶斯(Bayes)分类方法检测生物学样品中的胰腺癌的示例方法的流程图。FIG. 1B is a flow chart of an example method for detecting pancreatic cancer in a biological sample using a naive Bayesian classification method.

图2示出单个生物标记CTSB的ROC曲线,其使用用于检测胰腺癌的测试的朴素贝叶斯分类器(classifier)。FIG. 2 shows the ROC curve for the single biomarker CTSB, using a naive Bayes classifier for a test that detects pancreatic cancer.

图3示出2至10个生物标记的生物标记组的ROC曲线,其使用用于检测胰腺癌的测试的朴素贝叶斯分类器。FIG. 3 shows ROC curves for biomarker panels of 2 to 10 biomarkers, using a naive Bayes classifier for a test that detects pancreatic cancer.

图4说明当生物标记的数目从1增加至10时分类评分(AUC)的增加,其使用用于胰腺癌组的朴素贝叶斯分类。FIG. 4 illustrates the increase in the classification score (AUC) as the number of biomarkers increases from 1 to 10, using naive Bayes classification for a pancreatic cancer panel.

图5示出对于组合的GI和正常对照(实线)和胰腺癌疾病组(虚线),作为log转化的RFU形式的累积分布函数(cdf)的CTSB的测量的生物标记分布,以及它们的曲线拟合为正态cdf(短划线),以用于训练(train)朴素贝叶斯分类器。FIG. 5 shows the measured distribution of the biomarker CTSB as a cumulative distribution function (cdf) in log-transformed RFU for the combined GI and normal controls (solid line) and the pancreatic cancer disease group (dotted line), along with their curve fits to a normal cdf (dashed lines), which were used to train the naive Bayes classifier.

图6说明与本文所述的各种计算机执行方法一起使用的示例计算机系统。FIG. 6 illustrates an example computer system for use with the various computer-implemented methods described herein.

图7是一实施方案的指示个体患有胰腺癌的似然性的方法的流程图。FIG. 7 is a flow chart of a method of indicating the likelihood that an individual has pancreatic cancer, according to one embodiment.

图8是一实施方案的指示个体患有胰腺癌的似然性的方法的流程图。FIG. 8 is a flow chart of a method of indicating the likelihood that an individual has pancreatic cancer, according to one embodiment.

图9说明可以用于检测生物学样品中一个或多个胰腺癌生物标记的示例适配体测定。FIG. 9 illustrates an example aptamer assay that can be used to detect one or more pancreatic cancer biomarkers in a biological sample.

图10示出从聚集的潜在生物标记的集合使用哪些生物标记来构建分类器以区分胰腺癌与GI和正常对照的频率的柱状图。FIG. 10 shows a histogram of how frequently each biomarker from an aggregated set of potential biomarkers was used to build classifiers that distinguish pancreatic cancer from GI and normal controls.

图11A示出一对柱状图,其总结了使用表1所列的生物标记(实线)和随机标记的集合(虚线)的所有可能的单蛋白朴素贝叶斯分类器评分(AUC)。FIG. 11A shows a pair of histograms summarizing all possible single-protein naive Bayes classifier scores (AUC) using the biomarkers listed in Table 1 (solid lines) and a set of random markers (dashed lines).

图11B示出一对柱状图,其总结了使用表1所列的生物标记(实线)和随机标记的集合(虚线)的所有可能的二蛋白朴素贝叶斯分类器评分(AUC)。FIG. 11B shows a pair of histograms summarizing all possible two-protein naive Bayes classifier scores (AUC) using the biomarkers listed in Table 1 (solid lines) and a set of random markers (dashed lines).

图11C示出一对柱状图,其总结了使用表1所列的生物标记(实线)和随机标记的集合(虚线)的所有可能的三蛋白朴素贝叶斯分类器评分(AUC)。FIG11C shows a pair of bar graphs summarizing all possible three-protein naive Bayes classifier scores (AUC) using the biomarkers listed in Table 1 (solid lines) and a collection of random markers (dashed lines).

FIG. 12 shows the AUC for naive Bayes classifiers using from 2 to 10 markers selected from the full panel (diamonds), together with the scores obtained by dropping the best 5, 10, and 15 markers during classifier generation.

FIG. 13 shows the performance of three different classifiers: CA19-9 alone, the SOMAmer panel, and the combination of the SOMAmer panel and CA19-9.

FIG. 14 shows the performance of CA19-9 plus one (HAMP) or two (HAMP and CTSB) SOMAmer biomarkers.

FIG. 15 shows the performance of a ten-marker random forest classifier.

FIG. 16A shows a set of ROC curves modeled from the data in Table 14 for panels of one to five markers.

FIG. 16B shows a set of ROC curves computed from the training data for panels of two to five markers, as in FIG. 12.

FIGS. 17A and 17B show a comparison of performance between the ten biomarkers selected by the greedy selection procedure (Table 19) and 1,000 randomly sampled sets of ten "non-marker" biomarkers. The mean AUC for the ten biomarkers in Table 19 is shown as a dotted vertical line. In FIG. 17A, sets of ten biomarkers were randomly selected from all 10 analytes present in all three cancer studies that were not selected by the greedy procedure. In FIG. 17B, the same procedure as in FIG. 17A was used; however, sampling was restricted to the remaining 55 biomarkers from Table 1 that were not selected by the greedy procedure.

FIG. 18 shows receiver operating characteristic (ROC) curves for the three naive Bayes classifiers set forth in Table 19. For each study, the area under the curve (AUC) is also shown next to the legend.

Detailed Description of the Invention

Reference will now be made in detail to representative embodiments of the invention. While the invention is described in conjunction with the enumerated embodiments, it will be understood that the invention is not limited to those embodiments. On the contrary, the invention is intended to cover all alternatives, modifications, and equivalents that may be included within the scope of the invention as defined by the claims.

One skilled in the art will recognize many methods and materials similar or equivalent to those described herein that could be used in, and are within the scope of, the practice of the present invention. The present invention is in no way limited to the methods and materials described.

Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods, devices, and materials similar or equivalent to those described herein can be used in the practice or testing of the invention, the preferred methods, devices, and materials are now described.

All publications, published patent documents, and patent applications cited in this application are indicative of the level of skill in the art to which the application pertains. All publications, published patent documents, and patent applications cited herein are incorporated by reference to the same extent as though each individual publication, published patent document, or patent application were specifically and individually indicated as being incorporated by reference.

As used in this application, including the appended claims, the singular forms "a," "an," and "the" include plural references unless the content clearly dictates otherwise, and are used interchangeably with "at least one" and "one or more." Thus, reference to "an aptamer" includes mixtures of aptamers, reference to "a probe" includes mixtures of probes, and the like.

As used herein, the term "about" represents an insignificant modification or variation of a numerical value such that the basic function of the item to which the numerical value relates is unchanged.

As used herein, the terms "comprises," "comprising," "includes," "including," "contains," "containing," and any variations thereof are intended to cover a non-exclusive inclusion, such that a process, method, product-by-process, or composition of matter that comprises, includes, or contains an element or list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, product-by-process, or composition of matter.

The present application includes biomarkers, methods, devices, reagents, systems, and kits for the detection and diagnosis of pancreatic cancer and of cancer more generally.

In one aspect, one or more biomarkers are provided for use, either alone or in various combinations, to diagnose pancreatic cancer, to permit the differential diagnosis of pancreatic cancer from non-malignant GI conditions (including acute or chronic pancreatitis, or both, pancreatic obstruction, GERD, gallstones, or abnormal imaging later found to be benign), to monitor pancreatic cancer recurrence, or to address other clinical indications. As described in detail below, exemplary embodiments include the biomarkers provided in Table 1, Col. 2, which were identified using a multiplex aptamer-based assay that is described generally in Example 1 and more specifically in Example 2.

Table 1, Col. 2 sets forth the findings obtained from analyzing hundreds of individual blood samples from pancreatic cancer cases and hundreds of equivalent individual blood samples from GI and normal controls. The GI and normal control group was designed to match the populations in which a pancreatic cancer diagnostic test can have the most benefit, including asymptomatic individuals and symptomatic individuals. The normal control group represents asymptomatic individuals at high risk for pancreatic cancer. High-risk factors for pancreatic cancer include a family history of pancreatic cancer, obesity, smoking, diabetes, cystic fibrosis, chronic or hereditary pancreatitis, BRCA mutation carrier status, p16 mutation, and Peutz-Jeghers syndrome (Brand E et al., Gut 2007;56:1460). The GI control group included individuals with non-specific abdominal symptoms such as acute or chronic pancreatitis (or both), pancreatic obstruction, GERD, gallstones, or abnormal imaging later found to be benign. Samples from the normal controls were combined with the GI controls in order to discover biomarkers useful both for screening high-risk asymptomatic individuals and for differential diagnosis in symptomatic individuals.

The potential biomarkers were measured in individual samples rather than in pooled disease and control blood; this allowed a better understanding of the individual and group variations in the phenotypes associated with the presence and absence of disease (in this case, pancreatic cancer). Since 823 protein measurements were made on each sample, and several hundred samples from each of the disease and control populations were measured individually, Table 1, Col. 2 resulted from the analysis of a very large data set. The measurements were analyzed using the methods described in the section "Classification of Biomarkers and Calculation of Disease Scores" herein. Table 1, Col. 2 lists the 65 biomarkers found to be useful in distinguishing samples obtained from individuals with pancreatic cancer from samples obtained from GI and normal controls. The GI controls included individuals with acute or chronic pancreatitis (or both), pancreatic obstruction, GERD, gallstones, or abnormal imaging later found to be benign.

While certain of the described pancreatic cancer biomarkers are useful alone for detecting and diagnosing pancreatic cancer, methods are also described herein for grouping multiple subsets of the pancreatic cancer biomarkers, where each grouping or subset selection is useful as a panel of three or more biomarkers, interchangeably referred to herein as a "biomarker panel" and a "panel." Thus, various embodiments of the present application provide combinations comprising N biomarkers, where N is at least two biomarkers. In other embodiments, N is selected from 2-65 biomarkers.

In other embodiments, N is selected to be any number from 2-7, 2-10, 2-15, 2-20, 2-25, 2-30, 2-35, 2-40, 2-45, 2-50, 2-55, or 2-65. In other embodiments, N is selected to be any number from 3-7, 3-10, 3-15, 3-20, 3-25, 3-30, 3-35, 3-40, 3-45, 3-50, 3-55, or 3-65. In other embodiments, N is selected to be any number from 4-7, 4-10, 4-15, 4-20, 4-25, 4-30, 4-35, 4-40, 4-45, 4-50, 4-55, or 4-65. In other embodiments, N is selected to be any number from 5-7, 5-10, 5-15, 5-20, 5-25, 5-30, 5-35, 5-40, 5-45, 5-50, 5-55, or 5-65. In other embodiments, N is selected to be any number from 6-10, 6-15, 6-20, 6-25, 6-30, 6-35, 6-40, 6-45, 6-50, 6-55, or 6-65. In other embodiments, N is selected to be any number from 7-10, 7-15, 7-20, 7-25, 7-30, 7-35, 7-40, 7-45, 7-50, 7-55, or 7-65. In other embodiments, N is selected to be any number from 8-10, 8-15, 8-20, 8-25, 8-30, 8-35, 8-40, 8-45, 8-50, 8-55, or 8-65. In other embodiments, N is selected to be any number from 9-15, 9-20, 9-25, 9-30, 9-35, 9-40, 9-45, 9-50, 9-55, or 9-65. In other embodiments, N is selected to be any number from 10-15, 10-20, 10-25, 10-30, 10-35, 10-40, 10-45, 10-50, 10-55, or 10-65. It will be appreciated that N can be selected to encompass similar, but higher order, ranges.
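To give a sense of scale for these ranges, the short sketch below counts how many distinct panels of a given size N can be drawn from the 65 biomarkers of Table 1, Col. 2. It is an illustration only: the function name `panel_count` is hypothetical, and the count assumes that every subset of size N is a candidate panel, which says nothing about how well any particular panel performs.

```python
import math

# Number of biomarkers listed in Table 1, Col. 2 (from the text above).
TOTAL_BIOMARKERS = 65

def panel_count(n, total=TOTAL_BIOMARKERS):
    """Number of distinct unordered panels of n biomarkers drawn from `total`."""
    return math.comb(total, n)

for n in (2, 3):
    print(f"N={n}: {panel_count(n)} possible panels")
# N=2: 2080 possible panels
# N=3: 43680 possible panels
```

Because the count grows rapidly with N, an exhaustive evaluation of every panel quickly becomes impractical for larger panels.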

In one embodiment, the number of biomarkers useful for a biomarker subset or panel is based on the sensitivity and specificity values for the particular combination of biomarker values. The terms "sensitivity" and "specificity" are used herein with respect to the ability to correctly classify an individual, based on one or more biomarker values detected in the individual's biological sample, as having or not having pancreatic cancer. "Sensitivity" indicates the performance of the biomarker(s) with respect to correctly classifying individuals who have pancreatic cancer. "Specificity" indicates the performance of the biomarker(s) with respect to correctly classifying individuals who do not have pancreatic cancer. For example, 85% specificity and 90% sensitivity for a panel of markers used to test a set of control samples and pancreatic cancer samples indicates that 85% of the control samples were correctly classified as control samples by the panel, and 90% of the pancreatic cancer samples were correctly classified as pancreatic cancer samples by the panel. The desired or preferred minimum values can be determined as described in Example 3. Representative panels are set forth in Tables 4-11, which set forth a series of 100 different panels of 3-10 biomarkers, with the indicated levels of specificity and sensitivity for each panel. The total number of occurrences of each marker in each of these panels is indicated at the bottom of each table.
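The 85% specificity and 90% sensitivity example above can be reproduced with a minimal sketch. The label encoding (1 for pancreatic cancer, 0 for control), the function name, and the sample counts of 100 per group are assumptions chosen only to match the percentages in the text; they are not data from the studies described herein.

```python
def sensitivity_specificity(true_labels, predicted_labels):
    """Return (sensitivity, specificity) for labels coded 1 = cancer, 0 = control."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cancer called cancer
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cancer called control
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # control called control
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # control called cancer
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical set: 100 cancer samples followed by 100 control samples.
true_labels = [1] * 100 + [0] * 100
# 90 cancers and 85 controls are classified correctly.
predicted_labels = [1] * 90 + [0] * 10 + [0] * 85 + [1] * 15

sensitivity, specificity = sensitivity_specificity(true_labels, predicted_labels)
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
# sensitivity=0.90, specificity=0.85
```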

In one aspect, pancreatic cancer is detected or diagnosed in an individual by conducting an assay on a biological sample from the individual and detecting biomarker values that each correspond to at least one of the biomarkers CTSB, C5a, or C5 and at least N additional biomarkers selected from the list of biomarkers in Table 1, Col. 2, wherein N equals 2, 3, 4, 5, 6, 7, 8, or 9. In a further aspect, pancreatic cancer is detected or diagnosed in an individual by conducting an assay on a biological sample from the individual and detecting biomarker values that each correspond to the biomarkers CTSB, C5a, or C5 and one of at least N additional biomarkers selected from the list of biomarkers in Table 1, Col. 2, wherein N equals 1, 2, 3, 4, 5, 6, or 7. In a further aspect, pancreatic cancer is detected or diagnosed in an individual by conducting an assay on a biological sample from the individual and detecting biomarker values that each correspond to the biomarker CTSB and one of at least N additional biomarkers selected from the list of biomarkers in Table 1, Col. 2, wherein N equals 2, 3, 4, 5, 6, 7, 8, or 9. In a further aspect, pancreatic cancer is detected or diagnosed in an individual by conducting an assay on a biological sample from the individual and detecting biomarker values that each correspond to the biomarker C5a and one of at least N additional biomarkers selected from the list of biomarkers in Table 1, Col. 2, wherein N equals 2, 3, 4, 5, 6, 7, 8, or 9. In a further aspect, pancreatic cancer is detected or diagnosed in an individual by conducting an assay on a biological sample from the individual and detecting biomarker values that each correspond to the biomarker C5 and one of at least N additional biomarkers selected from the list of biomarkers in Table 1, Col. 2, wherein N equals 2, 3, 4, 5, 6, 7, 8, or 9.

The pancreatic cancer biomarkers identified herein represent a relatively large number of choices for subsets or panels of biomarkers that can be used to effectively detect or diagnose pancreatic cancer. Selection of the desired number of such biomarkers depends on the specific combination of biomarkers chosen. It is important to remember that panels of biomarkers for detecting or diagnosing pancreatic cancer may also include biomarkers not found in Table 1, Col. 2, and that the inclusion of additional biomarkers not found in Table 1, Col. 2 may reduce the number of biomarkers in the particular subset or panel that is selected from Table 1, Col. 2. The number of biomarkers from Table 1, Col. 2 used in a subset or panel may also be reduced if additional biomedical information is used in conjunction with the biomarker values to establish acceptable sensitivity and specificity values for a given assay.

Another factor that can affect the number of biomarkers to be used in a subset or panel of biomarkers is the procedure used to obtain biological samples from individuals who are being diagnosed for pancreatic cancer. In a carefully controlled sample procurement environment, the number of biomarkers necessary to meet desired sensitivity and specificity values will be lower than in a situation where there can be more variation in sample collection, handling, and storage. In developing the list of biomarkers set forth in Table 1, Col. 2, multiple sample collection sites were used to collect data for classifier training. This provides more robust biomarkers that are less sensitive to variations in sample collection, handling, and storage, but it can also require that the number of biomarkers in a subset or panel be larger than if the training data were all obtained under very similar conditions.

One aspect of the present application can be described generally with reference to FIGS. 1A and 1B. A biological sample is obtained from an individual or individuals of interest. The biological sample is then assayed to detect the presence of one or more (N) biomarkers of interest and to determine a biomarker value for each of the N biomarkers (referred to in FIG. 1B as marker RFU). Once a biomarker has been detected and a biomarker value assigned, each marker is scored or classified as described in detail herein. The marker scores are then combined to provide a total diagnostic score, which indicates the likelihood that the individual from whom the sample was obtained has pancreatic cancer.
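The scoring-and-combining step just described can be sketched for a naive Bayes classifier of the kind referred to in the figure descriptions, where each marker's log-transformed RFU is modeled by a normal distribution in the control group and in the disease group. Everything numeric below is hypothetical: the means, standard deviations, prior, and sample values are invented for illustration and are not fitted values from this application, and the function names are assumptions.

```python
import math

def gaussian_log_pdf(x, mean, sd):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - (x - mean) ** 2 / (2 * sd ** 2)

def naive_bayes_log_odds(log_rfu, params, prior_disease=0.5):
    """Sum per-marker log-likelihood ratios (disease vs. control) into one score.

    log_rfu: dict marker -> log-transformed RFU for one sample
    params:  dict marker -> ((control_mean, control_sd), (disease_mean, disease_sd))
    """
    score = math.log(prior_disease / (1 - prior_disease))
    for marker, x in log_rfu.items():
        (mu_c, sd_c), (mu_d, sd_d) = params[marker]
        # Naive Bayes treats markers as independent, so the marker scores add.
        score += gaussian_log_pdf(x, mu_d, sd_d) - gaussian_log_pdf(x, mu_c, sd_c)
    return score

def probability_from_log_odds(log_odds):
    """Convert a log-odds score to a probability of disease."""
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical class-conditional parameters for two markers named in the text.
params = {
    "CTSB": ((8.0, 0.5), (9.0, 0.5)),
    "HAMP": ((7.0, 0.4), (7.6, 0.4)),
}
sample = {"CTSB": 8.9, "HAMP": 7.5}  # hypothetical log RFU values
total_score = naive_bayes_log_odds(sample, params)
print(f"log-odds={total_score:.3f}, "
      f"P(disease)={probability_from_log_odds(total_score):.3f}")
```

A positive total score favors the disease class and a negative score favors the control class; the threshold applied to the score would be chosen to meet the desired sensitivity and specificity.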

"Biological sample," "sample," and "test sample" are used interchangeably herein to refer to any material, biological fluid, tissue, or cell obtained or otherwise derived from an individual. This includes blood (including whole blood, leukocytes, peripheral blood mononuclear cells, buffy coat, plasma, and serum), sputum, tears, mucus, nasal washes, nasal aspirate, breath, urine, semen, saliva, peritoneal washings, ascites, cystic fluid, meningeal fluid, amniotic fluid, glandular fluid, pancreatic fluid, lymph fluid, pleural fluid, nipple aspirate, bronchial aspirate, bronchial brushing, synovial fluid, joint aspirate, organ secretions, cells, a cellular extract, and cerebrospinal fluid. This also includes experimentally separated fractions of all of the preceding. For example, a blood sample can be fractionated into serum, plasma, or fractions containing particular types of blood cells, such as red blood cells or white blood cells (leukocytes). If desired, a sample can be a combination of samples from an individual, such as a combination of a tissue and a fluid sample. The term "biological sample" also includes materials containing homogenized solid material, such as from a stool sample, a tissue sample, or a tissue biopsy. The term "biological sample" also includes materials derived from a tissue culture or a cell culture. Any suitable method for obtaining a biological sample can be employed; exemplary methods include phlebotomy, swab (e.g., buccal swab), and a fine needle aspirate biopsy procedure. Exemplary tissues susceptible to fine needle aspiration include lymph node, lung, lung washes, BAL (bronchoalveolar lavage), thyroid, breast, pancreas, and liver. Samples can also be collected by microdissection (e.g., laser capture microdissection (LCM) or laser microdissection (LMD)), bladder wash, smear (e.g., a Pap smear), or ductal lavage. A "biological sample" obtained or derived from an individual includes any such sample that has been processed in any suitable manner after being obtained from the individual.

Further, it should be realized that a biological sample can be derived by taking biological samples from a number of individuals and pooling them, or by pooling an aliquot of each individual's biological sample. The pooled sample can be treated as a sample from a single individual, and if the presence of cancer is established in the pooled sample, then each individual biological sample can be re-tested to determine which individual or individuals have pancreatic cancer.
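The pool-then-retest logic described above can be sketched as follows. This is a simplified illustration under stated assumptions: `assay` is a hypothetical callable standing in for whatever test is run on one or more combined samples, and the toy assay used here reports a positive pool whenever any member is positive, ignoring dilution effects that a real pooled measurement would have to account for.

```python
def pooled_screen(samples, assay):
    """Test a pool first; re-test individuals only if the pool is positive.

    samples: dict of individual id -> sample
    assay:   callable taking a list of samples, returning True if cancer is indicated
    Returns (ids found positive, number of assays run).
    """
    assays_run = 1
    if not assay(list(samples.values())):
        return [], assays_run  # negative pool: no individual re-testing needed
    positives = []
    for individual_id, sample in samples.items():
        assays_run += 1
        if assay([sample]):
            positives.append(individual_id)
    return positives, assays_run

def toy_assay(pool):
    """Hypothetical assay: each sample is simply a boolean 'signal present' flag."""
    return any(pool)

print(pooled_screen({"A": False, "B": True, "C": False}, toy_assay))
# (['B'], 4)
print(pooled_screen({"A": False, "B": False, "C": False}, toy_assay))
# ([], 1)
```

When most pools are negative, this approach runs fewer assays than testing every individual separately; when a pool is positive, it costs one extra assay.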

For purposes of this specification, the phrase "data attributed to a biological sample from an individual" means that the data in some form derived from, or were generated using, the biological sample of the individual. The data may have been reformatted, revised, or mathematically altered to some degree after having been generated, such as by conversion from units in one measurement system to units in another measurement system; but the data are understood to have been derived from, or generated using, the biological sample.

"Target," "target molecule," and "analyte" are used interchangeably herein to refer to any molecule of interest that may be present in a biological sample. A "molecule of interest" includes any minor variation of a particular molecule, such as, in the case of a protein, minor variations in amino acid sequence, disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, or any other manipulation or modification, such as conjugation with a labeling component, that does not substantially alter the identity of the molecule. A "target molecule," "target," or "analyte" is a set of copies of one type or species of molecule or multi-molecular structure. "Target molecules," "targets," and "analytes" refer to more than one such set of molecules. Exemplary target molecules include proteins, polypeptides, nucleic acids, carbohydrates, lipids, polysaccharides, glycoproteins, hormones, receptors, antigens, antibodies, affybodies, antibody mimics, viruses, pathogens, toxic substances, substrates, metabolites, transition state analogs, cofactors, inhibitors, drugs, dyes, nutrients, growth factors, cells, tissues, and any fragment or portion of any of the foregoing.

As used herein, "polypeptide," "peptide," and "protein" are used interchangeably to refer to polymers of amino acids of any length. The polymer may be linear or branched, it may comprise modified amino acids, and it may be interrupted by non-amino acids. The terms also encompass an amino acid polymer that has been modified naturally or by intervention; for example, disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, or any other manipulation or modification, such as conjugation with a labeling component. Also included within the definition are, for example, polypeptides containing one or more analogs of an amino acid (including, for example, unnatural amino acids), as well as other modifications known in the art. Polypeptides can be single chains or associated chains. Also included within the definition are preproteins and intact mature proteins; peptides or polypeptides derived from a mature protein; fragments of a protein; splice variants; recombinant forms of a protein; protein variants with amino acid modifications, deletions, or substitutions; digests; and post-translational modifications, such as glycosylation, acetylation, phosphorylation, and the like.

As used herein, "marker" and "biomarker" are used interchangeably to refer to a target molecule that indicates or is a sign of a normal or abnormal process in an individual, or of a disease or other condition in an individual. More specifically, a "marker" or "biomarker" is an anatomic, physiologic, biochemical, or molecular parameter associated with the presence of a specific physiological state or process, whether normal or abnormal, and, if abnormal, whether chronic or acute. Biomarkers are detectable and measurable by a variety of methods, including laboratory assays and medical imaging. When a biomarker is a protein, it is also possible to use the expression of the corresponding gene as a surrogate measure of the amount, or the presence or absence, of the corresponding protein biomarker in a biological sample, or of the methylation state of the gene encoding the biomarker or of proteins that control expression of the biomarker.

As used herein, "biomarker value," "value," "biomarker level," and "level" are used interchangeably to refer to a measurement that is made using any analytical method for detecting the biomarker in a biological sample and that indicates the presence, absence, absolute amount or concentration, relative amount or concentration, titer, level, expression level, ratio of measured levels, or the like, of, for, or corresponding to the biomarker in the biological sample. The exact nature of the "value" or "level" depends on the specific design and components of the particular analytical method employed to detect the biomarker.

When a biomarker indicates or is a sign of an abnormal process or a disease or other condition in an individual, that biomarker is generally described as being either over-expressed or under-expressed as compared to an expression level or value of the biomarker that indicates or is a sign of a normal process or an absence of a disease or other condition in an individual. "Up-regulation," "up-regulated," "over-expression," "over-expressed," and any variations thereof are used interchangeably to refer to a value or level of a biomarker in a biological sample that is greater than the value or level (or range of values or levels) of the biomarker typically detected in similar biological samples from healthy or normal individuals. The terms may also refer to a value or level of a biomarker in a biological sample that is greater than the value or level (or range of values or levels) of the biomarker detected at a different stage of a particular disease.

"Down-regulation," "down-regulated," "under-expression," "under-expressed," and any variations thereof are used interchangeably to refer to a value or level of a biomarker in a biological sample that is less than the value or level (or range of values or levels) of the biomarker typically detected in similar biological samples from healthy or normal individuals. The terms may also refer to a value or level of a biomarker in a biological sample that is less than the value or level (or range of values or levels) of the biomarker detected at a different stage of a particular disease.

Further, a biomarker that is either over-expressed or under-expressed can also be referred to as being "differentially expressed" or as having a "differential level" or "differential value" as compared to a "normal" expression level or value of the biomarker, that is, a level or value that indicates or is a sign of a normal process or an absence of a disease or other condition in an individual. Thus, "differential expression" of a biomarker can also be referred to as a variation from a "normal" expression level of the biomarker.
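The definitions of over-expression, under-expression, and differential expression above amount to comparing a measured biomarker value with the range of values typically seen in healthy or normal individuals. A minimal sketch follows; the function name and the reference range bounds are hypothetical, and how a reference range would actually be established is not specified here.

```python
def expression_status(value, reference_low, reference_high):
    """Classify a biomarker value against a reference range from normal individuals."""
    if value > reference_high:
        return "over-expressed"
    if value < reference_low:
        return "under-expressed"
    return "within normal range"

def is_differentially_expressed(value, reference_low, reference_high):
    """A value is differentially expressed if it is either over- or under-expressed."""
    return expression_status(value, reference_low, reference_high) != "within normal range"

# Hypothetical reference range for one biomarker, in arbitrary units.
print(expression_status(12.5, 5.0, 10.0))            # over-expressed
print(expression_status(3.2, 5.0, 10.0))             # under-expressed
print(is_differentially_expressed(7.0, 5.0, 10.0))   # False
```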

术语“不同的基因表达”和“差异表达”可互换使用,指在患有指定疾病的对象中基因(或其相应的蛋白表达产物)的表达被激活至相对于其在正常或对照对象中的表达较高或较低的水平。该术语还包括基因(或其相应的蛋白表达产物)的表达在相同疾病的不同阶段被激活至较高或较低水平。还应当理解差异表达的基因可以在核酸水平或蛋白水平激活或抑制,或者可以进行可变剪接以获得不同的多肽产物。这样的差异可以通过许多改变来证实,包括多肽的mRNA水平、表面表达、分泌或其他分配(partitioning)。不同的基因表达可以包括比较两个或更多个基因或者它们的基因产物之间的表达;或者比较两个或更多个基因或者它们的基因产物之间的表达的比率;或者甚至比较相同基因的两种不同加工的产物,其在正常对象与患病对象之间或者在相同疾病的不同阶段之间是不同的。差异表达包括在例如正常和患病细胞或者经历不同疾病事件或疾病阶段的细胞中的基因或其表达产物在时间或细胞表达模式中的定量以及定性的差异。The terms "differential gene expression" and "differential expression" are used interchangeably and refer to the expression of a gene (or its corresponding protein expression product) being activated to a higher or lower level relative to its expression in a normal or control subject in a subject suffering from a specified disease. The term also includes the expression of a gene (or its corresponding protein expression product) being activated to a higher or lower level at different stages of the same disease. It should also be understood that differentially expressed genes can be activated or inhibited at the nucleic acid level or protein level, or can undergo alternative splicing to obtain different polypeptide products. Such differences can be confirmed by many changes, including mRNA levels, surface expression, secretion or other partitioning of the polypeptide. Differential gene expression can include comparing the expression between two or more genes or their gene products; or comparing the ratio of expression between two or more genes or their gene products; or even comparing two differently processed products of the same gene, which are different between normal subjects and diseased subjects or between different stages of the same disease. Differential expression includes quantitative and qualitative differences in the temporal or cellular expression pattern of genes or their expression products, for example, in normal and diseased cells or cells experiencing different disease events or disease stages.

如本文所用,“个体”指测试对象或患者。个体可以是哺乳动物或非哺乳动物。在许多实施方案中,个体是哺乳动物。哺乳动物个体可以是人或非人。在许多实施方案中,个体是人。健康或正常个体是其中通过常规诊断方法不可检测出所关注的疾病或疾病状况(包括例如胰腺疾病、胰腺相关疾病或其他胰腺疾病状况)的个体。As used herein, "individual" refers to a test subject or patient. The individual can be a mammal or a non-mammal. In many embodiments, the individual is a mammal. A mammalian individual can be a human or a non-human. In many embodiments, the individual is a human. A healthy or normal individual is an individual in whom the disease or condition of interest (including, for example, pancreatic disease, pancreatic-associated disease, or another pancreatic condition) is not detectable by conventional diagnostic methods.

“诊断(Diagnose)”、“诊断(diagnosing)”、“诊断(diagnosis)”及其变体指基于个体相关的一种或多种迹象、症状、数据或其他信息对所述个体的健康状态或疾病状况的检测、确定或识别。个体的健康状态可以诊断为健康/正常(即诊断为不存在疾病或疾病状况)或者诊断为患病/异常(即诊断为存在疾病或疾病状况或者对疾病或疾病状况的特征的评价)。对于特定疾病或疾病状况,术语“诊断(diagnose)”、“诊断(diagnosing)”、“诊断(diagnosis)”等涵盖对疾病的初始检测;对疾病的表征或分类;疾病的进展、缓解或复发的检测;以及在给予个体治疗或疗法后疾病应答的检测。胰腺癌的诊断包括区分患有癌症与不患有癌症的个体。其还包括区分GI和正常对照与胰腺癌。"Diagnose," "diagnosing," "diagnosis," and variations thereof refer to the detection, determination, or identification of a health or disease state of an individual based on one or more signs, symptoms, data, or other information associated with the individual. The health state of an individual can be diagnosed as healthy/normal (i.e., diagnosed as the absence of a disease or disease state) or diagnosed as diseased/abnormal (i.e., diagnosed as the presence of a disease or disease state or an evaluation of the characteristics of a disease or disease state). For a particular disease or disease state, the terms "diagnose," "diagnosing," "diagnosis," and the like encompass the initial detection of the disease; the characterization or classification of the disease; the detection of progression, remission, or recurrence of the disease; and the detection of disease response after treatment or therapy is administered to an individual. Diagnosis of pancreatic cancer includes distinguishing individuals with cancer from those without cancer. It also includes distinguishing GI and normal controls from pancreatic cancer.

“预后(Prognose)”、“预后(prognosing)”、“预后(prognosis)”及其变体指预测患有疾病或疾病状况的个体中所述疾病或疾病状况的未来进程(如预测患者存活),并且这类术语涵盖在给予个体治疗或疗法后评价疾病的应答。"Prognose," "prognosing," "prognosis," and variations thereof refer to predicting the future course of a disease or condition in an individual having the disease or condition (e.g., predicting patient survival), and such terms encompass evaluating the response of a disease after administration of treatment or therapy to an individual.

“评价(Evaluate)”、“评价(evaluating)”、“评价(evaluation)”及其变体涵盖“诊断”和“预后”,并且还涵盖对不患病个体的疾病或疾病状况的未来进程的确定或预测以及确定或预测在表面上已经治愈疾病的个体中所述疾病或疾病状况复发的似然性。术语“评价”还包括评价个体对疗法的应答,例如预测个体是否可能对治疗剂顺利地应答,或者不大可能对治疗剂应答(或者会例如经历毒性或其他不期望的副作用);选择给予个体的治疗剂;或者监测或确定个体对已经给予该个体的疗法的应答。因此,“评价”胰腺癌可以包括例如以下任何方面:预后个体中胰腺癌的未来进程;预测表面上已经治愈胰腺癌的个体中胰腺癌的复发;或者确定或预测个体对于胰腺癌治疗的应答;或者基于确定源自个体生物学样品的生物标记值来选择给予该个体的胰腺癌治疗。"Evaluate," "evaluating," "evaluation," and variations thereof encompass "diagnosis" and "prognosis," and also encompass determining or predicting the future course of a disease or condition in an individual who is not ill and determining or predicting the likelihood of recurrence of the disease or condition in an individual who has apparently been cured of the disease. The term "evaluate" also includes evaluating an individual's response to a therapy, e.g., predicting whether an individual is likely to respond favorably to a therapeutic agent, or is unlikely to respond to a therapeutic agent (or will, for example, experience toxicity or other undesirable side effects); selecting a therapeutic agent to administer to an individual; or monitoring or determining an individual's response to a therapy that has already been administered to the individual. Thus, "evaluating" pancreatic cancer can include, for example, any of the following: prognosing the future course of pancreatic cancer in an individual; predicting the recurrence of pancreatic cancer in an individual who has apparently been cured of pancreatic cancer; or determining or predicting an individual's response to a pancreatic cancer therapy; or selecting a pancreatic cancer therapy to administer to an individual based on determining a biomarker value derived from a biological sample from the individual.

任何如下实例均可以称作“诊断”或“评价”胰腺癌:最初检测胰腺癌的存在或不存在;确定胰腺癌的具体阶段、类型或亚型或者其他分类或特征;确定可疑包块是否为良性病变或恶性胰腺肿瘤;或者检测/监测胰腺癌进展(如监测肿瘤生长或转移扩散)、缓解或复发。Any of the following instances may be referred to as "diagnosing" or "evaluating" pancreatic cancer: initially detecting the presence or absence of pancreatic cancer; determining the specific stage, type, or subtype or other classification or characteristics of pancreatic cancer; determining whether a suspicious mass is a benign lesion or a malignant pancreatic tumor; or detecting/monitoring pancreatic cancer progression (e.g., monitoring tumor growth or metastatic spread), remission, or recurrence.

如本文所用,“额外的生物医学信息”指除了使用本文所述的任何生物标记之外的对个体所做的与癌症风险或更具体地与胰腺癌风险相关的一个或多个评价。“额外的生物医学信息”包括任何以下方面:个体的物理描述(physical descriptor),包括通过任何具有三维重建的对比增强的多层(多检测器)螺旋计算机断层扫描(CT)、经皮或内镜超声(US或EUS)、内镜逆行胰胆管造影(ERCP)、磁共振成像(MRI)、MR胰胆管造影(MRCP)或者腹部超声观察到的胰腺包块;个体的身高和/或体重;体重变化;个体的种族;职业史;胰腺癌(或其他癌症)的家族史;个体或家族成员中与胰腺癌(或其他癌症)的高风险相关的遗传标记的存在;胰腺包块或其他腹部包块的存在或不存在;包块的大小;包块的位置;包块和相关腹部区域的形态学(例如通过成像观察的);临床症状如腹部疼痛、体重减轻、厌食、早饱、腹泻、或脂肪泻、黄疸、最近发生的非典型糖尿病、最近但原因不明的血栓静脉炎历史、或者以前的胰腺炎攻击等;基因表达值;个体的物理描述,包括通过放射成像观察到的物理描述;个体的身高和/或体重;个体的性别;个体的种族;吸烟史;酒精使用历史;职业史;暴露于已知的致癌物(如暴露于任何石棉、氡气、化学品、来自火的烟以及空气污染,这可以包括来自静止或移动来源的排放物,如工业/工厂或汽车/海运/飞机排放物);暴露于二手烟;以及胰腺癌或其他癌症的家族史。生物标记水平的测试联合任何额外的生物医学信息的评价,包括其他实验室测试(例如,CA19-9、血清胆红素浓度、碱性磷酸酶活性、贫血的存在)与单独测试生物标记或单独评价额外的生物医学信息的任何特定项目(例如,单独的超声成像)相比,可以例如改善检测胰腺癌(或其他胰腺癌相关用途)的灵敏性、特异性和/或AUC。额外的生物医学信息可以通过使用本领域已知的常规技术得自个体,如通过使用常规患者问卷调查或健康史问卷调查等得自个体自身,或者得自医学从业人员等。生物标记水平的测试联合任何额外的生物医学信息的评价与单独测试生物标记或单独评价额外的生物医学信息的任何特定项目(如单独的CT成像)相比,可以例如改善检测胰腺癌(或其他胰腺癌相关用途)的灵敏性、特异性和/或AUC。As used herein, "additional biomedical information" refers to one or more assessments of an individual's risk for cancer, or more specifically, pancreatic cancer, made in addition to the use of any of the biomarkers described herein. 
"Additional biomedical information" includes any of the following: a physical description of the individual, including a pancreatic mass as observed by any contrast-enhanced multislice (multi-detector) spiral computed tomography (CT) with three-dimensional reconstruction, percutaneous or endoscopic ultrasound (US or EUS), endoscopic retrograde cholangiopancreatography (ERCP), magnetic resonance imaging (MRI), MR cholangiopancreatography (MRCP), or abdominal ultrasound; the individual's height and/or weight; weight change; the individual's race; occupational history; family history of pancreatic cancer (or other cancers); the presence of genetic markers in the individual or family members that are associated with an increased risk for pancreatic cancer (or other cancers); the presence or absence of a pancreatic or other abdominal mass; the size of the mass; the location of the mass; the morphology of the mass and associated abdominal area (e.g., by imaging observations); clinical symptoms such as abdominal pain, weight loss, anorexia, early satiety, diarrhea, or steatorrhea, jaundice, recent atypical diabetes mellitus, recent but unexplained history of thrombophlebitis, or previous attacks of pancreatitis; gene expression values; physical description of the individual, including physical descriptions observed by radiographic imaging; height and/or weight of the individual; sex of the individual; race of the individual; smoking history; alcohol use history; occupational history; exposure to known carcinogens (such as exposure to any asbestos, radon, chemicals, smoke from fires, and air pollution, which can include emissions from stationary or mobile sources, such as industrial/factory or automotive/shipping/aircraft emissions); exposure to secondhand smoke; and family history of pancreatic or other cancers. 
Testing of biomarker levels in conjunction with evaluation of any additional biomedical information, including other laboratory tests (e.g., CA19-9, serum bilirubin concentration, alkaline phosphatase activity, presence of anemia) can, for example, improve the sensitivity, specificity, and/or AUC for detecting pancreatic cancer (or other pancreatic cancer-related uses) compared to testing of the biomarker alone or evaluating any specific item of additional biomedical information alone (e.g., ultrasound imaging alone). The additional biomedical information can be obtained from the individual using conventional techniques known in the art, such as from the individual themselves through the use of routine patient questionnaires or health history questionnaires, or from a medical practitioner, etc. Testing of biomarker levels in conjunction with evaluation of any additional biomedical information can, for example, improve the sensitivity, specificity, and/or AUC for detecting pancreatic cancer (or other pancreatic cancer-related uses) compared to testing of the biomarker alone or evaluating any specific item of additional biomedical information alone (e.g., CT imaging alone).
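The comparison described above (a biomarker test alone, an item of additional biomedical information alone, or the two combined) relies on computing sensitivity and specificity for each set of calls. A minimal sketch of that computation follows; the function name and the representation of test results as booleans are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    predicted_positive: Sequence[bool], has_cancer: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for paired test calls and true status."""
    if len(predicted_positive) != len(has_cancer):
        raise ValueError("inputs must have the same length")
    cases = sum(has_cancer)
    controls = len(has_cancer) - cases
    if cases == 0 or controls == 0:
        raise ValueError("need at least one case and one control")
    # True positives: cases called positive. True negatives: controls called negative.
    tp = sum(p and c for p, c in zip(predicted_positive, has_cancer))
    tn = sum((not p) and (not c) for p, c in zip(predicted_positive, has_cancer))
    return tp / cases, tn / controls
```

Applying the same function to the biomarker-only calls, the imaging-only calls, and the combined calls gives directly comparable figures for the three approaches.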

癌症相关抗原19-9(CA 19-9)是已知的胰腺癌的血液标记。报道的CA 19-9对胰腺癌的灵敏性和特异性分别为80-90%。然而,这些值与肿瘤大小密切相关。CA 19-9鉴定患有小型手术可切除的癌症的患者的准确性有限。CA 19-9需要Lewis血型抗原(糖基转移酶)的存在以表达。在具有Lewis阴性表型的个体(估计5-10%的群体)中,CA 19-9水平不是可用的肿瘤标记。CA 19-9的特异性也是有限的。CA 19-9经常在患有各种良性胆胰病症的患者中升高。CA 19-9升高的程度(在最初出现和手术后环境中)与长期预后有关。此外,在看来患有潜在的可切除的疾病的患者中,CA 19-9水平的数量级还可以有助于预测放射显影隐匿性转移性疾病的存在。CA 19-9水平的系列监测可用于随访潜在的根治性手术后的患者以及接受晚期疾病的化疗的患者。升高的CA 19-9水平通常在复发性疾病的放射显影出现之前,但是疾病发展的证实应当通过成像研究和/或活组织检查来进行。生物标记水平的检测联合CA 19-9与单独CA 19-9相比,可以例如改善检测胰腺癌(或其他胰腺癌相关用途)的灵敏性、特异性和/或AUC。Cancer-associated antigen 19-9 (CA 19-9) is a known blood marker for pancreatic cancer. The reported sensitivity and specificity of CA 19-9 for pancreatic cancer are 80-90%, respectively. However, these values are closely related to tumor size. The accuracy of CA 19-9 in identifying patients with small, surgically resectable cancers is limited. CA 19-9 requires the presence of Lewis blood group antigens (glycosyltransferases) for expression. In individuals with a Lewis-negative phenotype (an estimated 5-10% of the population), CA 19-9 levels are not a useful tumor marker. The specificity of CA 19-9 is also limited. CA 19-9 is often elevated in patients with various benign pancreatic and biliary conditions. The degree of CA 19-9 elevation (at initial presentation and in the postoperative setting) is associated with long-term prognosis. In addition, in patients who appear to have potentially resectable disease, the magnitude of CA 19-9 levels can also help predict the presence of radiographically occult metastatic disease. Serial monitoring of CA 19-9 levels can be used to follow up patients after potentially curative surgery and for patients receiving chemotherapy for advanced disease. Elevated CA 19-9 levels often precede the appearance of radiographic recurrent disease, but confirmation of disease progression should be performed by imaging studies and/or biopsy. 
Detection of biomarker levels in combination with CA 19-9 can, for example, improve sensitivity, specificity, and/or AUC for detecting pancreatic cancer (or other pancreatic cancer-related uses) compared to CA 19-9 alone.

术语“曲线下面积”或“AUC”指接受者操作特征(ROC)曲线下的面积,这两个术语均为本领域所熟知。AUC测量可用于比较完整数据范围内的分类器的精确性。具有较大AUC的分类器具有较大的能力来正确分类两个所关注的组(如胰腺癌样品与正常或对照样品)之间的未知情况。ROC曲线可用于对特定特征的性能作图(如本文描述的任何生物标记和/或任何额外的生物医学信息项目),以在两个群体之间进行区分(如患有胰腺癌的病例与无胰腺癌的对照)。通常,整个群体(如病例与对照)的特征数据基于单个特征的值递增分类。然后,对于该特征的每个值,计算数据的真阳性率和假阳性率。真阳性率通过计数高于该特征值的病例数,然后除以病例总数来确定。假阳性率通过计数高于该特征值的对照数,然后除以对照总数来确定。尽管这个定义指其中特征在病例中与在对照中相比升高的情况,但是这个定义还适用于其中特征在病例中与在对照中相比降低的情况(在这种情况下,计数低于该特征值的样品)。ROC曲线可以对单个特征以及其他单个输出产生,例如两个或更多个特征的组合可以是数学组合(如加、减、乘等)以提供单个的和值,并且该单个的和值可以在ROC曲线中绘制。此外,其中组合产生单个输出值的多个特征的任意组合可以在ROC曲线中绘制。这些特征的组合可以包括测试。ROC曲线是测试的真阳性率(灵敏性)对测试的假阳性率(1-特异性)的作图。The term "area under the curve" or "AUC" refers to the area under the receiver operating characteristic (ROC) curve, both of which are well known in the art. AUC measures are useful for comparing the accuracy of classifiers across the complete data range. Classifiers with a greater AUC have a greater capacity to correctly classify unknowns between two groups of interest (e.g., pancreatic cancer samples and normal or control samples). ROC curves are useful for plotting the performance of a particular feature (e.g., any of the biomarkers described herein and/or any item of additional biomedical information) in distinguishing between two populations (e.g., cases having pancreatic cancer and controls without pancreatic cancer). Typically, the feature data across the entire population (e.g., the cases and controls) are sorted in ascending order based on the value of a single feature. Then, for each value of that feature, the true positive rate and false positive rate for the data are calculated. The true positive rate is determined by counting the number of cases above the value for that feature and then dividing by the total number of cases. The false positive rate is determined by counting the number of controls above the value for that feature and then dividing by the total number of controls. Although this definition refers to scenarios in which a feature is elevated in cases compared to controls, it also applies to scenarios in which a feature is lower in cases compared to controls (in that scenario, samples below the value for that feature are counted). ROC curves can be generated for a single feature as well as for other single outputs; for example, a combination of two or more features can be mathematically combined (e.g., added, subtracted, multiplied, etc.) to provide a single sum value, and this single sum value can be plotted in a ROC curve. Additionally, any combination of multiple features, in which the combination yields a single output value, can be plotted in a ROC curve. These combinations of features may constitute a test. The ROC curve is the plot of the true positive rate (sensitivity) of a test against the false positive rate (1-specificity) of the test.
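The ROC construction described in this paragraph can be sketched in a few lines of code. The example below is an illustration only: the function names `roc_points` and `auc`, the `higher_in_cases` flag, and the use of the trapezoidal rule to compute the area are choices made for this sketch rather than details taken from the text.

```python
from typing import List, Sequence, Tuple


def roc_points(
    cases: Sequence[float],
    controls: Sequence[float],
    higher_in_cases: bool = True,
) -> List[Tuple[float, float]]:
    """Return sorted (false positive rate, true positive rate) pairs."""
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    # For a feature that is lower in cases, negate the values so that
    # "count samples above the threshold" becomes "count samples below".
    sign = 1.0 if higher_in_cases else -1.0
    case_vals = [sign * v for v in cases]
    control_vals = [sign * v for v in controls]
    points = {(1.0, 1.0)}  # threshold below every value: all samples counted
    for threshold in set(case_vals) | set(control_vals):
        tpr = sum(v > threshold for v in case_vals) / len(case_vals)
        fpr = sum(v > threshold for v in control_vals) / len(control_vals)
        points.add((fpr, tpr))
    return sorted(points)


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

For a combination of features, the same functions apply once the combination has been reduced to a single output value per sample.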

如本文所用,“检测”或“确定”生物标记值包括使用观察和记录对应于生物标记值的信号所需的设备以及产生该信号所需的材料。在各种实施方案中,生物标记值使用任何合适的方法检测,包括荧光、化学发光、表面等离子共振、表面声波、质谱、红外线光谱、拉曼光谱、原子力显微术、扫描隧道显微术、电子化学检测方法、核磁共振、量子点等。As used herein, "detecting" or "determining" a biomarker value includes the use of equipment necessary to observe and record a signal corresponding to the biomarker value and materials necessary to generate the signal. In various embodiments, biomarker values are detected using any suitable method, including fluorescence, chemiluminescence, surface plasmon resonance, surface acoustic waves, mass spectrometry, infrared spectroscopy, Raman spectroscopy, atomic force microscopy, scanning tunneling microscopy, electrochemical detection methods, nuclear magnetic resonance, quantum dots, and the like.

“固体支持物”在本文中指具有分子可以直接或间接,通过共价键或非共价键附着的表面的任何支持物。“固体支持物”可以具有各种物理形式,可以包括例如膜;芯片(如蛋白芯片);玻片(如载玻片或盖玻片);柱;空心、固体、半固体、有孔或有腔的颗粒,例如珠;凝胶;纤维,包括光学纤维材料;基质;以及样品容器。示例性样品容器包括样品孔、管、毛细管、小瓶以及能够容纳样品的任何其他容器、沟槽或凹陷。样品容器可以包含于多样品平台上,如微量滴定板、玻片、微流体装置等。支持物可以由天然或合成材料、有机或无机材料组成。其上附着捕获剂的固体支持物的成分通常取决于附着方法(如共价附着)。其他示例性容器包括微滴和微流体控制的或大量的油/水性乳液,在其中可以进行测定和相关操作。合适的固体支持物包括例如塑料、树脂、多糖、硅石或基于硅石的材料、官能化玻璃、改性的硅、碳、金属、无机玻璃、膜、尼龙、天然纤维(例如丝、羊毛和棉)、聚合物等。包含固体支持物的材料可以包含反应基团,例如羧基、氨基或羟基以用于捕获剂的附着。聚合固体支持物可以包括如聚苯乙烯、聚对苯二甲酸乙二醇酯、聚乙酸乙烯酯、聚氯乙烯、聚乙烯吡咯烷酮、聚丙烯腈、聚甲基丙烯酸甲酯、聚四氟乙烯、丁基橡胶、苯乙烯丁二烯橡胶、天然橡胶、聚乙烯、聚丙烯、(聚)四氟乙烯、(聚)偏氟乙烯、聚碳酸酯和聚甲基戊烯。可以使用的合适的固体支持物颗粒包括例如编码的颗粒,如Luminex-型编码的颗粒、磁性颗粒以及玻璃颗粒。"Solid support" as used herein refers to any support having a surface to which molecules can be attached, directly or indirectly, by covalent or non-covalent bonds. "Solid support" can have various physical forms and can include, for example, membranes; chips (such as protein chips); slides (such as slides or coverslips); columns; hollow, solid, semi-solid, porous or cavity particles, such as beads; gels; fibers, including optical fiber materials; matrices; and sample containers. Exemplary sample containers include sample wells, tubes, capillaries, vials, and any other container, groove, or depression capable of holding a sample. Sample containers can be contained on multi-sample platforms, such as microtiter plates, slides, microfluidic devices, and the like. The support can be composed of natural or synthetic materials, organic or inorganic materials. The composition of the solid support to which the capture agent is attached generally depends on the attachment method (such as covalent attachment). Other exemplary containers include droplets and microfluidic controlled or bulk oil/aqueous emulsions in which assays and related operations can be performed. 
Suitable solid supports include, for example, plastics, resins, polysaccharides, silica or silica-based materials, functionalized glass, modified silicon, carbon, metals, inorganic glasses, membranes, nylon, natural fibers (such as silk, wool, and cotton), polymers, and the like. The material composing the solid support can include reactive groups, such as carboxyl, amino, or hydroxyl groups, for attachment of the capture agent. Polymeric solid supports can include, for example, polystyrene, polyethylene terephthalate, polyvinyl acetate, polyvinyl chloride, polyvinyl pyrrolidone, polyacrylonitrile, polymethyl methacrylate, polytetrafluoroethylene, butyl rubber, styrene butadiene rubber, natural rubber, polyethylene, polypropylene, (poly)tetrafluoroethylene, (poly)vinylidene fluoride, polycarbonate, and polymethylpentene. Suitable solid support particles that can be used include, for example, encoded particles, such as Luminex-type encoded particles, magnetic particles, and glass particles.

生物标记的示例性用途Exemplary Uses of Biomarkers

在许多示例性实施方案中,本发明提供了诊断个体的胰腺癌的方法,所述方法通过检测对应于个体的循环如血清或血浆中存在的一个或多个生物标记的一个或多个生物标记值来进行,并且通过任何数目的分析方法来进行,包括本文所述的任何分析方法。这些生物标记例如在胰腺癌个体中与在无胰腺癌个体相比差异表达。生物标记在个体中的差异表达的检测可以用于例如允许胰腺癌的早期诊断,区分良性和恶性包块(例如,在计算机断层扫描(CT)、MRI或超声上观察到的包块),监测胰腺癌复发,或者用于鉴别诊断其他临床疾病状况如急性或慢性胰腺炎(或两者)、胰腺梗阻、GERD、胆石或者后来发现为良性的异常成像。In many exemplary embodiments, the present invention provides methods for diagnosing pancreatic cancer in an individual by detecting one or more biomarker values corresponding to one or more biomarkers present in the individual's circulation, such as serum or plasma, and by any number of analytical methods, including any analytical methods described herein. These biomarkers are, for example, differentially expressed in individuals with pancreatic cancer compared to individuals without pancreatic cancer. Detection of differential expression of biomarkers in an individual can be used, for example, to allow early diagnosis of pancreatic cancer, to distinguish between benign and malignant masses (e.g., masses observed on computed tomography (CT), MRI, or ultrasound), to monitor for pancreatic cancer recurrence, or to differentially diagnose other clinical conditions such as acute or chronic pancreatitis (or both), pancreatic obstruction, GERD, gallstones, or abnormal imaging that is later found to be benign.

本文所述的任何生物标记可以用于胰腺癌的各种临床指征,包括以下任何方面:检测胰腺癌(例如在高风险个体或群体中);表征胰腺癌(例如,确定胰腺癌类型、亚型或阶段),如通过区分胰腺癌(胰腺癌)与急性或慢性胰腺炎(或两种)、胰腺梗阻、GERD、胆石或者后来发现为良性的异常成像和/或区分腺癌与其他恶性细胞类型(或者以其他方式促进组织病理学);确定胰腺包块是否为良性或恶性胰腺肿瘤;确定胰腺癌预后;监测胰腺癌进展或缓解;监测胰腺癌复发;监测转移;治疗选择;监测对治疗剂或其他治疗的应答;对个体的内镜超声(EUS)筛查分层(stratification)(例如,鉴定面临较高胰腺癌风险从而最可能受益于放射学筛查的那些个体,因此增加EUS的阳性预测值);组合生物标记测试与额外的生物医学信息如吸烟或饮酒史等,或者CA 19-9水平,指示胰腺癌的高风险的遗传标记的存在等,或者与包块大小,形态学,腹水的存在等(例如以提供与CA 19-9测试或其他生物标记测试或者与包块大小、形态学等相比具有增加的诊断性能的测定);促进腹部包块为恶性或良性的诊断;促进一旦在CT、MRI、PET或EUS上观察到腹部包块则作出临床决定(例如,如果认为腹部包块是低风险的,例如如果基于生物标记的测试是阴性的,有或无包块大小的分类,则进行重复放射扫描,或者如果认为该包块是中高风险的,例如如果基于生物标记的测试是阳性的,有或无包块大小或组织侵袭程度的分类,则考虑活组织检查);或者促进关于临床随访的决定(例如,在成像上观察到腹部包块后是否进行重复放射成像扫描、细针活组织检测或手术)。生物标记测试可以单独提高高风险个体的EUS筛查的阳性预测值(PPV)。除了联合EUS筛查之外,本文所述的生物标记还可以与用于与胰腺癌癌的任何其他成像方式如CT、MRI或PET扫描联合使用。此外,所述生物标记还可用于在通过成像方式或其他临床相关性检测胰腺癌指征之前或者在症状出现之前允许这些应用的某一些。其还包括区分急性或慢性胰腺炎(或两者)、胰腺梗阻、GERD、胆石或者后来发现为良性的异常成像与胰腺癌。Any of the biomarkers described herein can be used for various clinical indications of pancreatic cancer, including any of the following: detecting pancreatic cancer (e.g., in high-risk individuals or populations); characterizing pancreatic cancer (e.g., determining the type, subtype, or stage of pancreatic cancer), such as by distinguishing pancreatic cancer (cancer) from acute or chronic pancreatitis (or both), pancreatic obstruction, GERD, gallstones, or abnormal imaging that is later found to be benign and/or distinguishing adenocarcinoma from other malignant cell types (or otherwise facilitating histopathology); determining whether a pancreatic mass is a benign or malignant pancreatic tumor; determining pancreatic cancer prognosis; monitoring pancreatic cancer progression or remission; monitoring pancreatic cancer recurrence; monitoring metastasis; treatment selection; monitoring response to therapeutic agents or other treatments; stratification of individuals for endoscopic ultrasound (EUS) screening (e.g., identifying those individuals at higher risk for pancreatic cancer who would be most likely to 
benefit from radiologic screening, thereby increasing the positive predictive value of EUS); combining biomarker testing with additional biomedical information such as smoking or alcohol history, CA 19-9 level, or the presence of genetic markers indicating a high risk of pancreatic cancer, or with mass size, morphology, the presence of ascites, etc. (e.g., to provide an assay with increased diagnostic performance compared to CA 19-9 testing or other biomarker testing alone, or compared to mass size, morphology, etc. alone); facilitating the diagnosis of an abdominal mass as malignant or benign; facilitating clinical decision-making once an abdominal mass is observed on CT, MRI, PET, or EUS (e.g., ordering a repeat radiologic scan if the abdominal mass is deemed low risk, such as when the biomarker-based test is negative, with or without categorization of mass size, or considering biopsy if the mass is deemed medium to high risk, such as when the biomarker-based test is positive, with or without categorization of mass size or extent of tissue invasion); or facilitating decisions regarding clinical follow-up (e.g., whether to perform a repeat radiologic imaging scan, fine-needle biopsy, or surgery after an abdominal mass is observed on imaging). Biomarker testing alone can improve the positive predictive value (PPV) of EUS screening in high-risk individuals. In addition to their use in conjunction with EUS screening, the biomarkers described herein can also be used in conjunction with any other imaging modality used for pancreatic cancer, such as CT, MRI, or PET scans. Furthermore, the biomarkers can enable some of these uses before indications of pancreatic cancer are detected by imaging modalities or other clinical correlates, or before symptoms appear. These uses also include distinguishing acute or chronic pancreatitis (or both), pancreatic obstruction, GERD, gallstones, or abnormal imaging later found to be benign from pancreatic cancer.
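The point about stratifying individuals for EUS screening to increase its positive predictive value can be illustrated with the standard relationship between PPV, sensitivity, specificity, and disease prevalence in the screened group. The sketch below is a generic calculation; the function name is hypothetical, and the performance figures in the usage note are invented for illustration and are not taken from this document.

```python
def positive_predictive_value(
    sensitivity: float, specificity: float, prevalence: float
) -> float:
    """PPV = true positives / (true positives + false positives)."""
    for name, value in (
        ("sensitivity", sensitivity),
        ("specificity", specificity),
        ("prevalence", prevalence),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0.0:
        raise ValueError("no positive results are possible with these inputs")
    return true_pos / (true_pos + false_pos)
```

With an assumed sensitivity and specificity of 0.9 each, the PPV is about 0.08 when 1% of the screened group has the disease and 0.5 when 10% does. A pre-screening step that enriches the screened group for true cases therefore raises the PPV of the downstream test.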

本文所述的任何生物标记可以用于诊断胰腺癌的示例性方式是:未知患有胰腺癌的个体中一个或多个所述生物标记的差异表达可以表明该个体患有胰腺癌,从而使得可以在治疗最有效的疾病早期检测胰腺癌,也许在通过其他方式检测胰腺癌之前或者在症状出现之前检测胰腺癌。胰腺癌期间一个或多个生物标记的过表达可以指示胰腺癌的进展,如胰腺肿瘤生长和/或转移(并且因此表示不良预后);而一个或多个生物标记差异表达程度的降低(即在随后的生物标记测试中,个体中的表达水平趋向或接近“正常”表达水平)可以指示胰腺癌的缓解,如胰腺肿瘤缩小(并且因此提示良好或较好的预后)。相似地,在胰腺癌治疗期间一个或多个生物标记差异表达的程度增加(即在随后的生物标记测试中,个体中的表达水平进一步远离“正常”表达水平)可以指示胰腺癌的进展,并因此表示所述治疗是无效的;而在胰腺癌治疗期间一个或多个生物标记的差异表达降低可以指示胰腺癌的缓解,并因此表示该治疗是成功的。此外,在个体看起来已经治愈胰腺癌之后一个或多个生物标记的差异表达的增加或降低可指示胰腺癌的复发。在这种情况下,例如可以在早期对个体重新启动治疗(或者如果个体维持治疗,则修改治疗方案以增加剂量和/或频率),否则直至晚期还未检测到胰腺癌的复发。此外,个体中一个或多个生物标记的差异表达水平可以预测个体对特定治疗剂的应答。在监测胰腺癌复发或进展中,生物标记表达水平的改变可以指示需要重复成像(如重复EUS),例如来确定胰腺癌活性或确定需要改变治疗方案。An exemplary manner in which any of the biomarkers described herein can be used to diagnose pancreatic cancer is that differential expression of one or more of the biomarkers in an individual not known to have pancreatic cancer can indicate that the individual has pancreatic cancer, thereby enabling detection of pancreatic cancer at an earlier stage of the disease when treatment is most effective, perhaps before pancreatic cancer is detected by other means or before symptoms appear. Overexpression of one or more biomarkers during pancreatic cancer can indicate progression of the cancer, such as pancreatic tumor growth and/or metastasis (and thus, a poor prognosis); whereas a decrease in the degree of differential expression of one or more biomarkers (i.e., expression levels in the individual trending toward or approaching "normal" expression levels on subsequent biomarker testing) can indicate remission of the cancer, such as pancreatic tumor shrinkage (and thus, a good or better prognosis). 
Similarly, an increase in the degree of differential expression of one or more biomarkers during pancreatic cancer treatment (i.e., the expression levels in the individual move further away from "normal" expression levels in subsequent biomarker tests) can indicate progression of the pancreatic cancer and, therefore, that the treatment is ineffective; whereas a decrease in the differential expression of one or more biomarkers during pancreatic cancer treatment can indicate remission of the pancreatic cancer and, therefore, that the treatment is successful. Furthermore, an increase or decrease in the differential expression of one or more biomarkers after an individual appears to have been cured of pancreatic cancer can indicate recurrence of pancreatic cancer. In such cases, for example, treatment can be restarted in the individual at an early stage (or, if the individual is maintained on treatment, the treatment regimen can be modified to increase the dose and/or frequency), where recurrence of pancreatic cancer would otherwise not be detected until a later stage. Furthermore, the differential expression levels of one or more biomarkers in an individual can be predictive of the individual's response to a particular therapeutic agent. In monitoring pancreatic cancer recurrence or progression, changes in biomarker expression levels can indicate the need for repeat imaging (e.g., repeat EUS), for example, to determine pancreatic cancer activity or to determine the need for a change in treatment regimen.
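The monitoring logic in this paragraph turns on whether serial biomarker measurements move toward or away from the "normal" expression level. A minimal sketch of that comparison is given below. The function name, the single `normal_level` reference value, and the `tolerance` parameter are assumptions made for illustration, and the labels describe only the direction of change, not a clinical conclusion.

```python
from typing import Sequence


def deviation_trend(
    values: Sequence[float], normal_level: float, tolerance: float = 0.0
) -> str:
    """Compare the first and last measurements by their distance from normal."""
    if len(values) < 2:
        raise ValueError("need at least two serial measurements")
    first = abs(values[0] - normal_level)
    last = abs(values[-1] - normal_level)
    if last > first + tolerance:
        return "moving away from normal"  # text: may indicate progression
    if last < first - tolerance:
        return "moving toward normal"     # text: may indicate remission
    return "stable"
```

Because the comparison uses the absolute distance from the normal level, it treats overexpressed and underexpressed biomarkers in the same way.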

本文所述的任何生物标记的检测可以特别地在胰腺癌治疗后使用或者与胰腺癌治疗联合使用,如评价治疗的成功或者监测治疗后胰腺癌的缓解、复发和/或进展(包括转移)。胰腺癌治疗可以包括例如给予个体治疗剂、进行手术(如手术切除至少一部分胰腺肿瘤或者去除胰腺和周围组织)、给予放疗或本领域所用的任何其他类型的胰腺癌治疗方法以及这些治疗的任意组合。例如,任何生物标记可以在治疗后检测至少一次,或者可以在治疗后检测多次(如定期检测),或者可以在治疗之前和之后检测。个体中任何生物标记随时间的差异表达水平可以指示胰腺癌的进展、缓解或复发,其实例包括以下方面:生物标记的表达水平在治疗后与治疗前相比增加或降低;生物标记的表达水平在治疗后较晚时间点与治疗后较早时间点相比增加或降低;以及生物标记的表达水平在治疗后的一个时间点与该生物标记的正常水平相比不同。Detection of any of the biomarkers described herein can be particularly useful after or in conjunction with pancreatic cancer treatment, such as to assess the success of treatment or to monitor remission, recurrence, and/or progression (including metastasis) of pancreatic cancer after treatment. Treatment of pancreatic cancer can include, for example, administering a therapeutic agent to a subject, performing surgery (e.g., surgically removing at least a portion of a pancreatic tumor or removing the pancreas and surrounding tissue), administering radiation therapy, or any other type of pancreatic cancer treatment method used in the art, as well as any combination of these treatments. For example, any biomarker can be detected at least once after treatment, or can be detected multiple times after treatment (e.g., periodically), or can be detected before and after treatment. Differential expression levels of any biomarker in a subject over time can indicate progression, remission, or recurrence of pancreatic cancer, examples of which include: an increase or decrease in the expression level of the biomarker after treatment compared to before treatment; an increase or decrease in the expression level of the biomarker at a later time point after treatment compared to an earlier time point after treatment; and a difference in the expression level of the biomarker at a time point after treatment compared to a normal level of the biomarker.

作为具体的实例,本文所述的任何生物标记的生物标记水平可以在手术前和手术后(例如手术后2-8周)的血清或血浆样品中确定。手术后样品与手术前样品相比生物标记表达水平的增加可以指示胰腺癌的进展(如不成功的手术);而手术后样品与手术前样品相比生物标记表达水平的降低可以指示胰腺癌的消退(如成功除去胰腺肿瘤的手术)。生物标记水平的相似分析可以在其他形式的治疗之前和之后进行,如在放疗或者给予治疗剂或癌症疫苗之前和之后进行。As a specific example, biomarker levels of any of the biomarkers described herein can be determined in serum or plasma samples before and after surgery (e.g., 2-8 weeks after surgery). An increase in biomarker expression levels in the post-operative sample compared to the pre-operative sample can indicate progression of pancreatic cancer (e.g., an unsuccessful surgery); whereas a decrease in biomarker expression levels in the post-operative sample compared to the pre-operative sample can indicate regression of pancreatic cancer (e.g., a successful surgery to remove a pancreatic tumor). Similar analyses of biomarker levels can be performed before and after other forms of treatment, such as radiation therapy or before and after administration of a therapeutic agent or cancer vaccine.

除了作为独立运行的诊断测试的生物标记水平测试之外,生物标记水平还可以联合SNP或者指示疾病易感性风险增加的其他遗传病变或变异性的确定来进行。(参见,例如,Amos et al.,Nature Genetics 40,616-622(2009))。In addition to testing biomarker levels as a stand-alone diagnostic test, biomarker levels can also be combined with the determination of SNPs or other genetic lesions or variations that indicate increased risk of disease susceptibility (see, e.g., Amos et al., Nature Genetics 40, 616-622 (2009)).

除了作为独立运行的诊断测试的生物标记水平测试之外,生物标记水平还可以联合放射筛查进行。除了作为独立运行的诊断测试的生物标记水平测试之外,生物标记水平还可以联合相关症状或遗传测试进行。本文所述的任何生物标记的检测可以在已通过成像观察到胰腺包块之后用来辅助诊断胰腺癌并指导适当的个体临床护理,包括由适当的外科专家或者在不可切除的患者中通过姑息疗法来护理。除了联合相关症状或风险因素测试生物标记水平,关于生物标记的信息还可以联合其他类型的数据进行评价,特别是指示个体的胰腺癌风险的数据(例如,患者临床史、症状、胰腺癌家族史、吸烟或饮酒史、糖尿病的突然发生、黄疸、风险因素如存在遗传标记、和/或其他生物标记的状态等)。这些不同数据可以通过自动化方法评价,如计算机程序/软件,其可以在计算机或其他设备/装置中实施。In addition to testing biomarker levels as a stand-alone diagnostic test, biomarker levels can also be combined with radiological screening. In addition to testing biomarker levels as a stand-alone diagnostic test, biomarker levels can also be combined with relevant symptoms or genetic testing. The detection of any biomarker described herein can be used to assist in the diagnosis of pancreatic cancer and guide appropriate individual clinical care after a pancreatic mass has been observed by imaging, including care by appropriate surgical experts or through palliative care in unresectable patients. In addition to testing biomarker levels in combination with relevant symptoms or risk factors, information about biomarkers can also be evaluated in combination with other types of data, particularly data indicating individual pancreatic cancer risk (e.g., patient clinical history, symptoms, family history of pancreatic cancer, smoking or drinking history, sudden onset of diabetes, jaundice, risk factors such as the presence of genetic markers, and/or the status of other biomarkers, etc.). These different data can be evaluated by automated methods, such as computer programs/software, which can be implemented in computers or other devices/apparatuses.

除了在高风险个体中联合放射筛查测试生物标记水平之外(如联合在成像扫描上观察到的胰腺包块的大小或其他特征评价生物标记水平),关于生物标记的信息还可以联合其他类型的数据进行评价,特别是指示个体的胰腺癌风险的数据(例如,患者临床史、症状、癌症家族史、风险因素如个体是否是吸烟者、酗酒者和/或其他生物标记的状况等)。这些不同数据可以通过自动化方法评价,如计算机程序/软件,其可以在计算机或其他设备/装置中实施。In addition to testing biomarker levels in conjunction with radiological screening in high-risk individuals (e.g., evaluating biomarker levels in conjunction with the size or other characteristics of a pancreatic mass observed on an imaging scan), information about the biomarkers can also be evaluated in conjunction with other types of data, particularly data indicating an individual's risk of pancreatic cancer (e.g., patient clinical history, symptoms, family history of cancer, risk factors such as whether the individual is a smoker or a heavy alcohol user, and/or the status of other biomarkers, etc.). These various data can be evaluated by automated methods, such as computer programs/software, which can be implemented on a computer or other device/apparatus.

任何所述生物标记还可以用于成像测试。例如,显像剂可以与任何所述生物标记偶联,这可以用于辅助胰腺癌诊断、监测疾病进展/缓解或转移、监测疾病复发或者监测对治疗的应答等。Any of the biomarkers can also be used in imaging tests. For example, an imaging agent can be coupled to any of the biomarkers, which can be used to assist in pancreatic cancer diagnosis, monitor disease progression/remission or metastasis, monitor disease recurrence, or monitor response to treatment.

生物标记和生物标记值的检测和确定Detection and determination of biomarkers and biomarker values

本文所述的生物标记的生物标记值可以使用任何已知的分析方法来检测。在一实施方案中,生物标记值使用捕获试剂(capture reagent)检测。如本文所用,“捕获剂(capture agent)”或“捕获试剂”指能够特异性结合生物标记的分子。在许多实施方案中,捕获试剂可以在溶液中暴露于生物标记,或者可以暴露于生物标记,同时该捕获试剂固定在固体支持物上。在其他实施方案中,捕获试剂含有与固体支持物上的第二特征反应的特征。在这些实施方案中,捕获试剂可以在溶液中暴露于生物标记,然后该捕获试剂上的特征可以联合固体支持物上的第二特征来将所述生物标记固定在固体支持物上。捕获试剂基于进行的分析类型加以选择。捕获试剂包括但不限于适配体、抗体、adnectin、锚蛋白、其他抗体模拟物(mimetic)及其他蛋白支架、自身抗体、嵌合物、小分子、F(ab')2片段、单链抗体片段、Fv片段、单链Fv片段、核酸、凝集素、配体-结合受体、affybodies、纳米抗体(nanobodies)、印迹聚合物(imprinted polymer)、高亲合性多聚体(avimer)、肽模拟物(peptidomimetic)、激素受体、细胞因子受体及合成受体以及这些物质的修饰物和片段。The biomarker values of the biomarkers described herein can be detected using any known analytical method. In one embodiment, a biomarker value is detected using a capture reagent. As used herein, a "capture agent" or "capture reagent" refers to a molecule that is capable of binding specifically to a biomarker. In many embodiments, the capture reagent can be exposed to the biomarker in solution, or can be exposed to the biomarker while the capture reagent is immobilized on a solid support. In other embodiments, the capture reagent contains a feature that is reactive with a second feature on a solid support. In these embodiments, the capture reagent can be exposed to the biomarker in solution, and then the feature on the capture reagent can be used in conjunction with the second feature on the solid support to immobilize the biomarker on the solid support. The capture reagent is selected based on the type of analysis to be performed. Capture reagents include, but are not limited to, aptamers, antibodies, adnectins, ankyrins, other antibody mimetics and other protein scaffolds, autoantibodies, chimeras, small molecules, F(ab')2 fragments, single-chain antibody fragments, Fv fragments, single-chain Fv fragments, nucleic acids, lectins, ligand-binding receptors, affybodies, nanobodies, imprinted polymers, avimers, peptidomimetics, hormone receptors, cytokine receptors, and synthetic receptors, as well as modifications and fragments of these.

在一些实施方案中,生物标记值使用生物标记/捕获试剂复合物来检测。In some embodiments, biomarker values are detected using a biomarker/capture reagent complex.

在其他实施方案中,生物标记值得自生物标记/捕获试剂复合物,并且例如作为生物标记/捕获试剂相互作用之后的反应结果间接检测,但是依赖于生物标记/捕获试剂复合物的形成。In other embodiments, the biomarker value is derived from a biomarker/capture reagent complex and is detected indirectly, for example, as a result of a reaction subsequent to the biomarker/capture reagent interaction, but dependent upon the formation of the biomarker/capture reagent complex.

在一些实施方案中,生物标记值从生物学样品中的生物学标记直接检测。In some embodiments, biomarker values are detected directly from the biomarker in the biological sample.

在一实施方案中,生物标记使用多重形式检测,这允许在生物学样品中同时检测两个或更多个生物标记。在多重形式的一实施方案中,捕获试剂直接或间接、共价或非共价地固定在固体支持物上分散的位置。在另一实施方案中,多重形式使用分离的固体支持物,其中每个固体支持物具有与该固体支持物相关的独特捕获试剂,例如量子点。在另一实施方案中,单独的装置用于检测生物学样品中待检测的多个生物标记的每一个。可以配置单独的装置以允许同时处理生物学样品中的每个生物标记。例如,可以使用微量滴定板,由此该板中的每个孔用于独特地分析生物学样品中待检测的多个生物标记之一。In one embodiment, biomarkers are detected using a multiplex format, which allows for the simultaneous detection of two or more biomarkers in a biological sample. In one embodiment of the multiplex format, capture reagents are directly or indirectly, covalently or non-covalently immobilized at discrete locations on a solid support. In another embodiment, the multiplex format utilizes separate solid supports, each having a unique capture reagent, such as quantum dots, associated therewith. In another embodiment, a separate device is used to detect each of the multiple biomarkers to be detected in a biological sample. The separate devices can be configured to allow for simultaneous processing of each biomarker in a biological sample. For example, a microtiter plate can be used, whereby each well in the plate is used to uniquely analyze one of the multiple biomarkers to be detected in a biological sample.

在一个或多个前述实施方案中,可以使用荧光标签(tag)来标记生物标记/捕获复合物的组分以允许检测生物标记值。在许多实施方案中,使用已知技术可以将荧光标记(fluorescent label)与对本文所述的任何生物标记特异性的捕获试剂偶联,然后该荧光标记可以用于检测相应的生物标记值。合适的荧光标记包括稀土元素螯合物、荧光素及其衍生物、罗丹明及其衍生物、丹磺酰、别藻蓝蛋白、PBXL-3、Qdot 605、丽丝胺(Lissamine)、藻红蛋白、德克萨斯红及其他这样的化合物。In one or more of the foregoing embodiments, a fluorescent tag can be used to label components of the biomarker/capture complex to allow detection of the biomarker value. In many embodiments, a fluorescent label can be coupled to a capture reagent specific for any of the biomarkers described herein using known techniques, and the fluorescent label can then be used to detect the corresponding biomarker value. Suitable fluorescent labels include rare earth element chelates, fluorescein and its derivatives, rhodamine and its derivatives, dansyl, allophycocyanin, PBXL-3, Qdot 605, Lissamine, phycoerythrin, Texas Red, and other such compounds.

在一实施方案中,荧光标记是荧光染料分子。在一些实施方案中,荧光染料分子包括至少一个取代的吲哚环(indolium ring)体系,其中吲哚环的3-碳上的取代基含有化学反应性基团或偶联的物质。在一些实施方案中,染料分子包括AlexFluor分子,例如AlexaFluor 488、AlexaFluor 532、AlexaFluor 647、AlexaFluor 680或AlexaFluor 700。在其他实施方案中,染料分子包括第一类型和第二类型的染料分子,如两种不同的AlexaFluor分子。在其他实施方案中,染料分子包括第一类型和第二类型的染料分子,并且两种染料分子具有不同的发射光谱。In one embodiment, the fluorescent label is a fluorescent dye molecule. In some embodiments, the fluorescent dye molecule includes at least one substituted indolium ring system in which the substituent on the 3-carbon of the indolium ring contains a chemically reactive group or a conjugated substance. In some embodiments, the dye molecule includes an AlexaFluor molecule, such as AlexaFluor 488, AlexaFluor 532, AlexaFluor 647, AlexaFluor 680, or AlexaFluor 700. In other embodiments, the dye molecule includes a first type and a second type of dye molecule, such as two different AlexaFluor molecules. In other embodiments, the dye molecule includes a first type and a second type of dye molecule, and the two dye molecules have different emission spectra.

荧光可以用与大范围的测定形式相容的许多仪器测量。例如,已经设计了分光荧光计来分析微量滴定板、显微镜载玻片、印刷阵列(printed array)、小杯等。参见Principles of Fluorescence Spectroscopy,by J.R.Lakowicz,Springer Science+Business Media,Inc.,2004。参见Bioluminescence&Chemiluminescence:Progress&Current Applications;Philip E.Stanley and Larry J.Kricka editors,WorldScientific Publishing Company,January 2002。Fluorescence can be measured using a variety of instruments compatible with a wide range of assay formats. For example, spectrofluorometers have been designed to analyze microtiter plates, microscope slides, printed arrays, cuvettes, and the like. See Principles of Fluorescence Spectroscopy, by J.R. Lakowicz, Springer Science+Business Media, Inc., 2004. See Bioluminescence & Chemiluminescence: Progress & Current Applications; Philip E. Stanley and Larry J. Kricka, editors, World Scientific Publishing Company, January 2002.

在一个或多个前述实施方案中,化学发光标签可以任选地用于标记生物标记/捕获复合物的组分以允许检测生物标记值。合适的化学发光材料包括任何草酰氯、Rodamin6G、Ru(bipy)3 2+、TMAE(四三(二甲基氨基)乙烯(tetrakis(dimethylamino)ethylene))、连苯三酚(1,2,3-三羟基苯(1,2,3-trihydroxibenzene))、光泽精、过氧草酸酯(peroxyoxalate)、芳基草酸酯、吖啶酯(acridinium ester)、二氧杂环丁烷(dioxetane)等。In one or more of the foregoing embodiments, a chemiluminescent tag can optionally be used to label a component of the biomarker/capture complex to allow detection of the biomarker value. Suitable chemiluminescent materials include any of oxalyl chloride, Rhodamine 6G, Ru(bipy)₃²⁺, TMAE (tetrakis(dimethylamino)ethylene), pyrogallol (1,2,3-trihydroxybenzene), lucigenin, peroxyoxalates, aryl oxalates, acridinium esters, dioxetanes, and the like.

在其他实施方案中,检测方法包括酶/底物组合,其产生对应于生物标记值的可检测信号。通常,酶催化生色底物的化学改变,这种改变可以使用多种技术测量,包括分光光度法、荧光及化学发光。合适的酶包括例如萤光素酶、萤光素、苹果酸脱氢酶、脲酶、辣根过氧化物酶(HRPO)、碱性磷酸酶、β-半乳糖苷酶、葡糖淀粉酶、溶菌酶、葡萄糖氧化酶、半乳糖氧化酶及葡萄糖-6-磷酸脱氢酶、尿酸氧化酶、黄嘌呤氧化酶、乳过氧化物酶、微过氧化物酶等。In other embodiments, the detection method comprises an enzyme/substrate combination that produces a detectable signal corresponding to the biomarker value. Typically, the enzyme catalyzes a chemical change in a chromogenic substrate that can be measured using a variety of techniques, including spectrophotometry, fluorescence, and chemiluminescence. Suitable enzymes include, for example, luciferase, luciferin, malate dehydrogenase, urease, horseradish peroxidase (HRPO), alkaline phosphatase, β-galactosidase, glucoamylase, lysozyme, glucose oxidase, galactose oxidase, and glucose-6-phosphate dehydrogenase, urate oxidase, xanthine oxidase, lactoperoxidase, microperoxidase, and the like.

在其他实施方案中,检测方法可以是产生可测量信号的荧光、化学发光、放射性核素或酶/底物组合的组合。多种方式的信号在生物标记测定形式中可以具有独特且有利的特征。In other embodiments, the detection method can be a combination of fluorescence, chemiluminescence, radionuclide, or enzyme/substrate combinations that produce a measurable signal. Multimodal signaling can have unique and advantageous characteristics in biomarker assay formats.

更特别地,本文所述的生物标记的生物标记值可以使用已知的分析方法来检测,包括单重适配体测定、多重适配体测定、单重或多重免疫测定、mRNA表达谱、miRNA表达谱、质谱分析、组织学/细胞学方法等,这在下文中详细地描述。More specifically, the biomarker values of the biomarkers described herein can be detected using known analytical methods, including single-plex aptamer assays, multiplex aptamer assays, single-plex or multiplex immunoassays, mRNA expression profiling, miRNA expression profiling, mass spectrometry analysis, histological/cytological methods, etc., which are described in detail below.

使用基于适配体的测定确定生物标记值Determining biomarker values using aptamer-based assays

检测和定量生物学样品及其他样品中生理学上有意义的分子的测定在科学研究和卫生保健领域是重要的工具。一类这样的测定包括使用包含固定在固体支持物上的一个或多个适配体的微阵列。所述适配体各自能够以高特异性方式和非常高的亲和力结合靶分子。参见例如题为“核酸配体”的美国专利第5,475,096号;还参见例如美国专利第6,242,246号、美国专利第6,458,543号和美国专利第6,503,715号,这些专利的题目均为“核酸配体诊断生物芯片”。一旦使微阵列与样品接触,则适配体结合所述样品中存在的它们各自的靶分子,从而允许确定对应于生物标记的生物标记值。Assays for detecting and quantifying physiologically significant molecules in biological and other samples are important tools in scientific research and healthcare. One class of such assays involves the use of microarrays comprising one or more aptamers immobilized on a solid support. Each of the aptamers is capable of binding to a target molecule with high specificity and very high affinity. See, for example, U.S. Patent No. 5,475,096, entitled "Nucleic Acid Ligands"; see also, for example, U.S. Patent No. 6,242,246, U.S. Patent No. 6,458,543, and U.S. Patent No. 6,503,715, all entitled "Nucleic Acid Ligand Diagnostic Biochips." Once the microarray is contacted with the sample, the aptamers bind to their respective target molecules present in the sample, thereby allowing the determination of a biomarker value corresponding to the biomarker.

如本文所用,“适配体”指对靶分子具有特异性结合亲和力的核酸。应当了解到亲和相互作用的问题关键是程度;然而在本文中,适配体对其靶标的“特异性结合亲和力”指适配体通常以与其结合测试样品中其他组分的亲和力相比更高程度的亲和力结合其靶标。“适配体”是一种类型或物种的核酸分子的一系列拷贝,其具有特定的核苷酸序列。适配体可以包含任何合适数目的核苷酸,包括任何数目的化学修饰的核苷酸。“适配体”指多于一个的这种系列的分子。不同的适配体可以具有相同或不同数目的核苷酸。适配体可以是DNA或RNA或化学修饰的核酸,并且可以是单链、双链的或者含有双链区,以及可以包含高级结构。适配体还可以是光适配体(photoaptamer),其中该适配体中包含光反应性或化学反应性官能团以允许其与其对应靶标共价连接。本文公开的任何适配体方法可以包括使用特异性结合相同靶分子的两种或更多种适配体。如下文进一步描述,适配体可以包含标签。如果适配体包含标签,则该适配体的所有拷贝不需要具有相同的标签。此外,如果不同的适配体各自包含标签,则这些不同的适配体可以具有相同的标签或者不同的标签。As used herein, an "aptamer" refers to a nucleic acid that has a specific binding affinity for a target molecule. It is recognized that affinity interactions are a matter of degree; in this context, however, the "specific binding affinity" of an aptamer for its target means that the aptamer generally binds its target with a higher degree of affinity than it binds other components in a test sample. An "aptamer" is a set of copies of one type or species of nucleic acid molecule that has a particular nucleotide sequence. An aptamer can include any suitable number of nucleotides, including any number of chemically modified nucleotides. "Aptamers" refers to more than one such set of molecules. Different aptamers can have the same or different numbers of nucleotides. Aptamers can be DNA or RNA or chemically modified nucleic acids, can be single-stranded, double-stranded, or contain double-stranded regions, and can include higher-order structures. An aptamer can also be a photoaptamer, in which a photoreactive or chemically reactive functional group is included in the aptamer to allow it to be covalently linked to its corresponding target. Any of the aptamer methods disclosed herein can include the use of two or more aptamers that specifically bind the same target molecule. As further described below, an aptamer can include a tag. If an aptamer includes a tag, all copies of the aptamer need not have the same tag. Moreover, if different aptamers each include a tag, these different aptamers can have either the same tag or different tags.

适配体可以使用任何已知方法鉴定,包括SELEX方法。一旦鉴定,则可以根据任何已知方法制备或合成适配体,这些已知方法包括化学合成方法和酶促合成方法。Aptamers can be identified using any known method, including the SELEX method. Once identified, aptamers can be prepared or synthesized according to any known method, including chemical synthesis methods and enzymatic synthesis methods.

如本文所用,“SOMAmer”或低解离速率修饰的适配体指具有改善的解离速率特征的适配体。SOMAmer可以使用题为“产生具有改善的解离速率的适配体的方法”的美国公开第2009/0004667号所述的改进的SELEX方法来产生。As used herein, "SOMAmer" or low off-rate modified aptamer refers to an aptamer with improved off-rate characteristics. SOMAmers can be produced using the improved SELEX method described in U.S. Publication No. 2009/0004667, entitled "Methods for Producing Aptamers with Improved Off-Rates."

术语“SELEX”和“SELEX方法”在本文中可互换使用,通常指(1)与(2)的组合,其中(1)是选择以期望的方式与靶分子相互作用的适配体,例如以高亲和力结合蛋白,(2)是扩增那些选择的核酸。SELEX方法可以用于鉴定对特定靶标或生物标记具有高亲和力的适配体。The terms "SELEX" and "SELEX method" are used interchangeably herein and generally refer to a combination of (1) selecting aptamers that interact with a target molecule in a desired manner, such as binding to a protein with high affinity, and (2) amplifying those selected nucleic acids. The SELEX method can be used to identify aptamers with high affinity for a specific target or biomarker.

SELEX通常包括制备核酸的候选混合物;使所述候选混合物与期望的靶分子结合以形成亲和复合物;分离所述亲和复合物与未结合的候选核酸;使核酸与所述亲和复合物分开并分离所述核酸;纯化所述核酸;以及鉴定特异性适配体序列。所述方法可以包括多次循环以进一步精制所选适配体的亲和力。所述方法可以包括在该方法的一个或多个点的扩增步骤。参见例如题为“核酸配体”的美国专利第5,475,096号。SELEX方法可以用于产生与适配体的靶标共价结合的适配体,以及与适配体的靶标非共价结合的适配体。参见例如题为“通过指数富集的核酸配体的系统进化:Chemi-SELEX”的美国专利第5,705,337号。SELEX generally comprises preparing a candidate mixture of nucleic acids; binding the candidate mixture to a desired target molecule to form an affinity complex; separating the affinity complex from unbound candidate nucleic acids; separating nucleic acids from the affinity complex and isolating the nucleic acids; purifying the nucleic acids; and identifying specific aptamer sequences. The method may include multiple cycles to further refine the affinity of the selected aptamer. The method may include an amplification step at one or more points in the method. See, for example, U.S. Patent No. 5,475,096, entitled “Nucleic Acid Ligands.” The SELEX method can be used to generate aptamers that covalently bind to the target of the aptamer, as well as aptamers that non-covalently bind to the target of the aptamer. See, for example, U.S. Patent No. 5,705,337, entitled “Systematic Evolution of Nucleic Acid Ligands by Exponential Enrichment: Chemi-SELEX.”
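The enrichment that repeated SELEX cycles produce can be illustrated with a toy calculation. The following is a minimal sketch, not the method of the cited patents: it assumes that each species in the candidate mixture is retained in the partitioning step with a fixed binding probability and that amplification simply renormalizes the retained material. The pool composition and binding probabilities are hypothetical values chosen for illustration only.

```python
def selex_round(fractions, bind_prob):
    """One idealized cycle: partition by binding, then amplify back to a full pool."""
    # Retained amount of each species = pool fraction x probability of staying bound.
    bound = [f * p for f, p in zip(fractions, bind_prob)]
    total = sum(bound)
    # Amplification is modeled as renormalization of the retained material.
    return [b / total for b in bound]


def run_selex(fractions, bind_prob, rounds):
    """Apply the idealized cycle a given number of times."""
    for _ in range(rounds):
        fractions = selex_round(fractions, bind_prob)
    return fractions


# Hypothetical two-species pool: a rare high-affinity binder and abundant weak binders.
start = [0.001, 0.999]
bind_prob = [0.5, 0.01]  # assumed per-round retention probabilities

after_one = selex_round(start, bind_prob)
after_five = run_selex(start, bind_prob, rounds=5)
print(f"high-affinity fraction after 1 round:  {after_one[0]:.4f}")
print(f"high-affinity fraction after 5 rounds: {after_five[0]:.6f}")
```

Under these assumptions the ratio of the two species grows by the ratio of their retention probabilities in every cycle, which is why a small number of cycles can take a rare binder to near dominance of the pool.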

SELEX方法可以用于鉴定含有修饰的核苷酸的高亲和力适配体,所述修饰的核苷酸赋予该适配体改善的特征,例如改善的体内稳定性或改善的递送特征。此类修饰的实例包括核糖和/或磷酸和/或碱基位置的化学取代。通过SELEX方法鉴定的含有修饰的核苷酸的适配体描述于题为“含有修饰的核苷酸的高亲和力核酸配体”的美国专利第5,660,985号,其描述了含有在嘧啶的5'-和2'-位置处经化学修饰的核苷酸衍生物的寡核苷酸。见上文,美国专利第5,580,737号描述了高特异性适配体,其含有用2'-氨基(2'-NH2)、2'-氟(2'-F)和/或2'-O-甲基(2'-OMe)修饰的一个或多个核苷酸。还参见题为“SELEX和PHOTOSELEX”的美国专利申请公开20090098549,其描述了具有扩展的物理和化学性质的核酸文库及其在SELEX和photoSELEX中的用途。The SELEX method can be used to identify high-affinity aptamers containing modified nucleotides that confer improved characteristics on the aptamer, such as improved in vivo stability or improved delivery characteristics. Examples of such modifications include chemical substitutions at the ribose and/or phosphate and/or base positions. Aptamers containing modified nucleotides identified by the SELEX method are described in U.S. Patent No. 5,660,985, entitled "High-Affinity Nucleic Acid Ligands Containing Modified Nucleotides," which describes oligonucleotides containing nucleotide derivatives chemically modified at the 5'- and 2'-positions of pyrimidines. U.S. Patent No. 5,580,737 (see above) describes highly specific aptamers containing one or more nucleotides modified with 2'-amino (2'-NH2), 2'-fluoro (2'-F), and/or 2'-O-methyl (2'-OMe). See also U.S. Patent Application Publication 20090098549, entitled "SELEX and PHOTOSELEX," which describes nucleic acid libraries with expanded physical and chemical properties and their use in SELEX and photoSELEX.

SELEX还可以用于鉴定具有期望的解离速率特征的适配体。参见题为“产生具有改善的解离速率的适配体的方法”的美国专利申请公开20090004667,其描述了产生可以结合靶分子的适配体的改进SELEX方法。描述了产生与各自的靶分子具有较慢解离速率的适配体和光适配体的方法。所述方法包括使候选混合物与靶分子接触;允许形成核酸-靶标复合物;以及进行缓慢解离速率富集过程,其中具有快解离速率的核酸-靶标复合物解离并不再形成,而具有慢解离速率的复合物会保持完整。此外,所述方法包括在产生候选核酸混合物中使用修饰的核苷酸,以产生具有改善的解离速率性能的适配体。SELEX can also be used to identify aptamers with desired off-rate characteristics. See U.S. Patent Application Publication 20090004667, entitled “Methods for Producing Aptamers with Improved Off-Rates,” which describes an improved SELEX method for producing aptamers that can bind to target molecules. Methods for producing aptamers and photoaptamers that have slower off-rates with their respective target molecules are described. The method includes contacting a candidate mixture with a target molecule; allowing a nucleic acid-target complex to form; and performing a slow off-rate enrichment process, wherein nucleic acid-target complexes with fast off-rates dissociate and no longer form, while complexes with slow off-rates remain intact. In addition, the method includes using modified nucleotides in generating a candidate nucleic acid mixture to produce aptamers with improved off-rate properties.
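Why a slow off-rate enrichment step discriminates between complexes can be seen from simple first-order dissociation kinetics. The following is a minimal sketch under stated assumptions: dissociation is first order, rebinding is fully prevented (for example by dilution or a competitor), and the rate constants and incubation time are hypothetical values, not figures from the cited publication.

```python
import math


def fraction_remaining(k_off, t):
    """Fraction of complexes still intact after time t (first-order decay, no rebinding)."""
    return math.exp(-k_off * t)


def half_life(k_off):
    """Time for half of the complexes to dissociate."""
    return math.log(2) / k_off


# Hypothetical dissociation rate constants, in 1/min.
k_fast = 1.0
k_slow = 0.01
incubation_min = 30.0  # assumed length of the enrichment step

fast_left = fraction_remaining(k_fast, incubation_min)
slow_left = fraction_remaining(k_slow, incubation_min)
print(f"fast off-rate complexes remaining: {fast_left:.2e}")
print(f"slow off-rate complexes remaining: {slow_left:.3f}")
print(f"half-life of slow complex (min):   {half_life(k_slow):.1f}")
```

In this idealized picture essentially all fast-dissociating complexes are gone by the end of the step while most slow-dissociating complexes survive, so the nucleic acids recovered from intact complexes are enriched for slow off-rates.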

这种测定的变化使用包含光反应性官能团的适配体,这允许适配体与其靶分子共价结合或“光交联”。参见例如题为“核酸配体诊断生物芯片”的美国专利第6,544,776号。这些光反应性适配体也称作光适配体。参见例如美国专利第5,763,177号、美国专利第6,001,577号和美国专利第6,291,184号,所述专利的题目均是“通过指数富集的核酸配体的系统进化:核酸配体的光选择和溶液SELEX”;还参见例如题为“核酸配体的光选择”的美国专利第6,458,539号。在使微阵列与样品接触并使光适配体具有结合其靶分子的机会之后,将该光适配体光激活并洗涤固体支持物以除去任何非特异性结合的分子。可以使用严格洗涤条件,因为结合光适配体的靶分子由于该光适配体上光激活的官能团所产生的共价键而通常未被除去。在这种方式中,测定允许检测对应于测试样品中的生物标记的生物标记值。A variation of this assay uses aptamers that contain photoreactive functional groups, which allow the aptamers to covalently bind or "photocrosslink" to their target molecules. See, for example, U.S. Patent No. 6,544,776, entitled "Nucleic Acid Ligand Diagnostic Biochip." These photoreactive aptamers are also referred to as photoaptamers. See, for example, U.S. Patent No. 5,763,177, U.S. Patent No. 6,001,577, and U.S. Patent No. 6,291,184, all entitled "Systematic Evolution of Nucleic Acid Ligands by Exponential Enrichment: Photoselection of Nucleic Acid Ligands and Solution SELEX"; see also, for example, U.S. Patent No. 6,458,539, entitled "Photoselection of Nucleic Acid Ligands." After the microarray is contacted with the sample and the photoaptamers have had an opportunity to bind to their target molecules, the photoaptamers are photoactivated and the solid support is washed to remove any non-specifically bound molecules. Stringent wash conditions can be used because target molecules bound to the photoaptamers are generally not removed, owing to the covalent bonds created by the photoactivated functional groups on the photoaptamers. In this manner, the assay allows detection of a biomarker value corresponding to a biomarker in the test sample.

在这两种测定形式中,适配体在与样品接触之前固定在固体支持物上。然而,在某些情况下,在与样品接触之前固定适配体也许无法提供最佳的测定。例如,预固定适配体可能导致适配体与靶分子在固体支持物表面上的无效混合,这可能导致漫长的反应时间及因此延长的温育时间以允许适配体与其靶分子有效结合。此外,当光适配体用于测定并且取决于用作固体支持物的材料时,该固体支持物可能趋于分散或吸收用于实现光适配体与其靶分子之间的共价键形成的光。此外,根据所用的方法,结合适配体的靶分子的检测可能不准确,因为固体支持物的表面也可能暴露于且受所用的任何标记剂的影响。最后,适配体固定在固体支持物上通常包括在适配体暴露于样品之前的适配体制备步骤(即固定),这个制备步骤可能影响适配体的活性或功能性。In both assay formats, the aptamer is immobilized on a solid support prior to contact with the sample. However, in some cases, immobilizing the aptamer prior to contact with the sample may not provide the optimal assay. For example, pre-immobilizing the aptamer may result in ineffective mixing of the aptamer and the target molecule on the surface of the solid support, which may result in lengthy reaction times and, therefore, extended incubation times to allow the aptamer to effectively bind to its target molecule. In addition, when photoaptamers are used for the assay and depending on the material used as the solid support, the solid support may tend to disperse or absorb the light used to achieve the covalent bond formation between the photoaptamer and its target molecule. Furthermore, depending on the method used, detection of the target molecule bound to the aptamer may be inaccurate because the surface of the solid support may also be exposed to and affected by any labeling agent used. Finally, immobilization of the aptamer on the solid support typically includes an aptamer preparation step (i.e., immobilization) before the aptamer is exposed to the sample, and this preparation step may affect the activity or functionality of the aptamer.

还描述了适配体测定,其允许适配体在溶液中捕获其靶标,然后在检测之前使用设计为除去适配体-靶标混合物中特定组分的分离步骤(参见题为“测试样品的多重分析”的美国专利申请公开20090042206)。所述适配体测定方法允许检测和定量测试样品中的非核酸靶标(如蛋白靶标),这通过检测和定量核酸(即适配体)进行。所述方法产生核酸替代物(surrogate)(即适配体)以检测和定量非核酸靶标,由此允许包括扩增在内的许多核酸技术用于包括蛋白靶标在内的更大范围的期望靶标。Aptamer assays have also been described that allow aptamers to capture their targets in solution, followed by a separation step designed to remove specific components of the aptamer-target mixture prior to detection (see U.S. Patent Application Publication 20090042206, entitled "Multiplexed Analysis of Test Samples"). The aptamer assay method allows for the detection and quantification of non-nucleic acid targets (such as protein targets) in a test sample by detecting and quantifying nucleic acids (i.e., aptamers). The method generates nucleic acid surrogate (i.e., aptamers) to detect and quantify non-nucleic acid targets, thereby allowing many nucleic acid techniques, including amplification, to be applied to a wider range of desired targets, including protein targets.

可以构建适配体以促进从适配体生物标记复合物(或光适配体生物标记共价复合物)分离测定组分,以及允许分离适配体以进行检测和/或定量。在一实施方案中,这些构建体可以包含适配体序列中可裂解或可释放的元件。在其他实施方案中,可以在适配体中引入额外的官能性,例如标记的或可检测的组分、间隔组分或者特异性结合标签或固定元件。例如,适配体可以包含通过可裂解部分与适配体连接的标签、标记、分隔标记与可裂解部分的间隔组分。在一实施方案中,可裂解元件是光可裂解接头(linker)。光可裂解接头可以连接至生物素部分和间隔区段,可以包含NHS基团以用于胺的衍生化,以及可以用于在适配体中引入生物素基团,从而允许适配体在测定方法中较晚地释放。Aptamers can be constructed to facilitate the separation of assay components from an aptamer-biomarker complex (or photoaptamer-biomarker covalent complex) and to permit isolation of the aptamer for detection and/or quantification. In one embodiment, these constructs can include a cleavable or releasable element within the aptamer sequence. In other embodiments, additional functionality can be introduced into the aptamer, such as a labeled or detectable component, a spacer component, or a specific binding tag or immobilization element. For example, the aptamer can include a tag connected to the aptamer via a cleavable moiety, a label, and a spacer component separating the label from the cleavable moiety. In one embodiment, the cleavable element is a photocleavable linker. The photocleavable linker can be attached to a biotin moiety and a spacer section, can include an NHS group for derivatization of amines, and can be used to introduce a biotin group into an aptamer, thereby allowing the aptamer to be released later in the assay method.

用溶液中所有测定组分进行的均质测定在检测信号之前不需要分离样品与试剂。这些方法是快速且易于使用的。这些方法基于分子捕获或与其特异性靶标反应的结合试剂产生信号。对于胰腺癌,分子捕获试剂是适配体或抗体等,特异性靶标是表1第2列的胰腺癌生物标记。Homogeneous assays, using all assay components in solution, do not require separation of sample and reagents before signal detection. These methods are rapid and easy to use. These methods generate signals based on molecular capture or binding reagents that react with their specific targets. For pancreatic cancer, the molecular capture reagents are aptamers or antibodies, and the specific targets are pancreatic cancer biomarkers listed in Table 1, column 2.

在一实施方案中,一种信号产生方法利用由于荧光团-标记的捕获试剂与其特异性生物标记靶标的相互作用而导致的各向异性信号改变。当标记的捕获剂与其靶标反应时,增加的分子量导致附着于该复合物的荧光团的旋转运动变得更慢,从而改变各向异性值。通过监测各向异性改变,结合事件可以用于定量测量溶液中的生物标记。其他方法包括荧光偏振测定、分子信标方法、时间分辨荧光猝灭法、化学发光、荧光共振能量转移等。In one embodiment, a signal generation method utilizes the change in anisotropy signal caused by the interaction of a fluorophore-labeled capture agent with its specific biomarker target. When the labeled capture agent reacts with its target, the increased molecular weight causes the rotational motion of the fluorophore attached to the complex to slow down, thereby changing the anisotropy value. By monitoring the change in anisotropy, the binding event can be used to quantitatively measure the biomarker in solution. Other methods include fluorescence polarization assays, molecular beacon methods, time-resolved fluorescence quenching, chemiluminescence, fluorescence resonance energy transfer, and others.
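The anisotropy readout described above can be made concrete with the standard steady-state definition, r = (I_parallel - I_perpendicular) / (I_parallel + 2 * I_perpendicular). The following is a minimal sketch under stated assumptions: the instrument correction factor (G-factor) is taken as 1, the fluorophore's brightness is assumed not to change on binding, and all intensities and reference anisotropies are hypothetical numbers used only for illustration.

```python
def anisotropy(i_parallel, i_perpendicular):
    """Steady-state fluorescence anisotropy from polarized emission intensities (G-factor = 1)."""
    return (i_parallel - i_perpendicular) / (i_parallel + 2.0 * i_perpendicular)


def fraction_bound(r_observed, r_free, r_bound):
    """Fraction of labeled capture reagent in complex, assuming equal brightness when free and bound."""
    return (r_observed - r_free) / (r_bound - r_free)


# Hypothetical intensities for a sample containing labeled capture reagent and biomarker.
r_obs = anisotropy(i_parallel=1300.0, i_perpendicular=1000.0)

# Hypothetical reference anisotropies measured for fully free and fully bound reagent.
bound_fraction = fraction_bound(r_obs, r_free=0.05, r_bound=0.25)
print(f"observed anisotropy: {r_obs:.4f}")
print(f"estimated fraction bound: {bound_fraction:.3f}")
```

Because the observed anisotropy is a weighted average of the free and bound values under these assumptions, the fraction bound can be read off directly and then related to biomarker concentration through a calibration performed under the same conditions.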

可以用于检测对应于生物学样品中生物标记的生物标记值的基于溶液的示例性适配体测定包括以下步骤:(a)通过使所述生物学样品与适配体接触来制备混合物,所述适配体包含第一标签并具有对所述生物标记的特异性亲和力,其中当所述样品中存在所述生物标记时形成适配体亲和复合物;(b)使所述混合物暴露于包含第一捕获元件的第一固体支持物,并且允许所述第一标签与所述第一捕获元件结合;(c)除去未与所述第一固体支持物结合的混合物的任何组分;(d)使第二标签附着于所述适配体亲和复合物的生物标记组分;(e)从所述第一固体支持物释放所述适配体亲和复合物;(f)使释放的适配体亲和复合物暴露于包含第二捕获元件的第二固体支持物,并且允许所述第二标签与所述第二捕获元件结合;(g)通过分离未复合的适配体与所述适配体亲和复合物来从所述混合物除去任何未复合的适配体;(h)从固体支持物洗脱适配体;以及(i)通过检测所述适配体亲和复合物的适配体组分来检测所述生物标记。An exemplary solution-based aptamer assay that can be used to detect a biomarker value corresponding to a biomarker in a biological sample comprises the following steps: (a) preparing a mixture by contacting the biological sample with an aptamer that comprises a first tag and has a specific affinity for the biomarker, wherein an aptamer affinity complex is formed when the biomarker is present in the sample; (b) exposing the mixture to a first solid support comprising a first capture element and allowing the first tag to bind to the first capture element; (c) removing any components of the mixture that are not bound to the first solid support; (d) attaching a second tag to the biomarker component of the aptamer affinity complex; (e) releasing the aptamer affinity complex from the first solid support; (f) exposing the released aptamer affinity complex to a second solid support comprising a second capture element and allowing the second tag to bind to the second capture element; (g) removing any uncomplexed aptamer from the mixture by separating the uncomplexed aptamer from the aptamer affinity complex; (h) eluting the aptamer from the solid support; and (i) detecting the biomarker by detecting the aptamer component of the aptamer affinity complex.

本领域已知的任何方法可以用于通过检测适配体亲和复合物的适配体组分来检测生物标记值。许多不同的检测方法可以用于检测亲和复合物的适配体组分,例如,杂交测定、质谱分析或QPCR。在一些实施方案中,核酸测序方法可以用于检测适配体亲和复合物的适配体组分,从而检测生物标记值。简单地说,可以使测试样品进行任何种类的核酸测序方法以鉴定和定量测试样品中存在的一种或多种适配体的序列或多个序列。在一些实施方案,序列包括整个适配体分子或者可以用来唯一地鉴定该分子的该分子的任何部分。在其他实施方案中,鉴定序列是添加至适配体的特定序列;这类序列常称为“标签”、“条形码”或“邮政编码”。在一些实施方案中,测序方法包括酶促步骤以扩增适配体序列,或者将任何种类的核酸(包括在任何位置包含化学修饰的RNA和DNA)转化为适合测序的任何其他种类的核酸。Any method known in the art can be used to detect biomarker values by detecting the aptamer component of the aptamer affinity complex. Many different detection methods can be used to detect the aptamer component of the affinity complex, for example, hybridization assays, mass spectrometry, or qPCR. In some embodiments, nucleic acid sequencing methods can be used to detect the aptamer component of the aptamer affinity complex, thereby detecting the biomarker value. Simply put, a test sample can be subjected to any type of nucleic acid sequencing method to identify and quantify the sequence or sequences of one or more aptamers present in the test sample. In some embodiments, the sequence includes the entire aptamer molecule or any portion of the molecule that can be used to uniquely identify the molecule. In other embodiments, the identification sequence is a specific sequence added to the aptamer; such sequences are often referred to as "tags," "barcodes," or "zip codes." In some embodiments, the sequencing method includes an enzymatic step to amplify the aptamer sequence, or to convert any type of nucleic acid (including RNA and DNA containing chemical modifications at any position) into any other type of nucleic acid suitable for sequencing.

在一些实施方案中,测序方法包括一个或多个克隆步骤。在其他实施方案中,测序方法包括没有克隆的直接测序方法。In some embodiments, the sequencing method comprises one or more cloning steps. In other embodiments, the sequencing method comprises a direct sequencing method without cloning.

在一些实施方案中,测序方法包括具有靶向测试样品中的一种或多种适配体的特异性引物的直接方法。在其他实施方案中,测序方法包括靶向测试样品中的所有适配体的鸟枪法。In some embodiments, the sequencing method comprises a direct approach with specific primers targeting one or more aptamers in the test sample. In other embodiments, the sequencing method comprises a shotgun approach targeting all aptamers in the test sample.

在一些实施方案中,测序方法包括酶促步骤以扩增测序靶向的分子。在其他实施方案中,测序方法直接测序单一分子。可以用来检测对应于生物学样品中的生物标记的生物标记值的示例性基于核酸测序的方法包括以下步骤:(a)通过酶促步骤将包含化学修饰的核苷酸转化为未修饰的核酸;(b)用大规模平行测序平台鸟枪测序所得的未修饰的核酸,例如454测序系统(454 Life Sciences/Roche)、Illumina测序系统(Illumina)、ABI SOLiD测序系统(Applied Biosystems)、HeliScope单分子测序仪(Helicos Biosciences)、或Pacific Biosciences实时单分子测序系统(Pacific BioSciences)或Polonator G测序系统(Dover Systems);以及(c)通过特异性测序和测序计数来鉴定和定量混合物中存在的适配体。In some embodiments, the sequencing method includes an enzymatic step to amplify the molecules targeted for sequencing. In other embodiments, the sequencing method directly sequences single molecules. An exemplary nucleic acid sequencing-based method that can be used to detect a biomarker value corresponding to a biomarker in a biological sample includes the following steps: (a) converting chemically modified nucleotides into unmodified nucleic acids by an enzymatic step; (b) shotgun sequencing the resulting unmodified nucleic acids with a massively parallel sequencing platform, such as the 454 Sequencing System (454 Life Sciences/Roche), the Illumina Sequencing System (Illumina), the ABI SOLiD Sequencing System (Applied Biosystems), the HeliScope Single Molecule Sequencer (Helicos Biosciences), the Pacific Biosciences Real-Time Single Molecule Sequencing System (Pacific BioSciences), or the Polonator G Sequencing System (Dover Systems); and (c) identifying and quantifying the aptamers present in the mixture by specific sequence and sequence count.
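Step (c), identifying aptamers by sequence and quantifying them by read count, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure of any particular sequencing platform: reads are plain strings, each aptamer is recognized by exact occurrence of a short identifying sequence, and sequencing errors are ignored. The identifying sequences, aptamer names, and reads are hypothetical.

```python
from collections import Counter


def count_aptamers(reads, identifying_seqs):
    """Count reads per aptamer by exact match to an identifying sequence.

    reads: iterable of read strings.
    identifying_seqs: dict mapping identifying sequence -> aptamer name.
    """
    counts = Counter()
    for read in reads:
        for seq, name in identifying_seqs.items():
            if seq in read:
                counts[name] += 1
                break  # assign each read to at most one aptamer
    return counts


# Hypothetical identifying sequences ("barcodes") and reads.
identifying_seqs = {"ACGTAC": "aptamer_A", "TTGGCC": "aptamer_B"}
reads = ["GGACGTACTT", "ACGTACAA", "CCTTGGCCAA", "NNNNNNNN"]

counts = count_aptamers(reads, identifying_seqs)
for name, n in sorted(counts.items()):
    print(name, n)
```

Since each recovered aptamer stands in for a bound biomarker molecule, the per-aptamer read counts serve as relative biomarker values; converting them to absolute amounts would require normalization or calibration that this sketch does not attempt.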

使用免疫测定确定生物标记值Determination of biomarker values using immunoassays

免疫测定方法基于抗体与其对应靶标或分析物的反应,并且根据特定测定形式可以检测样品中的分析物。为了改进基于免疫反应性的测定方法的特异性和灵敏性,通常由于单克隆抗体的特异性表位识别而使用单克隆抗体。多克隆抗体由于其与单克隆抗体相比增加的靶标亲和力而成功地用于各种免疫测定中。免疫测定已经设计为用于大范围生物学样品基质。免疫测定形式已经设计为提供定性、半定量和定量结果。Immunoassays are based on the reaction of antibodies with their corresponding targets or analytes, and depending on the specific assay format, can detect analytes in samples. In order to improve the specificity and sensitivity of immunoreactivity-based assays, monoclonal antibodies are typically used due to their specific epitope recognition. Polyclonal antibodies have been successfully used in various immunoassays due to their increased target affinity compared to monoclonal antibodies. Immunoassays have been designed for use with a wide range of biological sample matrices. Immunoassay formats have been designed to provide qualitative, semi-quantitative, and quantitative results.

Quantitative results are generated through the use of a standard curve created with known concentrations of the specific analyte to be detected. The response or signal from an unknown sample is plotted onto the standard curve, and a quantity or value corresponding to the target in the unknown sample is determined.
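Reading an unknown off a standard curve can be sketched as below. This is a minimal illustration that assumes the signal rises monotonically with concentration and that linear interpolation between neighboring standards is adequate; real immunoassay curves are often fitted with other models. The function name and the numbers are hypothetical.

```python
def interpolate_concentration(standards, signal):
    """Estimate the concentration that corresponds to a measured signal.

    standards: list of (concentration, signal) pairs from known standards.
    signal: measured response of the unknown sample.
    Assumes signal increases with concentration; interpolates linearly
    between the two standards that bracket the measured signal.
    """
    if len(standards) < 2:
        raise ValueError("at least two standards are required")
    points = sorted(standards, key=lambda p: p[1])
    if not points[0][1] <= signal <= points[-1][1]:
        raise ValueError("signal is outside the range of the standard curve")
    for (c0, s0), (c1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:
                return c0
            return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)
    raise ValueError("no bracketing standards found")


if __name__ == "__main__":
    # (concentration, signal) pairs; values are made up for illustration
    curve = [(0.0, 0.05), (10.0, 0.25), (50.0, 1.05), (100.0, 2.05)]
    print(interpolate_concentration(curve, 0.65))
```

Signals outside the range spanned by the standards raise an error rather than being extrapolated, which mirrors the usual practice of diluting and re-measuring out-of-range samples.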

Numerous immunoassay formats have been designed. ELISA or EIA can be quantitative for the detection of an analyte. This method relies on the attachment of a label to either the analyte or the antibody, and the label component includes, either directly or indirectly, an enzyme. ELISA tests may be formatted for direct, indirect, competitive, or sandwich detection of the analyte. Other methods rely on labels such as radioisotopes (I-125) or fluorescence. Additional techniques include, for example, agglutination, nephelometry, turbidimetry, Western blot, immunoprecipitation, immunocytochemistry, immunohistochemistry, flow cytometry, Luminex assay, and others (see ImmunoAssay: A Practical Guide, edited by Brian Law, published by Taylor & Francis, Ltd., 2005 edition).

Exemplary assay formats include enzyme-linked immunosorbent assay (ELISA), radioimmunoassay, fluorescent, chemiluminescence, and fluorescence resonance energy transfer (FRET) or time-resolved FRET (TR-FRET) immunoassays. Examples of procedures for detecting biomarkers include biomarker immunoprecipitation followed by quantitative methods that allow size and peptide level discrimination, such as gel electrophoresis, capillary electrophoresis, planar electrochromatography, and the like.

Methods of detecting and/or quantifying a detectable label or signal-generating material depend on the nature of the label. The products of reactions catalyzed by appropriate enzymes (where the detectable label is an enzyme; see above) can be, without limitation, fluorescent, luminescent, or radioactive, or they may absorb visible or ultraviolet light. Examples of detectors suitable for detecting such detectable labels include, without limitation, x-ray film, radioactivity counters, scintillation counters, spectrophotometers, colorimeters, fluorometers, luminometers, and densitometers.

Any of the methods for detection can be performed in any format that allows for suitable preparation, processing, and analysis of the reactions. This can be, for example, in multi-well assay plates (e.g., 96 wells or 384 wells) or using any suitable array or microarray. Stock solutions for various reagents can be made manually or robotically, and all subsequent pipetting, diluting, mixing, distribution, washing, incubating, sample readout, data collection, and analysis can be done robotically using commercially available analysis software, robotics, and detection instrumentation capable of detecting a detectable label.

Determination of Biomarker Values Using Gene Expression Profiling

Measuring mRNA in a biological sample may be used as a surrogate for detection of the level of the corresponding protein in the biological sample. Thus, any of the biomarkers or biomarker panels described herein can also be detected by detecting the appropriate RNA.

mRNA expression levels are measured by reverse transcription quantitative polymerase chain reaction (RT-PCR followed by qPCR). RT-PCR is used to create a cDNA from the mRNA. The cDNA may be used in a qPCR assay to produce fluorescence as the DNA amplification process progresses. By comparison to a standard curve, qPCR can produce an absolute measurement such as number of copies of mRNA per cell. Northern blots, microarrays, Invader assays, and RT-PCR combined with capillary electrophoresis have all been used to measure expression levels of mRNA in a sample. See Gene Expression Profiling: Methods and Protocols, Richard A. Shimkets, editor, Humana Press, 2004.
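The standard-curve comparison mentioned for qPCR can be sketched as follows. This assumes the common log-linear model in which the threshold cycle (Ct) falls linearly with log10 of the starting copy number; the model, the function names, and the dilution-series values are assumptions made for illustration and are not specified in the text. The result is copies per reaction; converting to copies per cell would additionally require the number of cells contributing to the reaction.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the fitted line to estimate starting copies for an unknown."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Per-cycle efficiency implied by the slope (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


if __name__ == "__main__":
    # Hypothetical ten-fold dilution series of a quantified standard
    std_copies = [1e2, 1e3, 1e4, 1e5]
    std_ct = [30.00, 26.68, 23.36, 20.04]
    slope, intercept = fit_standard_curve(std_copies, std_ct)
    print(slope, intercept)
    print(copies_from_ct(25.02, slope, intercept))
    print(amplification_efficiency(slope))
```

A slope near -3.32 corresponds to roughly one doubling per cycle, which is why the efficiency helper is a convenient sanity check on the standards before any unknown is quantified.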

miRNA molecules are small RNAs that are non-coding but may regulate gene expression. Any of the methods suited to the measurement of mRNA expression levels can also be used for the corresponding miRNA. Recently, many laboratories have investigated the use of miRNAs as biomarkers for disease. Many diseases involve widespread transcriptional regulation, and it is not surprising that miRNAs might find a role as biomarkers. The connection between miRNA concentrations and disease is often less clear than the connection between protein levels and disease, yet the value of miRNA biomarkers might be substantial. Of course, as with any RNA expressed differentially during disease, the problems facing the development of an in vitro diagnostic product include the requirement that the miRNAs survive in the diseased cell and are easily extracted for analysis, or that the miRNAs are released into blood or other matrices where they must survive long enough to be measured. Protein biomarkers have similar requirements, although many potential protein biomarkers are intentionally secreted at the site of pathology and function, during disease, in a paracrine fashion. Many potential protein biomarkers are designed to function outside the cells within which those proteins are synthesized.

Detection of Molecular Markers Using In Vivo Molecular Imaging Technologies

Any of the described biomarkers (see Table 1, column 2) may also be used in molecular imaging tests. For example, an imaging agent can be coupled to any of the described biomarkers, which can be used to aid in pancreatic cancer diagnosis, to monitor disease progression/remission or metastasis, to monitor for disease recurrence, or to monitor response to therapy, among other uses.

In vivo imaging technologies provide non-invasive methods for determining the state of a particular disease in the body of an individual. For example, entire portions of the body, or even the entire body, may be viewed as a three-dimensional image, thereby providing valuable information concerning morphology and structures in the body. Such technologies may be combined with the detection of the biomarkers described herein to provide information concerning the cancer status, in particular the pancreatic cancer status, of an individual.

The use of in vivo molecular imaging technologies is expanding due to various advances in technology. These advances include the development of new contrast agents or labels, such as radiolabels and/or fluorescent labels, which can provide strong signals within the body; and the development of powerful new imaging technology, which can detect and analyze these signals from outside the body with sufficient sensitivity and accuracy to provide useful information. The contrast agent can be visualized in an appropriate imaging system, thereby providing an image of the portion or portions of the body in which the contrast agent is located. The contrast agent may be bound to or associated with a capture reagent, such as an aptamer or an antibody, and/or with a peptide or protein, or an oligonucleotide (for example, for the detection of gene expression), or a complex containing any of these with one or more macromolecules and/or other particulate forms.

The contrast agent may also feature a radioactive atom that is useful in imaging. Suitable radioactive atoms include technetium-99m or iodine-123 for scintigraphic studies. Other readily detectable moieties include, for example, spin labels for magnetic resonance imaging (MRI), such as iodine-123, iodine-131, indium-111, fluorine-19, carbon-13, nitrogen-15, oxygen-17, gadolinium, manganese, or iron. Such labels are well known in the art and could easily be selected by one of ordinary skill in the art.

Standard imaging techniques include, but are not limited to, magnetic resonance imaging, computed tomography scanning, positron emission tomography (PET), single photon emission computed tomography (SPECT), and the like. For diagnostic in vivo imaging, the type of detection instrument available is a major factor in selecting a given contrast agent, such as a given radionuclide, and the particular biomarker that it is used to target (protein, mRNA, and the like). The radionuclide chosen typically has a type of decay that is detectable by a given type of instrument. In addition, when selecting a radionuclide for in vivo diagnosis, its half-life should be long enough to enable detection at the time of maximum uptake by the target tissue, but short enough that deleterious radiation of the host is minimized.
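The half-life trade-off described above follows from simple exponential decay, and can be sketched as below. The half-life figures are approximate, commonly quoted physical half-lives supplied here for illustration only; the four-hour uptake delay is a hypothetical value, and the sketch ignores biological clearance, which also affects the dose to the individual.

```python
def fraction_remaining(elapsed_hours, half_life_hours):
    """Fraction of the initial activity left after a given time."""
    if half_life_hours <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (elapsed_hours / half_life_hours)


# Approximate physical half-lives in hours (illustrative values)
APPROX_HALF_LIFE_HOURS = {
    "technetium-99m": 6.0,
    "iodine-123": 13.2,
    "fluorine-18": 1.83,
}

if __name__ == "__main__":
    uptake_delay_hours = 4.0  # hypothetical time to maximum target uptake
    for nuclide, half_life in APPROX_HALF_LIFE_HOURS.items():
        left = fraction_remaining(uptake_delay_hours, half_life)
        print(f"{nuclide}: {left:.2f} of initial activity at imaging time")
```

A nuclide with a half-life much shorter than the uptake delay leaves little signal to image, while a much longer half-life prolongs exposure after the image has been taken.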

Exemplary imaging techniques include, but are not limited to, PET and SPECT, which are imaging techniques in which a radionuclide is administered systemically or locally to an individual. The subsequent uptake of the radiotracer is measured over time and used to obtain information about the targeted tissue and the biomarker. Because of the high-energy (gamma-ray) emissions of the specific isotopes employed and the sensitivity and sophistication of the instruments used to detect them, the two-dimensional distribution of radioactivity may be inferred from outside of the body.

Commonly used positron-emitting nuclides in PET include, for example, carbon-11, nitrogen-13, oxygen-15, and fluorine-18. Isotopes that decay by electron capture and/or gamma emission are used in SPECT and include, for example, iodine-123 and technetium-99m. An exemplary method for labeling amino acids with technetium-99m is the reduction of pertechnetate ion in the presence of a chelating precursor to form the labile technetium-99m-precursor complex, which, in turn, reacts with the metal binding group of a bifunctionally modified chemotactic peptide to form a technetium-99m-chemotactic peptide conjugate.

Antibodies are frequently used for such in vivo imaging diagnostic methods. The preparation and use of antibodies for in vivo diagnosis is well known in the art. Labeled antibodies that specifically bind any of the biomarkers in Table 1, column 2 can be injected into an individual suspected of having a certain type of cancer (e.g., pancreatic cancer), detectable according to the particular biomarker used, for the purpose of diagnosing or evaluating the disease status of the individual. The label used is selected in accordance with the imaging modality to be used, as previously described. Localization of the label permits determination of the spread of the cancer. The amount of label within an organ or tissue also allows determination of the presence or absence of cancer in that organ or tissue.

Similarly, aptamers may be used for such in vivo imaging diagnostic methods. For example, an aptamer that was used to identify a particular biomarker described in Table 1, column 2 (and therefore binds specifically to that particular biomarker) may be appropriately labeled and injected into an individual suspected of having pancreatic cancer, detectable according to the particular biomarker, for the purpose of diagnosing or evaluating the pancreatic cancer status of the individual. The label used is selected in accordance with the imaging modality to be used, as previously described. Localization of the label permits determination of the spread of the cancer. The amount of label within an organ or tissue also allows determination of the presence or absence of cancer in that organ or tissue. Aptamer-directed imaging agents could have unique and advantageous characteristics relating to tissue penetration, tissue distribution, kinetics, elimination, potency, and selectivity as compared to other imaging agents.

Such techniques may also optionally be performed with labeled oligonucleotides, for example, for detection of gene expression through imaging with antisense oligonucleotides. These methods are used for in situ hybridization, for example, with fluorescent molecules or radionuclides as the label. Other methods for detection of gene expression include, for example, detection of the activity of a reporter gene.

Another general type of imaging technology is optical imaging, in which fluorescent signals within the subject are detected by an optical device that is external to the subject. These signals may be due to actual fluorescence and/or to bioluminescence. Improvements in the sensitivity of optical detection devices have increased the usefulness of optical imaging for in vivo diagnostic assays.

The use of in vivo molecular biomarker imaging is increasing, including for clinical trials, for example, to measure clinical efficacy more rapidly in trials of new cancer therapies and/or to avoid prolonged treatment with a placebo for diseases such as multiple sclerosis, in which such prolonged treatment may be considered ethically questionable.

For a review of other techniques, see N. Blow, Nature Methods, 6, 465-469, 2009.

Determination of Biomarker Values Using Histology/Cytology Methods

For evaluation of pancreatic cancer, a variety of tissue samples may be used in histological or cytological methods. Sample selection depends on the primary tumor location and sites of metastases. For example, tissue samples (forceps biopsy, fine needle aspiration (FNA), and/or brush cytology) collected at the time of endoscopic retrograde cholangiopancreatography (ERCP) or endoscopic ultrasound (EUS)-guided FNA can be used for histology. Ascites, peritoneal washings, or pancreatic fluid can be used for cytology. Any of the biomarkers identified herein that were shown to be up-regulated in individuals with pancreatic cancer (see Table 1, column 6) can be used to stain a histological specimen as an indication of disease.

In one embodiment, one or more capture reagents specific to the corresponding biomarkers are used in a cytological evaluation of a pancreatic cell sample, which may include one or more of the following: collecting a cell sample, fixing the cell sample, dehydrating, clearing, immobilizing the cell sample on a microscope slide, permeabilizing the cell sample, treating for analyte retrieval, staining, destaining, washing, blocking, and reacting with one or more capture reagents in a buffered solution. In another embodiment, the cell sample is produced from a cell block.

In another embodiment, one or more capture reagents specific to the corresponding biomarkers are used in a histological evaluation of a pancreatic tissue sample, which may include one or more of the following: collecting a tissue specimen, fixing the tissue sample, dehydrating, clearing, immobilizing the tissue sample on a microscope slide, permeabilizing the tissue sample, treating for analyte retrieval, staining, destaining, washing, blocking, rehydrating, and reacting with one or more capture reagents in a buffered solution. In another embodiment, fixing and dehydrating are replaced with freezing.

In another embodiment, one or more aptamers specific to the corresponding biomarkers are reacted with the histological or cytological sample and can serve as the nucleic acid target in a nucleic acid amplification method. Suitable nucleic acid amplification methods include, for example, PCR, Q-beta replicase, rolling circle amplification, strand displacement, helicase-dependent amplification, loop-mediated isothermal amplification, ligase chain reaction, and restriction- and circularization-aided rolling circle amplification.

In one embodiment, the one or more capture reagents specific to the corresponding biomarkers for use in the histological or cytological evaluation are mixed in a buffered solution that can include any of the following: blocking materials, competitors, detergents, stabilizers, carrier nucleic acid, polyanionic materials, and the like.

A "cytology protocol" generally includes sample collection, sample fixation, sample immobilization, and staining. "Cell preparation" can include several processing steps after sample collection, including the use of one or more slow off-rate aptamers for the staining of the prepared cells.

Sample collection can include directly placing the sample in an untreated transport container, placing the sample in a transport container containing some type of media, or placing the sample directly onto a slide (immobilization) without any treatment or fixation.

Sample immobilization can be improved by applying a portion of the collected specimen to a glass slide that has been treated with polylysine, gelatin, or a silane. Slides can be prepared by smearing a thin, even layer of cells across the slide. Care is generally taken to minimize mechanical distortion and drying artifacts. Liquid specimens can be processed by a cell block method. Alternatively, liquid specimens can be mixed 1:1 with the fixative solution for about 10 minutes at room temperature.

Cell blocks can be prepared from residual effusions, sputum, urine sediments, gastrointestinal fluids, cell scrapings, or fine needle aspirates. Cells are concentrated or packed by centrifugation or membrane filtration. A number of methods for cell block preparation have been developed. Representative procedures include the fixed sediment, bacterial agar, and membrane filtration methods. In the fixed sediment method, the cell sediment is mixed with a fixative such as Bouin's solution, picric acid, or buffered formalin, and the mixture is then centrifuged to pellet the fixed cells. The supernatant is removed, and the cell pellet is dried as completely as possible. The pellet is collected, wrapped in lens paper, and placed in a tissue cassette. The tissue cassette is placed in a jar with additional fixative and processed as a tissue sample. The agar method is very similar, except that the pellet is removed and dried on a paper towel and then cut in half. The cut side is placed in a drop of melted agar on a glass slide, and the pellet is then covered with agar, making sure that no bubbles form in the agar. The agar is allowed to harden, and any excess agar is trimmed away. The block is placed in a tissue cassette, and tissue processing is completed. Alternatively, the pellet may be suspended directly in 2% liquid agar at 65°C and the sample centrifuged. The agar cell pellet is allowed to solidify for an hour at 4°C. The solid agar may be removed from the centrifuge tube and sliced in half. The agar is wrapped in filter paper and then placed in the tissue cassette. Processing from this point forward is as described above. Centrifugation can be replaced in any of these procedures with membrane filtration. Any of these processes may be used to generate a "cell block sample."

Cell blocks can be prepared using specialized resins, including Lowicryl resins, LR White, LR Gold, Unicryl, and MonoStep. These resins have low viscosity and can be polymerized at low temperatures and with ultraviolet (UV) light. The embedding process relies on progressively cooling the sample during dehydration, transferring the sample to the resin, and polymerizing a block at the final low temperature at the appropriate UV wavelength.

Cell block sections can be stained with hematoxylin-eosin for cytomorphological examination, while additional sections are used for examination of specific markers.

Whether the process is cytological or histological, the sample may be fixed prior to additional processing to prevent sample degradation. This process is called "fixation" and covers a wide range of materials and procedures that may be used interchangeably. The sample fixation protocol and reagents are best selected empirically based on the targets to be detected and the specific cell/tissue type to be analyzed. Sample fixation relies on reagents such as ethanol, polyethylene glycol, methanol, formalin, or isopropanol. The samples should be fixed as soon after collection and affixation to the slide as possible. However, the fixative selected can introduce structural changes into various molecular targets, making their subsequent detection more difficult. The fixation and immobilization processes and their sequence can modify the appearance of the cell, and these changes must be anticipated and recognized by the cytotechnologist. Fixatives can cause shrinkage of certain cell types and cause the cytoplasm to appear granular or reticular. Many fixatives function by crosslinking cellular components. This can damage or modify specific epitopes, generate new epitopes, cause molecular associations, and reduce membrane permeability. Formalin fixation is one of the most common cytological/histological approaches. Formalin forms methylene bridges between neighboring proteins or within proteins. Precipitation or coagulation is also used for fixation, and ethanol is frequently used in this type of fixation. A combination of crosslinking and precipitation can also be used for fixation. A strong fixation process is best at preserving morphological information, while a weaker fixation process is best for the preservation of molecular targets.

A representative fixative is 50% absolute ethanol, 2 mM polyethylene glycol (PEG), and 1.85% formaldehyde. Variations on this formulation include ethanol (50% to 95%), methanol (20% to 50%), and formalin (formaldehyde) only. Another common fixative is 2% PEG 1500, 50% ethanol, and 3% methanol. Slides are placed in the fixative for about 10 to 15 minutes at room temperature and then removed and allowed to dry. Once slides are fixed, they can be rinsed with a buffered solution such as PBS.

A wide range of dyes can be used to differentially highlight and contrast or "stain" cellular, sub-cellular, and tissue features or morphological structures. Hematoxylin is used to stain nuclei a blue or black color. Orange G-6 and Eosin Azure both stain the cell's cytoplasm. Orange G stains keratin- and glycogen-containing cells yellow. Eosin Y is used to stain nucleoli, cilia, red blood cells, and superficial epithelial squamous cells. Romanowsky stains are used for air-dried slides and are useful in enhancing pleomorphism and distinguishing extracellular from intracytoplasmic material.

The staining process can include a treatment to increase the permeability of the cells to the stain. Treatment of the cells with a detergent can be used to increase permeability. To increase cell and tissue permeability, fixed samples can be further treated with solvents, saponins, or non-ionic detergents. Enzymatic digestion can also improve the accessibility of specific targets in a tissue sample.

After staining, the sample is dehydrated using a succession of alcohol rinses with increasing alcohol concentration. The final wash is done with xylene or a xylene substitute, such as a citrus terpene, that has a refractive index close to that of the coverslip to be applied to the slide. This final step is referred to as clearing. Once the sample is dehydrated and cleared, a mounting medium is applied. The mounting medium is selected to have a refractive index close to that of glass and is capable of bonding the coverslip to the slide. It also inhibits additional drying, shrinking, or fading of the cell sample.

Regardless of the stains or processing used, the final evaluation of the pancreatic cytological specimen is made by some type of microscopy to permit a visual inspection of the morphology and a determination of the marker's presence or absence. Exemplary microscopic methods include brightfield, phase contrast, fluorescence, and differential interference contrast.

If secondary tests are required on the sample after examination, the coverslip may be removed and the slide destained. Destaining involves using the original solvent systems in which the slide was stained, without the added dye, in the reverse order of the original staining procedure. Destaining may also be completed by soaking the slide in an acid alcohol until the cells are colorless. Once colorless, the slides are rinsed well in a water bath and the second staining procedure is applied.

In addition, specific molecular differentiation may be combined with the cellular morphological analysis through the use of specific molecular reagents such as antibodies, nucleic acid probes, or aptamers. This improves the accuracy of diagnostic cytology. Microdissection can be used to isolate a subset of cells for additional evaluation, in particular for genetic evaluation of abnormal chromosomes, gene expression, or mutations.

Preparation of a tissue sample for histological evaluation involves fixation, dehydration, infiltration, embedding, and sectioning. The fixation reagents used in histology are very similar or identical to those used in cytology and have the same issues of preserving morphological features at the expense of molecular ones such as individual proteins. Time can be saved if the tissue sample is not fixed and dehydrated but is instead frozen and then sectioned while frozen. This is a gentler processing procedure and can preserve more individual markers. However, freezing is not acceptable for long-term storage of a tissue sample, because subcellular information is lost due to the introduction of ice crystals. Ice in the frozen tissue sample also prevents the sectioning process from producing a very thin slice, and thus some microscopic resolution and imaging of subcellular structures can be lost. In addition to formalin fixation, osmium tetroxide is used to fix and stain phospholipids (membranes).

Dehydration of tissues is accomplished with successive washes of increasing alcohol concentration. Clearing employs a material that is miscible with alcohol and the embedding material and involves a stepwise process starting at 50:50 alcohol:clearing reagent and then 100% clearing agent (xylene or a xylene substitute). Infiltration involves incubating the tissue with a liquid form of the embedding agent (warm wax, nitrocellulose solution), first at 50:50 embedding agent:clearing agent and then at 100% embedding agent. Embedding is completed by placing the tissue in a mold or cassette and filling it with melted embedding agent such as wax, agar, or gelatin. The embedding agent is allowed to harden. The hardened tissue sample may then be sliced into thin sections for staining and subsequent examination.

Prior to staining, tissue sections are dewaxed and rehydrated. Sections are dewaxed with xylene, which may be changed one or more times, and rehydrated by successive washes in decreasing concentrations of alcohol. Prior to dewaxing, tissue sections may be heat-fixed to glass slides at approximately 80°C for approximately 20 minutes.

Laser capture microdissection allows the isolation of subsets of cells from tissue sections for further analysis.

As in cytology, tissue sections or slices can be stained with a variety of stains to enhance the visualization of microscopic features. Many commercially available stains can be used to enhance or identify specific features.

To further increase the interaction of molecular reagents with cytological/histological samples, a number of "analyte retrieval" techniques have been developed. The first such technique uses high-temperature heating of a fixed sample. This method is also known as heat-induced epitope retrieval, or HIER. Many heating techniques have been used, including steam heating, microwaving, autoclaving, water baths, and pressure cooking, or combinations of these heating methods. Analyte retrieval solutions include, for example, water, citrate, and normal saline buffers. The key to analyte retrieval is the time at high temperature, but lower temperatures for longer times have also been used successfully. Another key to analyte retrieval is the pH of the heating solution. Low pH has been found to provide the best immunostaining, but it also gives rise to background that frequently requires a second tissue section as a negative control. Regardless of buffer composition, a high-pH solution generally yields the most consistent benefit (increased immunostaining without increased background). The analyte retrieval process for a specific target is optimized empirically, using heat, time, pH, and buffer composition as the variables. Microwave analyte retrieval allows different targets to be stained sequentially with antibody reagents. However, the time required to form antibody and enzyme complexes between staining steps has also been shown to degrade cell membrane analytes. Microwave heating methods have also improved in situ hybridization methods.

To begin the analyte retrieval process, the sections are first dewaxed and hydrated. The slides are then placed in 10 mM sodium citrate buffer, pH 6.0, in a dish or jar. A representative procedure uses an 1100 W microwave oven: the slides are microwaved at 100% power for 2 minutes and then, after confirming that they remain covered in liquid, at 20% power for 18 minutes. The slides are then allowed to cool in the uncovered container and rinsed with distilled water. HIER may be used in combination with enzymatic digestion to improve the reactivity of the target to immunochemical reagents.

One such enzymatic digestion protocol uses proteinase K. Proteinase K is prepared at a concentration of 20 μg/ml in 50 mM Tris base, 1 mM EDTA, 0.5% Triton X-100, pH 8.0 buffer. The process first involves dewaxing the sections in two changes of xylene, 5 minutes each. The sample is then hydrated in two changes of 100% ethanol for 3 minutes each, in 95% and 80% ethanol for 1 minute each, and then rinsed in distilled water. The sections are covered with proteinase K working solution and incubated for 10-20 minutes at 37°C in a humidified chamber (the optimal incubation time may vary with tissue type and degree of fixation). The sections are cooled at room temperature for 10 minutes and then rinsed twice in PBS Tween 20 for 2 minutes each. If desired, the sections can be blocked to eliminate potential interference from endogenous compounds and enzymes. The sections are then incubated with primary antibody, appropriately diluted in primary antibody dilution buffer, for 1 hour at room temperature or overnight at 4°C. The sections are then rinsed twice in PBS Tween 20 for 2 minutes each. If required for the specific application, additional blocking can be performed, followed by three further 2-minute rinses in PBS Tween 20, before the immunostaining protocol is completed.

A brief treatment with 1% SDS at room temperature has also been shown to improve immunohistochemical staining. Analyte retrieval methods have been applied to slide-mounted sections as well as to free-floating sections. Another treatment option is to place the slide in a jar containing citric acid and 0.1% Nonidet P40 at pH 6.0 and heat it to 95°C. The slide is then washed with a buffer solution such as PBS.

For immunological staining of tissues, nonspecific binding of the antibody to tissue proteins can be blocked by soaking the section in a protein solution such as serum or nonfat dry milk.

Blocking reactions may include the need to reduce the level of endogenous biotin; eliminate endogenous charge effects; inactivate endogenous nucleases; and/or inactivate endogenous enzymes such as peroxidase and alkaline phosphatase. Endogenous nucleases can be inactivated by degradation with proteinase K; by heat treatment; by use of a chelating agent such as EDTA or EGTA; by the introduction of carrier DNA or RNA; or by treatment with a chaotrope such as urea, thiourea, guanidine hydrochloride, guanidine thiocyanate, lithium perchlorate, etc., or with diethyl pyrocarbonate. Alkaline phosphatase can be inactivated by treatment with 0.1 N HCl for 5 minutes at room temperature or by treatment with 1 mM levamisole. Peroxidase activity can be eliminated by treatment with 0.03% hydrogen peroxide. Endogenous biotin can be blocked by soaking the slide or section in an avidin solution (streptavidin or neutravidin may be substituted) for at least 15 minutes at room temperature. The slide or section is then washed in buffer for at least 10 minutes. This step can be repeated at least three times. The slide or section is then soaked in a biotin solution for 10 minutes. This step can be repeated at least three times, using fresh biotin solution each time. The buffer wash procedure is then repeated. Blocking protocols should be kept to a minimum to avoid damaging the cell or tissue structure or the target or targets of interest, but one or more of these protocols can be combined to "block" a slide or section prior to reaction with one or more slow off-rate aptamers. See Basic Medical Histology: The Biology of Cells, Tissues and Organs, Richard G. Kessel, Oxford University Press, 1998.

Determination of biomarker values using mass spectrometry

A variety of mass spectrometer configurations can be used to detect biomarker values. Several types of mass spectrometers are available or can be produced in various configurations. In general, a mass spectrometer has the following major components: a sample inlet, an ion source, a mass analyzer, a detector, a vacuum system, and an instrument control system and data system. Differences in the sample inlet, ion source, and mass analyzer generally define the type of instrument and its capabilities. For example, the inlet can be a capillary-column liquid chromatography source, or it can be a direct probe or stage such as that used in matrix-assisted laser desorption ionization. Common ion sources are, for example, electrospray, including nanospray and microspray, or matrix-assisted laser desorption ionization. Common mass analyzers include quadrupole mass filters, ion trap mass analyzers, and time-of-flight mass analyzers. Other mass spectrometry methods are well known in the art (see Burlingame et al. Anal. Chem. 70:647R-716R (1998); Kinter and Sherman, New York (2000)).

Protein biomarkers and biomarker values can be detected and measured by any of the following: electrospray ionization mass spectrometry (ESI-MS), ESI-MS/MS, ESI-MS/(MS)n, matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF-MS), surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS), desorption/ionization on silicon (DIOS), secondary ion mass spectrometry (SIMS), quadrupole time-of-flight (Q-TOF), tandem time-of-flight (TOF/TOF) technology called ultraflex III TOF/TOF, atmospheric pressure chemical ionization mass spectrometry (APCI-MS), APCI-MS/MS, APCI-(MS)n, atmospheric pressure photoionization mass spectrometry (APPI-MS), APPI-MS/MS, APPI-(MS)n, quadrupole mass spectrometry, Fourier transform mass spectrometry (FTMS), quantitative mass spectrometry, and ion trap mass spectrometry.

Sample preparation strategies are used to label and enrich samples before mass spectrometric characterization of protein biomarkers and determination of biomarker values. Labeling methods include, but are not limited to, isobaric tags for relative and absolute quantitation (iTRAQ) and stable isotope labeling with amino acids in cell culture (SILAC). Capture reagents used to selectively enrich samples for candidate biomarker proteins prior to mass spectrometric analysis include, but are not limited to, aptamers, antibodies, nucleic acid probes, chimeras, small molecules, F(ab')2 fragments, single-chain antibody fragments, Fv fragments, single-chain Fv fragments, nucleic acids, lectins, ligand-binding receptors, affybodies, nanobodies, ankyrins, domain antibodies, alternative antibody scaffolds (e.g., diabodies), imprinted polymers, avimers, peptidomimetics, peptoids, peptide nucleic acids, threose nucleic acids, hormone receptors, cytokine receptors, and synthetic receptors, as well as modifications and fragments of these.

Determination of biomarker values using a proximity ligation assay

A proximity ligation assay can be used to determine biomarker values. Briefly, a test sample is contacted with a pair of affinity probes, which may be a pair of antibodies or a pair of aptamers, with each member of the pair extended with an oligonucleotide. The targets for the pair of affinity probes may be two distinct determinants on one protein, or one determinant on each of two different proteins, which may exist as homo- or hetero-multimeric complexes. When the probes bind to the target determinants, the free ends of the oligonucleotide extensions are brought close enough together to hybridize. Hybridization of the oligonucleotide extensions is facilitated by a common connector oligonucleotide, which serves to bridge the extensions when they are positioned in sufficient proximity. Once the oligonucleotide extensions of the probes have hybridized, the ends of the extensions are joined by enzymatic DNA ligation.

Each oligonucleotide extension comprises a primer site for PCR amplification. Once the oligonucleotide extensions are ligated together, the oligonucleotides form a continuous DNA sequence which, through PCR amplification, reveals information about the identity and amount of the target protein and, where the target determinants are on two different proteins, about protein-protein interactions. Through the use of real-time PCR, proximity ligation can provide a highly sensitive and specific assay for real-time protein concentration and interaction information. Probes that do not bind the determinants of interest do not have their oligonucleotide extensions brought into proximity, so no ligation or PCR amplification can proceed and no signal is produced.

The foregoing assays enable the detection of biomarker values that are useful in methods for diagnosing pancreatic cancer, where the methods comprise detecting, in a biological sample from an individual, at least N biomarker values that each correspond to a biomarker selected from the group of biomarkers provided in Table 1, Col. 2, and where a classification of the biomarker values, as described in detail below, indicates whether the individual has pancreatic cancer. While certain of the described pancreatic cancer biomarkers are useful alone for detecting and diagnosing pancreatic cancer, methods are also described herein for grouping multiple subsets of the pancreatic cancer biomarkers, each of which is useful as a panel of three or more biomarkers. Thus, various embodiments of the present application provide combinations comprising N biomarkers, wherein N is at least three biomarkers. In other embodiments, N is selected to be any number from 2 to 65 biomarkers. It will be appreciated that N can be selected to be any number from any of the above-described ranges, as well as from similar but higher-order ranges. In accordance with any of the methods described herein, biomarker values can be detected and classified individually, or they can be detected and classified collectively, for example in a multiplex assay format.

In another aspect, methods are provided for detecting the absence of pancreatic cancer, the methods comprising detecting, in a biological sample from an individual, at least N biomarker values that each correspond to a biomarker selected from the group of biomarkers provided in Table 1, Col. 2, where a classification of the biomarker values, as described in detail below, indicates the absence of pancreatic cancer in the individual. While certain of the described pancreatic cancer biomarkers are useful alone for detecting and diagnosing the absence of pancreatic cancer, methods are also described herein for grouping multiple subsets of the pancreatic cancer biomarkers, each of which is useful as a panel of three or more biomarkers. Thus, various embodiments of the present application provide combinations comprising N biomarkers, wherein N is at least three biomarkers. In other embodiments, N is selected to be any number from 2 to 65 biomarkers. It will be appreciated that N can be selected to be any number from any of the above-described ranges, as well as from similar but higher-order ranges. In accordance with any of the methods described herein, biomarker values can be detected and classified individually, or they can be detected and classified collectively, for example in a multiplex assay format.

Classification of biomarkers and calculation of disease scores

A biomarker "signature" for a given diagnostic test contains a set of markers, each having different levels in the populations of interest. Different levels, in this context, may refer to different means of the marker levels for the individuals in two or more groups, to different variances in the two or more groups, or to a combination of both. For the simplest form of a diagnostic test, these markers can be used to assign an unknown sample from an individual to one of two groups, either diseased or not diseased. The assignment of a sample to one of two or more groups is known as classification, and the procedure used to accomplish this assignment is known as a classifier or a classification method. Classification methods may also be referred to as scoring methods. Many classification methods can be used to construct a diagnostic classifier from a set of biomarker values. In general, classification methods are most easily performed using supervised learning techniques, in which a data set is collected using samples obtained from individuals within the two (or more, for multiple classification states) distinct groups one wishes to distinguish. Because the class (group or population) to which each sample belongs is known in advance, the classification method can be trained to give the desired classification response. Unsupervised learning techniques can also be used to produce a diagnostic classifier.

Common approaches for developing diagnostic classifiers include decision trees; bagging, boosting, forests, and random forests; rule-inference-based learning; Parzen windows; linear models; logistic models; neural network methods; unsupervised clustering; K-means; hierarchical ascending/descending; semi-supervised learning; prototype methods; nearest neighbor; kernel density estimation; support vector machines; hidden Markov models; and Boltzmann learning. Classifiers may be combined either simply or in ways that minimize particular objective functions. For a review, see, e.g., Pattern Classification, R.O. Duda, et al., editors, John Wiley & Sons, 2nd edition, 2001; see also The Elements of Statistical Learning - Data Mining, Inference, and Prediction, T. Hastie, et al., editors, Springer Science+Business Media, LLC, 2nd edition, 2009; each of which is incorporated herein by reference in its entirety.

To produce a classifier using supervised learning techniques, a set of samples called training data is obtained. In the context of diagnostic tests, the training data includes samples from the distinct groups (classes) to which unknown samples will later be assigned. For example, samples collected from individuals in a control population and from individuals in a particular disease population can constitute training data for developing a classifier that can classify unknown samples (or, more particularly, the individuals from whom the samples were obtained) as either having the disease or being free of the disease. Developing the classifier from the training data is known as training the classifier. The specific details of classifier training depend on the nature of the supervised learning technique. By way of illustration, an example of training a naive Bayesian classifier is described below (see, e.g., Pattern Classification, R.O. Duda, et al., editors, John Wiley & Sons, 2nd edition, 2001; see also The Elements of Statistical Learning - Data Mining, Inference, and Prediction, T. Hastie, et al., editors, Springer Science+Business Media, LLC, 2nd edition, 2009).

Because there are typically many more potential biomarker values than samples in a training set, care must be taken to avoid overfitting. Overfitting occurs when a statistical model describes random error or noise instead of the underlying relationship. Overfitting can be avoided in a variety of ways, including, for example, by limiting the number of markers used in developing the classifier, by assuming that the marker responses are independent of one another, by limiting the complexity of the underlying statistical model employed, and by ensuring that the underlying statistical model conforms to the data.

An illustrative example of developing a diagnostic test from a set of biomarkers is the application of a naive Bayes classifier, a simple probabilistic classifier based on Bayes' theorem that treats the biomarkers as strictly independent. Each biomarker is described by a class-dependent probability density function (pdf) for the measured RFU (relative fluorescence unit) values or log RFU values in each class. The joint pdf for the set of markers in one class is assumed to be the product of the individual class-dependent pdfs for each biomarker. Training a naive Bayes classifier in this context amounts to assigning parameters ("parameterization") to characterize the class-dependent pdfs. Any underlying model for the class-dependent pdfs may be used, but the model should generally conform to the data observed in the training set.

Specifically, the class-dependent probability of measuring a value $x_i$ for biomarker $i$ in the disease class is written as $p(x_i \mid d)$, and the overall naive Bayes probability of observing $n$ markers with values $\tilde{x} = (x_1, x_2, \ldots, x_n)$ is written as

$$p(\tilde{x} \mid d) = \prod_{i=1}^{n} p(x_i \mid d),$$

where the individual $x_i$ are the measured biomarker levels in RFU or log RFU. The classification assignment for an unknown sample is facilitated by calculating the probability of being diseased, $p(d \mid \tilde{x})$, having measured $\tilde{x}$, compared with the probability of being disease-free (control), $p(c \mid \tilde{x})$, for the same measured values. The ratio of these probabilities is computed from the class-dependent pdfs by application of Bayes' theorem, i.e.,

$$\frac{p(c \mid \tilde{x})}{p(d \mid \tilde{x})} = \frac{p(\tilde{x} \mid c)\,\bigl(1 - p(d)\bigr)}{p(\tilde{x} \mid d)\,p(d)},$$

where $p(d)$ is the prevalence of the disease in the population appropriate to the test. Taking the logarithm of both sides of this ratio and substituting the naive Bayes class-dependent probabilities from above gives

$$\ln\frac{p(c \mid \tilde{x})}{p(d \mid \tilde{x})} = \sum_{i=1}^{n} \ln\frac{p(x_i \mid c)}{p(x_i \mid d)} + \ln\frac{1 - p(d)}{p(d)}.$$

This form is known as the log-likelihood ratio. It simply states the log likelihood of being free of the particular disease versus having the disease, and it is composed primarily of the sum of the individual log-likelihood ratios of the $n$ individual biomarkers. In its simplest form, an unknown sample (or, more particularly, the individual from whom the sample was obtained) is classified as being free of the disease if the above ratio is greater than zero and as having the disease if the ratio is less than zero.

In one exemplary embodiment, the class-dependent biomarker pdfs $p(x_i \mid c)$ and $p(x_i \mid d)$ are assumed to be normal or log-normal distributions in the measured RFU values $x_i$, i.e.,

$$p(x_i \mid c) = \frac{1}{\sqrt{2\pi}\,\sigma_{c,i}} \exp\!\left(-\frac{(x_i - \mu_{c,i})^2}{2\sigma_{c,i}^2}\right),$$

with a similar expression for $p(x_i \mid d)$ in terms of $\mu_{d,i}$ and $\sigma_{d,i}^2$. Parameterization of the model requires estimating two parameters for each class-dependent pdf from the training data, a mean $\mu$ and a variance $\sigma^2$. This can be accomplished in a number of ways, including, for example, maximum likelihood estimation, least squares, and any other method known to one skilled in the art. Substituting the normal distributions for $p(x_i \mid c)$ and $p(x_i \mid d)$ into the log-likelihood ratio defined above gives the following expression:

$$\ln\frac{p(c \mid \tilde{x})}{p(d \mid \tilde{x})} = \sum_{i=1}^{n} \ln\frac{\sigma_{d,i}}{\sigma_{c,i}} - \frac{1}{2}\sum_{i=1}^{n}\left[\left(\frac{x_i - \mu_{c,i}}{\sigma_{c,i}}\right)^{2} - \left(\frac{x_i - \mu_{d,i}}{\sigma_{d,i}}\right)^{2}\right] + \ln\frac{1 - p(d)}{p(d)}.$$

Once a set of $\mu$ and $\sigma^2$ values has been defined for each pdf in each class from the training data, and the disease prevalence in the population has been specified, the Bayes classifier is fully determined and may be used to classify unknown samples with measured values $\tilde{x}$.
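As an illustration only, the Gaussian form of the log-likelihood ratio described above can be sketched in a few lines of Python. This is a minimal sketch, not the implementation used in the examples: the function names, the maximum-likelihood fit, and the toy training values are all assumptions introduced here, and real inputs would be the RFU or log RFU values and the prevalence appropriate to the test.

```python
import math
from statistics import fmean, pvariance


def fit_gaussian(values):
    """Maximum-likelihood mean and standard deviation for one marker in one class."""
    mu = fmean(values)
    sigma = math.sqrt(pvariance(values, mu))  # assumes the values are not all identical
    return mu, sigma


def train(control, disease):
    """control / disease: lists of samples, each a list of n marker values (e.g. log RFU)."""
    params = []
    for c_vals, d_vals in zip(zip(*control), zip(*disease)):
        params.append((fit_gaussian(c_vals), fit_gaussian(d_vals)))
    return params


def log_likelihood_ratio(x, params, prevalence):
    """ln[p(c|x) / p(d|x)]: positive favours control, negative favours disease."""
    total = math.log((1.0 - prevalence) / prevalence)
    for xi, ((mu_c, s_c), (mu_d, s_d)) in zip(x, params):
        total += math.log(s_d / s_c)
        total -= 0.5 * (((xi - mu_c) / s_c) ** 2 - ((xi - mu_d) / s_d) ** 2)
    return total


def classify(x, params, prevalence):
    return "control" if log_likelihood_ratio(x, params, prevalence) > 0 else "disease"


# Hypothetical two-marker training data, invented purely for illustration.
toy_control = [[1.0, 2.0], [1.2, 2.2], [0.8, 1.8]]
toy_disease = [[2.0, 3.0], [2.2, 3.2], [1.8, 2.8]]
toy_params = train(toy_control, toy_disease)
print(classify([1.0, 2.0], toy_params, prevalence=0.5))
print(classify([2.0, 3.0], toy_params, prevalence=0.5))
```

Raising the assumed prevalence lowers the ratio for every sample by the same amount, so the prevalence acts as a shift of the decision threshold rather than changing how the markers are weighted.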

The performance of the naive Bayes classifier depends on the number and quality of the biomarkers used to construct and train it. A single biomarker will perform in accordance with its KS-distance (Kolmogorov-Smirnov), as defined in Example 3 below. If the classifier performance metric is defined as the area under the receiver operating characteristic curve (AUC), a perfect classifier will have a score of 1 and a random classifier will, on average, have a score of 0.5. The KS-distance between two sets A and B of sizes $n$ and $m$ is defined as the value $D_{n,m} = \sup_x \lvert F_{A,n}(x) - F_{B,m}(x) \rvert$, which is the largest difference between the two empirical cumulative distribution functions (cdfs). The empirical cdf for a set A of $n$ observations $X_i$ is defined as

$$F_{A,n}(x) = \frac{1}{n}\sum_{i=1}^{n} I_{X_i < x},$$

where $I_{X_i < x}$ is the indicator function, which is equal to 1 if $X_i < x$ and is otherwise equal to 0. By definition, the KS-distance lies between 0 and 1, where a KS-distance of 1 indicates that the empirical distributions do not overlap.
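The KS-distance defined above can be computed directly from two sets of measurements. The following is a small sketch under stated assumptions: the function names are hypothetical, and the empirical cdf uses the conventional "less than or equal" form, which gives the same supremum as the strict form when the difference is evaluated at the pooled data points.

```python
def empirical_cdf(sample, x):
    """Fraction of observations in `sample` that are <= x."""
    return sum(1 for v in sample if v <= x) / len(sample)


def ks_distance(a, b):
    """Largest absolute difference between the empirical cdfs of a and b.

    The cdfs are step functions that only change at observed values,
    so it is enough to check the pooled data points.
    """
    points = sorted(set(a) | set(b))
    return max(abs(empirical_cdf(a, x) - empirical_cdf(b, x)) for x in points)


# Invented values for illustration: non-overlapping, identical, and partly overlapping sets.
print(ks_distance([1, 2, 3], [4, 5, 6]))
print(ks_distance([1, 2, 3], [1, 2, 3]))
print(ks_distance([1, 2, 3, 4], [3, 4, 5, 6]))
```

For large data sets a sorted-merge implementation would be more efficient; the quadratic version here is kept for readability.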

The addition of subsequent markers with good KS-distances (>0.3, for example) will, in general, improve classification performance if the subsequently added markers are independent of the first marker. Using sensitivity plus specificity as the classifier score, it is straightforward to generate many high-scoring classifiers with a variation of a greedy algorithm. (A greedy algorithm is any algorithm that follows the problem-solving metaheuristic of making the locally optimal choice at each stage in the hope of finding the global optimum.)

The algorithm used here is described in detail in Example 4. Briefly, all single-analyte classifiers are generated from a table of potential biomarkers and added to a list. Next, every possible second analyte is added to each of the stored single-analyte classifiers, and a predetermined number of the best-scoring pairs, for example one thousand, is saved on a new list. This new list of the best two-marker classifiers is used to explore all possible three-marker classifiers, again saving the best thousand. The process continues until the score either plateaus or begins to deteriorate as additional markers are added. The high-scoring classifiers that remain after convergence can be evaluated for the desired performance for the intended use. For example, in one diagnostic application, classifiers with high sensitivity and modest specificity may be more desirable than classifiers with modest sensitivity and high specificity. In another diagnostic application, classifiers with high specificity and modest sensitivity may be more desirable. The desired level of performance is generally selected on the basis of the trade-off that must be made between the numbers of false positives and false negatives that can each be tolerated for the particular diagnostic application. This trade-off generally depends on the medical consequences of a false positive or false negative error.
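The stepwise search outlined above can be sketched as follows. This is a schematic illustration, not the algorithm of Example 4: the function name `greedy_panels`, the `score` callback, the stopping rule, and the toy weights are assumptions introduced here. In practice `score` would train a classifier on the panel and return, for example, sensitivity plus specificity.

```python
def greedy_panels(markers, score, keep=1000, max_size=10):
    """Grow marker panels one marker at a time, keeping the `keep` best at each size.

    markers: non-empty list of sortable marker names.
    score:   callable taking a tuple of marker names and returning a number.
    Returns a list whose k-th entry holds the retained (score, panel) pairs of size k+1.
    """
    current = sorted(((score((m,)), (m,)) for m in markers), reverse=True)[:keep]
    history = [current]
    while len(current[0][1]) < max_size:
        candidates = {}
        for _, panel in current:
            for m in markers:
                if m not in panel:
                    new = tuple(sorted(panel + (m,)))
                    if new not in candidates:  # avoid rescoring the same panel
                        candidates[new] = score(new)
        if not candidates:
            break
        nxt = sorted(((s, p) for p, s in candidates.items()), reverse=True)[:keep]
        if nxt[0][0] <= current[0][0]:  # best score has plateaued or deteriorated
            break
        current = nxt
        history.append(current)
    return history


# Hypothetical markers and scoring rule, invented for illustration only.
toy_weights = {"A": 0.3, "B": 0.2, "C": 0.05, "D": -0.1}


def toy_score(panel):
    # Sum of per-marker contributions minus a penalty for each extra marker.
    return sum(toy_weights[m] for m in panel) - 0.06 * (len(panel) - 1)


toy_history = greedy_panels(list(toy_weights), toy_score, keep=3)
print(toy_history[-1][0])
```

Retaining many panels at each size, rather than only the single best one, is what distinguishes this variant from a plain greedy search and is why it yields a list of candidate classifiers to evaluate against the intended sensitivity and specificity trade-off.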

Various other techniques are known in the art and may be employed to generate many potential classifiers from a list of biomarkers using a naive Bayes classifier. In one embodiment, a so-called genetic algorithm can be used to combine different markers using the fitness score defined above. Genetic algorithms are particularly well suited to exploring a large, diverse population of potential classifiers. In another embodiment, so-called ant colony optimization can be used to generate sets of classifiers. Other strategies known in the art can also be employed, including, for example, other evolutionary strategies as well as simulated annealing and other stochastic search methods. Metaheuristic methods, such as harmony search, may also be employed.

Exemplary embodiments use any number of the pancreatic cancer biomarkers listed in Table 1, column 2, in various combinations to produce diagnostic tests for detecting pancreatic cancer (see Example 2 for a detailed description of how these biomarkers were identified). In one embodiment, a method for diagnosing pancreatic cancer uses a naive Bayesian classification method in conjunction with any number of the pancreatic cancer biomarkers listed in Table 1, column 2. In an illustrative example (Example 3), the simplest test for detecting pancreatic cancer from a population of GI and normal controls can be constructed using a single biomarker, for example CTSB, which is differentially expressed in pancreatic cancer with a KS-distance of 0.52. Using the parameters $\mu_{c,i}$, $\sigma_{c,i}$, $\mu_{d,i}$, and $\sigma_{d,i}$ for CTSB from Table 16 and the equation for the log-likelihood described above, a diagnostic test with an AUC of 0.79 can be produced; see Table 15. The ROC curve for this test is shown in Figure 2.
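For readers unfamiliar with the KS-distance quoted for single markers, the following Python sketch shows one way to compute it. It assumes that "KS-distance" means the two-sample Kolmogorov-Smirnov statistic, that is, the largest gap between the empirical cumulative distributions of a marker in cases and in controls; the function name `ks_distance` is introduced here for illustration only.

```python
from bisect import bisect_right
from typing import Sequence


def ks_distance(cases: Sequence[float], controls: Sequence[float]) -> float:
    """Largest absolute gap between the empirical CDFs of two samples."""
    a, b = sorted(cases), sorted(controls)
    gap = 0.0
    # The empirical CDFs only change at observed values, so it is enough
    # to compare them at every observed value from either sample.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of cases <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of controls <= x
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap
```

A value of 0 means the two distributions coincide on the observed data, and a value of 1 means they do not overlap at all.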

The addition of biomarker C5a, for example, which has a KS-distance of 0.40, significantly improves the classifier performance to an AUC of 0.85. Note that the score of a classifier constructed from two biomarkers is not a simple sum of the KS-distances; KS-distances are not additive when biomarkers are combined, and it takes many more weak markers to reach the same level of performance as one strong marker. Adding a third marker, C5, for example, increases the classifier performance to an AUC of 0.88. Adding further biomarkers, such as CCL18, CSF1R, KLK7, ETHE1, C5-C6, KLK8, and VEGFA, produces a series of pancreatic cancer tests, summarized in Table 15 and displayed as a series of ROC curves in Figure 3. The classifier score as a function of the number of analytes used in classifier construction is shown in Figure 4. The AUC of this exemplary ten-marker classifier is 0.91.
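The AUC values quoted here summarize an ROC curve in a single number. As a hedged illustration, the Python sketch below computes an AUC directly from classifier scores using the standard rank interpretation (the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half). The function name `auc` and the toy inputs in the test are assumptions; nothing here reproduces the data behind Table 15.

```python
from typing import Sequence


def auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = 0.0
    for c in case_scores:
        for n in control_scores:
            if c > n:
                wins += 1.0   # case ranked above control
            elif c == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(case_scores) * len(control_scores))
```

An AUC of 0.5 corresponds to a classifier that ranks cases and controls no better than chance, and 1.0 to perfect separation.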

表1第2列所列的标记可以许多方式组合以产生用于诊断胰腺癌的分类器。在一些实施方案中,根据所选的特定诊断性能标准,生物标记的组由不同数目的分析物组成。例如,生物标记的某些组合会产生比其他组合更灵敏(或更特异性)的测试。The markers listed in Table 1, column 2, can be combined in many ways to generate a classifier for diagnosing pancreatic cancer. In some embodiments, the panel of biomarkers consists of different numbers of analytes, depending on the specific diagnostic performance criteria selected. For example, certain combinations of biomarkers may produce a more sensitive (or more specific) test than other combinations.

Once a panel is defined to include a particular set of biomarkers from Table 1, column 2, and a classifier is constructed from a set of training data, the definition of the diagnostic test is complete. In one embodiment, the procedure used to classify an unknown sample is outlined in FIG. 1A. In another embodiment, the procedure used to classify an unknown sample is outlined in FIG. 1B. The biological sample is appropriately diluted and then run in one or more assays to produce the relevant quantitative biomarker levels used for classification. The measured biomarker levels are used as input to the classification method, which outputs a classification for the sample and, optionally, a score that reflects the confidence of the class assignment.

表1鉴定了可用于诊断胰腺癌的65种生物标记。当与在生物标记发现的尝试中通常发现的相比时,这比预期令人惊讶地高,可能归因于所述研究的规模,其涵盖在几百个个体样品中测量的超过800种蛋白,在一些情况下浓度为低毫微微摩尔范围。据推测,发现的大量生物标记反映了在肿瘤生物学和身体对肿瘤存在的应答中涉及的不同生物化学途径;每种途径和过程均涉及许多蛋白。结果显示没有一小组蛋白的单个蛋白对这样复杂的过程独特提供信息;相反,多种蛋白包括在相关过程中,如凋亡或细胞外基质修复。Table 1 identifies 65 biomarkers that can be used to diagnose pancreatic cancer. This is surprisingly high compared to what is typically found in biomarker discovery attempts, likely due to the scale of the study, which encompassed over 800 proteins measured in hundreds of individual samples, in some cases at concentrations in the low femtomolar range. Presumably, the large number of biomarkers discovered reflects diverse biochemical pathways involved in tumor biology and the body's response to the presence of a tumor; each pathway and process involves numerous proteins. The results suggest that no single protein from a small set of proteins is uniquely informative about such complex processes; rather, multiple proteins are implicated in related processes, such as apoptosis or extracellular matrix repair.

Given the numerous biomarkers identified during the described study, one would expect to be able to derive large numbers of high-performing classifiers that can be used in various diagnostic methods. To test this notion, tens of thousands of classifiers were evaluated using the biomarkers in Table 1. As described in Example 4, many subsets of the biomarkers presented in Table 1 can be combined to generate useful classifiers. By way of example, descriptions are provided for classifiers containing one, two, and three biomarkers for the detection of pancreatic cancer. As described in Example 4, all classifiers built using the biomarkers in Table 1 perform distinctly better than classifiers built using "non-markers".

还测试了通过随机排除表1的一些标记获得的分类器的性能,随机排除产生了较小的子集,从这些子集来构建分类器。如实施例4部分3所述,从表1中标记的随机子集构建的分类器的表现类似于用表1中标记的完全列表构建的最佳分类器。The performance of the classifiers obtained by randomly excluding some of the markers in Table 1 was also tested. Random exclusion produced smaller subsets from which classifiers were constructed. As described in Example 4, Part 3, the classifiers constructed from the random subsets of markers in Table 1 performed similarly to the best classifier constructed using the complete list of markers in Table 1.

还测试了通过从10-标记聚集中排除“最佳”单个标记而获得的10-标记分类器的性能。如实施例4部分3所述,未用表1的“最佳”标记构建的分类器也表现良好。表1所列生物标记的许多子集表现接近最佳,甚至在除去表中所列最好的15个标记之后也如此。这暗示任何特定分类器的性能特征可能不是由于生物标记的一些小核心组所致,并且疾病过程可能影响许多生物化学途径,其改变许多蛋白的表达水平。The performance of a 10-marker classifier obtained by excluding the "best" individual markers from the 10-marker pool was also tested. As described in Example 4, Part 3, classifiers constructed without the "best" markers from Table 1 also performed well. Many subsets of the biomarkers listed in Table 1 performed near optimally, even after removing the top 15 markers listed in the table. This suggests that the performance characteristics of any particular classifier may not be due to some small core group of biomarkers, and that disease processes may affect many biochemical pathways, which alter the expression levels of many proteins.

实施例4的结果提示一些可能的结论:首先,大量生物标记的鉴定允许它们聚集成巨大数目的分类器,其提供相似的高性能。第二,分类器可以这样构建:特定生物标记可以以反映冗余的方式取代其他生物标记,所述冗余无疑遍及潜在疾病过程的复杂性。也就是说,由表1中鉴定的任何单个生物标记贡献的有关疾病的信息与由其他生物标记贡献的信息重叠,由此表1中特定生物标记或生物标记的小组无需包括在任何分类器中。The results of Example 4 suggest several possible conclusions: First, the identification of a large number of biomarkers allows them to be aggregated into a vast number of classifiers that provide similarly high performance. Second, classifiers can be constructed such that specific biomarkers can be substituted for other biomarkers in a manner that reflects the redundancy inherent in the complexity of the underlying disease process. That is, the information about the disease contributed by any individual biomarker identified in Table 1 overlaps with the information contributed by other biomarkers, and thus a specific biomarker or panel of biomarkers in Table 1 need not be included in any classifier.

示例性实施方案使用从表16的数据构建的朴素贝叶斯分类器以分类未知样品。图1A和1B中示出了程序。在一实施方案中,将生物学样品任选地稀释并且进行多重适配体测定。将来自所述测定的数据如实施例3所述正态化及校准,将所得生物标记水平用作贝叶斯分类方案的输入。对于每个测量的生物标记单独计算log-似然比,然后求和以产生最终分类评分,其也称为诊断评分。可以报道所得分配及总体分类评分。任选地,还可以报道对每个生物标记水平计算的个体log-似然性风险因子。分类评分计算的细节示于实施例3。An exemplary embodiment uses a naive Bayesian classifier constructed from the data in Table 16 to classify unknown samples. The procedure is illustrated in Figures 1A and 1B. In one embodiment, the biological sample is optionally diluted and subjected to a multiplex aptamer assay. The data from the assay is normalized and calibrated as described in Example 3, and the resulting biomarker levels are used as input to the Bayesian classification scheme. Log-likelihood ratios are calculated individually for each measured biomarker and then summed to produce a final classification score, which is also referred to as a diagnostic score. The resulting assignments and overall classification scores can be reported. Optionally, the individual log-likelihood risk factors calculated for each biomarker level can also be reported. Details of the classification score calculation are shown in Example 3.
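The scoring step can be sketched in Python as follows. This is an illustration under stated assumptions, not the calculation of Example 3: it assumes that each biomarker level is modeled by a normal density in the control class (parameters `mu_c`, `sigma_c`) and in the disease class (`mu_d`, `sigma_d`), and it omits any prior term. The names `MarkerParams`, `log_likelihood_ratio`, and `diagnostic_score`, and the marker names and parameter values used in the test, are hypothetical and are not taken from Table 16.

```python
import math
from typing import Dict, NamedTuple, Tuple


class MarkerParams(NamedTuple):
    mu_c: float      # mean of the marker level in controls
    sigma_c: float   # standard deviation in controls
    mu_d: float      # mean of the marker level in disease
    sigma_d: float   # standard deviation in disease


def log_likelihood_ratio(x: float, p: MarkerParams) -> float:
    """ln[p(x | disease) / p(x | control)] for two normal densities."""
    z_d = (x - p.mu_d) / p.sigma_d
    z_c = (x - p.mu_c) / p.sigma_c
    return math.log(p.sigma_c / p.sigma_d) - 0.5 * (z_d ** 2 - z_c ** 2)


def diagnostic_score(
    levels: Dict[str, float], params: Dict[str, MarkerParams]
) -> Tuple[float, Dict[str, float]]:
    """Sum the per-marker log-likelihood ratios into one classification score."""
    per_marker = {
        name: log_likelihood_ratio(levels[name], p) for name, p in params.items()
    }
    return sum(per_marker.values()), per_marker
```

Under these assumptions, a positive total favors the disease class and a negative total favors the control class; the per-marker dictionary corresponds to the optional individual log-likelihood risk factors mentioned above.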

Kits

表1第2列的生物标记(以及额外的生物医学信息)的任何组合可以通过使用合适的试剂盒检测,如用于进行本文公开的方法的试剂盒。此外,任何试剂盒可以含有本文所述的一种或多种可检测标记,如荧光部分等。Any combination of the biomarkers listed in Table 1, Column 2 (and additional biomedical information) can be detected using a suitable kit, such as a kit for performing the methods disclosed herein. In addition, any kit can contain one or more detectable labels described herein, such as fluorescent moieties, etc.

As further described herein, in one embodiment a kit includes (a) one or more capture reagents (such as at least one aptamer or antibody) for detecting one or more biomarkers in a biological sample, wherein the biomarkers include any of the biomarkers listed in Table 1, column 2, and optionally (b) one or more software or computer program products for classifying the individual from whom the biological sample was obtained as either having or not having pancreatic cancer, or for determining the likelihood that the individual has pancreatic cancer. Alternatively, rather than one or more computer program products, one or more sets of instructions for performing the above steps manually can be provided.

固体支持物与相应捕获试剂及信号产生材料的组合在本文中称作“检测装置”或“试剂盒”。试剂盒还可以包含使用该装置和试剂、处理样品以及分析数据的使用说明书。此外,试剂盒可以与计算机系统或软件一起使用以分析和报道生物学样品的分析结果。The combination of a solid support with a corresponding capture reagent and signal-generating material is referred to herein as a "detection device" or "kit." The kit may also include instructions for using the device and reagents, processing samples, and analyzing data. In addition, the kit may be used in conjunction with a computer system or software to analyze and report the results of the biological sample analysis.

The kit can also contain one or more reagents (e.g., solubilization buffers, detergents, washes, or buffers) for processing a biological sample. Any of the kits described herein can also include, for example, buffers, blocking agents, mass spectrometry matrix materials, antibody capture agents, positive control samples, negative control samples, software, and information such as protocols, guidance, and reference data.

在一方面,本发明提供了分析胰腺癌状态的试剂盒。试剂盒包含用于选自表1第2列的一个或多个生物标记的PCR引物。试剂盒还可以包含使用及生物标记与胰腺癌相关性的指导。试剂盒还可以包含DNA阵列,其含有选自表1第2列的一个或多个生物标记的补体、用于扩增或分离样品DNA的试剂和/或酶。试剂盒可以包含用于实时PCR的试剂,例如TaqMan探针和/或引物,以及酶。In one aspect, the present invention provides a kit for analyzing pancreatic cancer status. The kit comprises PCR primers for one or more biomarkers selected from Table 1, column 2. The kit may also comprise instructions for use and correlation of the biomarkers with pancreatic cancer. The kit may also comprise a DNA array containing a complement of one or more biomarkers selected from Table 1, column 2, reagents and/or enzymes for amplifying or isolating sample DNA. The kit may also comprise reagents for real-time PCR, such as TaqMan probes and/or primers, and enzymes.

For example, a kit can comprise (a) reagents comprising at least a capture reagent for quantifying one or more biomarkers in a test sample, wherein the biomarkers comprise the biomarkers listed in Table 1, column 2, or any other biomarkers or biomarker panels described herein, and optionally (b) one or more algorithms or computer programs for performing the following steps: comparing the amount of each biomarker quantified in the test sample to one or more predetermined cutoffs, assigning a score to each quantified biomarker based on that comparison, combining the assigned scores to obtain a total score, comparing the total score with a predetermined score, and using that comparison to determine whether the individual has pancreatic cancer. Alternatively, rather than one or more algorithms or computer programs, one or more sets of instructions for performing the above steps manually can be provided.
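The cutoff-based steps listed above can be illustrated with a short Python sketch. Everything specific in it is an assumption made for illustration: the function name `cutoff_classify`, the use of a single cutoff per marker, the 0/1 scoring, the `direction` flag (+1 when a marker is expected to be elevated in disease, -1 when reduced), and the marker names and numbers in the test.

```python
from typing import Dict, Tuple


def cutoff_classify(
    levels: Dict[str, float],
    cutoffs: Dict[str, Tuple[float, int]],
    total_cutoff: int,
) -> Tuple[bool, int, Dict[str, int]]:
    """Score each marker against its cutoff, sum the scores, compare the total.

    `cutoffs` maps a marker name to (cutoff, direction).
    """
    scores = {}
    for marker, (cutoff, direction) in cutoffs.items():
        # Steps 1 and 2: compare the quantified amount to its cutoff
        # and assign a score of 1 (disease-like) or 0.
        disease_like = (levels[marker] - cutoff) * direction > 0
        scores[marker] = 1 if disease_like else 0
    # Step 3: combine the assigned scores into a total score.
    total = sum(scores.values())
    # Steps 4 and 5: compare the total with the predetermined score.
    return total >= total_cutoff, total, scores
```

A weighted sum or several graded cutoffs per marker would fit the same outline; the unweighted 0/1 form is used only because it is the simplest instance.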

Computer Methods and Software

一旦选择生物标记或生物标记组,则诊断个体的方法可包括以下步骤:1)收集或以其他方式获得生物学样品;2)进行分析方法以检测和测量所述生物学样品中的生物标记或生物标记组;3)进行用于收集生物标记值的方法所需的任何数据归一化或标准化;4)计算标记评分;5)组合所述标记评分以获得总诊断评分;以及6)报告所述个体的诊断评分。在这种方法中,诊断评分可以是从所有标记计算的总和确定的单一数值,将该数值与指示疾病存在与否的预设阈值比较。或者,诊断评分可以是一系列带(bar),其各自代表生物标记值,并且可以将应答模式与预设模式比较以确定疾病的存在与否。Once a biomarker or biomarker panel is selected, a method for diagnosing an individual may include the following steps: 1) collecting or otherwise obtaining a biological sample; 2) performing an analytical method to detect and measure the biomarker or biomarker panel in the biological sample; 3) performing any data normalization or standardization required by the method for collecting biomarker values; 4) calculating a marker score; 5) combining the marker scores to obtain an overall diagnostic score; and 6) reporting a diagnostic score for the individual. In this method, the diagnostic score can be a single numerical value determined from the sum of all marker calculations, which is compared to a preset threshold value indicating the presence or absence of a disease. Alternatively, the diagnostic score can be a series of bars, each representing a biomarker value, and the response pattern can be compared to a preset pattern to determine the presence or absence of a disease.

At least some embodiments of the methods described herein can be implemented using a computer. An example of a computer system 100 is shown in FIG. 6. With reference to FIG. 6, system 100 is shown comprising hardware elements that are electrically coupled via bus 108, including a processor 101, an input device 102, an output device 103, a storage device 104, a computer-readable storage media reader 105a, a communications system 106, processing acceleration (e.g., a DSP or special-purpose processor) 107, and memory 109. The computer-readable storage media reader 105a is further coupled to computer-readable storage media 105b, the combination comprehensively representing remote, local, fixed, and/or removable storage devices plus storage media, memory, and so on, for temporarily and/or more permanently containing computer-readable information, which can include storage device 104, memory 109, and/or any other such accessible system 100 resource. System 100 also comprises software elements (shown as currently located within working memory 191), including an operating system 192 and other code 193, such as programs, data, and the like.

With respect to FIG. 6, system 100 has extensive flexibility and configurability. Thus, for example, a single architecture might be used to implement one or more servers that can be further configured in accordance with currently desirable protocols, protocol variations, extensions, and so on. However, it will be apparent to those skilled in the art that embodiments may well be utilized in accordance with more specific application requirements. For example, one or more system elements might be implemented as sub-elements within a system 100 component (e.g., within communications system 106). Customized hardware might also be used, and/or particular elements might be implemented in hardware, software, or both. Further, while connection to other computing devices such as network input/output devices (not shown) may be employed, it is to be understood that wired, wireless, modem, and/or other connection or connections to other computing devices might also be used.

在一方面,所述系统可以包含含有胰腺癌特征性生物标记的特征的数据库。生物标记数据(或生物标记信息)可以用作计算机的输入以用作计算机执行方法的一部分。所述生物标记数据可以包括本文所述的数据。In one aspect, the system can include a database containing signatures of biomarkers characteristic of pancreatic cancer. The biomarker data (or biomarker information) can be used as input to a computer for use as part of a computer-implemented method. The biomarker data can include data described herein.

在一方面,所述系统还包含一个或多个装置以将输入数据提供给一个或多个处理器。In one aspect, the system further includes one or more devices to provide input data to the one or more processors.

The system further comprises a memory for storing a data set of ranked data elements.

在另一方面,用于提供输入数据的装置包括检测仪以检测数据元件的特征,如质谱仪或者基因芯片读取器。In another aspect, the means for providing input data includes a detector to detect characteristics of the data element, such as a mass spectrometer or a gene chip reader.

The system can additionally include a database management system. User requests or queries can be formatted in an appropriate language understood by the database management system, which processes the query to extract the relevant information from the database of training sets.

所述系统可以与网络连接,所述网络连接网络服务器以及一个或多个客户端。网络可以是本领域已知的局域网(LAN)或广域网(WAN)。优选地,服务器包括运行计算机程序产品(如软件)所需的硬件,以进入数据库数据来处理用户请求。The system can be connected to a network that connects a network server and one or more clients. The network can be a local area network (LAN) or a wide area network (WAN) as known in the art. Preferably, the server includes the hardware required to run a computer program product (such as software) to access database data to process user requests.

The system may include an operating system (e.g., UNIX or Linux) for executing instructions from a database management system. In one aspect, the operating system can operate on a global communications network, such as the Internet, and use a global communications network server to connect to such a network.

The system may include one or more devices that comprise a graphical display interface with interface elements such as buttons, pull-down menus, scroll bars, and fields for entering text, as are routinely found in graphical user interfaces known in the art. Requests entered on a user interface can be passed to an application program in the system, which formats them to search for relevant information in one or more of the system databases. Requests or queries entered by a user may be constructed in any suitable database language.

图形用户界面可以通过作为操作系统一部分的图形用户界面编码产生,并且可以用于输入数据和/或显示输入的数据。处理的数据的结果可以在界面上显示,在与该系统通讯的打印机上打印,存储在存储装置中,和/或上传至网络或者可以计算机可读取介质的形式提供。The graphical user interface may be generated by graphical user interface code as part of an operating system and may be used to input data and/or display the inputted data. The results of the processed data may be displayed on the interface, printed on a printer in communication with the system, stored in a storage device, and/or uploaded to a network or provided in the form of a computer-readable medium.

所述系统可以与输入装置通讯,以将关于数据元件的数据提供给系统(如表达值)。在一方面,输入装置可以包括基因表达谱系统,包括如质谱仪、基因芯片或阵列读取器等。The system can communicate with an input device to provide data about the data element to the system (e.g., expression value). In one aspect, the input device can include a gene expression profiling system, including, for example, a mass spectrometer, a gene chip, or an array reader.

根据各个实施方案分析胰腺癌生物标记信息的方法和设备可以通过任何合适的方式执行,例如使用在计算机系统上运行的计算机程序。可以使用常规的计算机系统,其包含处理器和随机存取存储器,如可远程登录的应用服务器、网络服务器、个人电脑或工作站。其他计算机系统部件可以包括存储装置或信息存储系统,如大量存储系统和用户界面,例如常规的监视器、键盘和跟踪装置。计算机系统可以是单机系统,或者是包含服务器以及一个或多个数据库的计算机网络的一部分。Methods and apparatus for analyzing pancreatic cancer biomarker information according to various embodiments can be implemented in any suitable manner, such as using a computer program running on a computer system. Conventional computer systems comprising a processor and random access memory, such as a remotely accessible application server, network server, personal computer, or workstation, can be used. Other computer system components may include storage devices or information storage systems, such as mass storage systems, and user interfaces, such as conventional monitors, keyboards, and tracking devices. The computer system can be a stand-alone system or part of a computer network comprising a server and one or more databases.

The pancreatic cancer biomarker analysis system can provide functions and operations for complete data analysis, such as data gathering, processing, analysis, reporting, and/or diagnosis. For example, in one embodiment, the computer system can execute a computer program that receives, stores, searches, analyzes, and reports information relating to the pancreatic cancer biomarkers. The computer program may comprise multiple modules performing various functions or operations, such as a processing module for processing raw data and generating supplemental data, and an analysis module for analyzing the raw data and supplemental data to generate a pancreatic cancer status and/or diagnosis. Diagnosing pancreatic cancer status may comprise generating or collecting any other information, including additional biomedical information, regarding the condition of the individual relative to the disease, identifying whether further tests may be desirable, or otherwise evaluating the health status of the individual.

关于图7,可以看出根据公开的实施方案的原理利用计算机的方法的一个实例。在图7中,示出了流程图3000。在模块(block)3004中,可以检索个体的生物标记信息。所述生物标记信息可以从计算机数据库中检索,例如在测试个体的生物学样品之后。生物标记信息可以包含生物标记值,所述生物标记值每个对应于选自由表1第2列提供的生物标记组成的组的至少N个生物标记之一,其中N=2-65。在模块3008中,计算机可以用于将每个生物标记值分类。在模块3012中,基于多个分类可以确定个体患有胰腺癌的似然性。可以将该指示输出至显示器或其他显示装置,以供人观察。因此,例如该指示可以在计算机的显示器屏幕或其他输出装置上显示。With respect to FIG. 7 , an example of a method utilizing a computer according to the principles of the disclosed embodiments can be seen. In FIG. 7 , a flow chart 3000 is shown. In block 3004 , biomarker information for an individual can be retrieved. The biomarker information can be retrieved from a computer database, for example, after testing a biological sample from the individual. The biomarker information can include biomarker values, each of which corresponds to one of at least N biomarkers selected from the group consisting of the biomarkers provided in column 2 of Table 1 , where N = 2-65. In block 3008 , the computer can be configured to classify each biomarker value. In block 3012 , a likelihood that the individual has pancreatic cancer can be determined based on the multiple classifications. This indication can be output to a display or other display device for viewing by a person. Thus, for example, the indication can be displayed on a computer monitor screen or other output device.
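The three blocks of flowchart 3000 can be sketched in Python using the standard-library `sqlite3` module. This is only an illustration of the retrieve, classify, and indicate sequence: the table name `biomarker_values`, its columns, the per-marker thresholds, and the use of the fraction of disease-like calls as the reported indication are all assumptions made here, and that fraction is a crude indicator rather than a calibrated probability.

```python
import sqlite3
from typing import Dict


def retrieve_levels(conn: sqlite3.Connection, individual_id: str) -> Dict[str, float]:
    """Block 3004: retrieve the individual's biomarker values from a database."""
    rows = conn.execute(
        "SELECT marker, level FROM biomarker_values WHERE individual_id = ?",
        (individual_id,),
    ).fetchall()
    return dict(rows)


def classify_values(
    levels: Dict[str, float], thresholds: Dict[str, float]
) -> Dict[str, bool]:
    """Block 3008: classify each biomarker value (True means disease-like)."""
    return {m: levels[m] > t for m, t in thresholds.items() if m in levels}


def indicate_likelihood(calls: Dict[str, bool], min_markers: int = 2) -> float:
    """Block 3012: derive one indication from the plurality of classifications."""
    if len(calls) < min_markers:
        raise ValueError(f"need at least {min_markers} classified biomarker values")
    return sum(calls.values()) / len(calls)
```

The returned value could then be written to a display or other output device, as the paragraph above describes.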

现在关于图8,通过流程图3200举例说明了根据另一实施方案利用计算机的可选方法。在模块3204中,可以利用计算机检索个体的生物标记信息。生物标记信息包含生物标记值,所述生物标记值对应于选自表1第2列提供的生物标记的组的生物标记。在模块3208中,可以用计算机将生物标记值进行分类。并且,在模块3212中,基于所述分类可以对个体患有胰腺癌的似然性作出指示。可以将该指示输出至显示器或其他显示装置,以供人观察。因此,例如该指示可以在计算机的显示器屏幕或其他输出装置上显示。Now referring to FIG. 8 , an alternative method utilizing a computer according to another embodiment is illustrated by flowchart 3200. At block 3204, biomarker information for an individual can be retrieved using a computer. The biomarker information includes biomarker values corresponding to biomarkers selected from the group of biomarkers provided in column 2 of Table 1. At block 3208, the biomarker values can be classified by the computer. Furthermore, at block 3212, an indication can be made regarding the likelihood that the individual has pancreatic cancer based on the classification. This indication can be output to a display or other display device for viewing by a person. Thus, for example, the indication can be displayed on a computer monitor screen or other output device.

本文所述的一些实施方案可以执行以包含计算机程序产品。计算机程序产品可以包括具有包含于介质中的计算机可读取程序编码的计算机可读取介质,以使得应用程序可以在具有数据库的计算机上执行。Some embodiments described herein may be implemented to include a computer program product. The computer program product may include a computer-readable medium having computer-readable program code embodied in the medium, such that an application program can be executed on a computer having a database.

As used herein, a "computer program product" refers to an organized set of instructions in the form of natural or programming language statements that are contained on a physical medium of any nature (e.g., written, electronic, magnetic, optical, or otherwise) and that may be used with a computer or other automated data processing system. Such programming language statements, when executed by a computer or data processing system, cause the computer or data processing system to act in accordance with the particular content of the statements. Computer program products include, without limitation, programs in source and object code and/or test or data libraries embedded in a computer-readable medium. Furthermore, a computer program product that enables a computer system or data processing device to act in preselected ways may be provided in a number of forms, including, but not limited to, original source code, assembly code, object code, machine language, encrypted or compressed versions of the foregoing, and any and all equivalents.

在一方面,本发明提供了一种指示胰腺癌的似然性的计算机程序产品。所述计算机程序产品包括包含程序代码的计算机可读取介质,所述程序代码可由计算装置或系统的处理器执行,所述程序代码包括:检索归因于来自个体的生物学样品的数据的代码,其中所述数据包括生物标记值,所述生物标记值每个对应于所述生物学样品中选自表1第2列提供的生物标记的组的至少N个生物标记之一,其中N=2-65;以及执行分类方法的代码,所述分类方法将所述个体的胰腺癌状态指示为所述生物标记值的函数。In one aspect, the present invention provides a computer program product that indicates the likelihood of pancreatic cancer. The computer program product includes a computer-readable medium containing program code, the program code executable by a processor of a computing device or system, the program code comprising: code for retrieving data attributed to a biological sample from an individual, wherein the data includes biomarker values, each of the biomarker values corresponding to one of at least N biomarkers in the biological sample selected from the group of biomarkers provided in Table 1, Col. 2, where N = 2-65; and code for executing a classification method that indicates the pancreatic cancer status of the individual as a function of the biomarker values.

在另一方面,本发明提供了一种指示胰腺癌的似然性的计算机程序产品。所述计算机程序产品包括包含程序代码的计算机可读取介质,所述程序代码可由计算装置或系统的处理器执行,所述程序代码包括:检索归因于来自个体的生物学样品的数据的代码,其中所述数据包括生物标记值,所述生物标记值对应于所述生物学样品中选自表1第2列提供的生物标记的组的生物标记;以及执行分类方法的代码,所述分类方法将所述个体的胰腺癌状态指示为所述生物标记值的函数。In another aspect, the present invention provides a computer program product that indicates the likelihood of pancreatic cancer. The computer program product includes a computer-readable medium containing program code, the program code executable by a processor of a computing device or system, the program code comprising: code for retrieving data attributed to a biological sample from an individual, wherein the data includes biomarker values corresponding to biomarkers in the biological sample selected from the group of biomarkers provided in Table 1, Col. 2; and code for executing a classification method that indicates pancreatic cancer status for the individual as a function of the biomarker values.

While various embodiments have been described as methods or apparatuses, it should be understood that embodiments can be implemented through code coupled with a computer, e.g., code resident on a computer or accessible by the computer. For example, software and databases could be used to implement many of the methods discussed above. Thus, in addition to embodiments accomplished by hardware, it is also noted that these embodiments can be accomplished through the use of an article of manufacture comprising a computer-usable medium having computer-readable program code embodied therein, which enables the functions disclosed in this description. It is therefore desired that the embodiments also be considered protected by this patent in their program code means. Furthermore, the embodiments may be embodied as code stored in a computer-readable memory of virtually any kind, including, without limitation, RAM, ROM, magnetic media, optical media, or magneto-optical media. More generally, the embodiments could be implemented in software, in hardware, or in any combination thereof, including, but not limited to, software running on a general-purpose processor, microcode, PLAs, or ASICs.

还期望所述实施方案可以作为包含于载波中的计算机信号以及通过传送介质传送的信号(如电信号和光信号)实现。因此,上述各种类型的信息均可以在结构中格式化,如数据结构,并且作为电信号通过传送介质传送,或者存储在计算机可读取介质中。It is also contemplated that the embodiments described herein can be implemented as computer signals embodied in carrier waves, as well as signals transmitted via transmission media (e.g., electrical and optical signals). Thus, the various types of information described above can be formatted in a structure, such as a data structure, and transmitted as an electrical signal via a transmission medium, or stored on a computer-readable medium.

It is also noted that many of the structures, materials, and acts recited herein can be recited as means for performing a function or as steps for performing a function. Therefore, it should be understood that such language is entitled to cover all such structures, materials, or acts disclosed within this specification and their equivalents, including the matter incorporated by reference.

The biomarker identification process, the use of the biomarkers disclosed herein, and the various methods for determining biomarker values are described in detail above with respect to pancreatic cancer. However, the application of the process, the use of the identified biomarkers, and the methods for determining biomarker values are fully applicable to other specific types of cancer, to cancer generally, to any other disease or medical condition, or to the identification of individuals who may or may not benefit from an ancillary medical treatment. Except when referring to specific results related to pancreatic cancer, as is clear from the context, references herein to pancreatic cancer may be understood to include other types of cancer, cancer generally, or any other disease or medical condition.

Examples

The following examples are provided for illustrative purposes only and are not intended to limit the scope of the application as defined by the appended claims. All examples described herein were carried out using standard techniques that are well known and routine to those of skill in the art. Routine molecular biology techniques described in the following examples can be carried out as described in standard laboratory manuals, such as Sambrook et al., Molecular Cloning: A Laboratory Manual, 3rd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (2001).

Example 1. Multiplexed Aptamer Analysis of Samples

This example describes the multiplex aptamer assay used to analyze the samples and controls for the identification of the biomarkers set forth in Table 1, column 2 (see Figure 9) and the identification of the cancer biomarkers set forth in Table 19. For the pancreatic cancer, lung cancer, and mesothelioma studies, the multiplexed analysis used 823 aptamers, each unique to a specific target.

In this method, pipette tips were changed for each solution addition.

Also, unless otherwise indicated, most solution transfers and wash additions used the 96-well head of a Beckman Biomek FxP. Method steps pipetted manually used a 12-channel P200 Pipetteman (Rainin Instruments, LLC, Oakland, CA), unless otherwise indicated. A custom buffer referred to as SB17 was prepared in-house, comprising 40 mM HEPES, 100 mM NaCl, 5 mM KCl, 5 mM MgCl2, 1 mM EDTA at pH 7.5. A custom buffer referred to as SB18 was prepared in-house, comprising 40 mM HEPES, 100 mM NaCl, 5 mM KCl, 5 mM MgCl2 at pH 7.5. All steps were performed at room temperature unless otherwise indicated.

1. Preparation of Aptamer Stock Solution

For aptamers without a photo-cleavable biotin linker, custom stock aptamer solutions for 10%, 1%, and 0.03% plasma were prepared at 8x concentration in 1x SB17, 0.05% Tween-20 with the appropriate photo-cleavable, biotinylated primers, where the resultant primer concentration was 3 times that of the relevant aptamer concentration. The primers hybridized to all or part of the corresponding aptamer.

Each of the 3 8x aptamer solutions was diluted separately 1:4 into 1x SB17, 0.05% Tween-20 (1500 μL of 8x stock into 4500 μL of 1x SB17, 0.05% Tween-20) to achieve a 2x concentration. Each diluted aptamer master mix was then split, 1500 μL each, into four 2 mL screw-cap tubes and brought to 95°C for 5 minutes, followed by a 37°C incubation for 15 minutes. After incubation, the four 2 mL tubes corresponding to a particular aptamer master mix were combined into a reagent trough, and 55 μL of the 2x aptamer mix (for all three mixes) was manually pipetted into a 96-well Hybaid plate, and the plate was foil sealed. The final result was three 96-well, foil-sealed Hybaid plates. The individual aptamer concentration was 0.5 nM.
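
The dilution arithmetic in this step can be checked with a short calculation. The following is only an illustrative sketch; the helper name `dilute` and its arguments are assumptions introduced here and are not part of the described protocol.

```python
def dilute(stock_conc, stock_vol_ul, diluent_vol_ul):
    """Return (final concentration, total volume) after adding stock to diluent."""
    total_ul = stock_vol_ul + diluent_vol_ul
    return stock_conc * stock_vol_ul / total_ul, total_ul

# 1500 uL of 8x aptamer stock into 4500 uL of buffer (a 1:4 dilution)
conc_x, total_ul = dilute(8.0, 1500.0, 4500.0)

# The diluted master mix is split into 1500 uL aliquots for heating
tubes = total_ul / 1500.0

print(conc_x, total_ul, tubes)  # 2.0 6000.0 4.0
```

The result agrees with the text: a 2x mix with a total volume that fills exactly four 1500 μL tubes.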

2. Assay Sample Preparation

Frozen aliquots of 100% plasma, stored at -80°C, were placed in a 25°C water bath for 10 minutes. Thawed samples were placed on ice, gently vortexed (set on 4) for 8 seconds, and then replaced on ice.

A 20% sample solution was prepared by transferring 16 μL of sample, using a 50 μL 8-channel spanning pipettor, into 96-well Hybaid plates, each well containing 64 μL of the appropriate sample diluent at 4°C (for plasma: 0.8x SB18, 0.05% Tween-20, 2 μM Z-block_2, 0.6 mM MgCl2). This plate was stored on ice until the next sample dilution steps were initiated.

To commence sample and aptamer equilibration, the 20% sample plate was briefly centrifuged and placed on the Beckman FX, where it was mixed by pipetting up and down with the 96-well pipettor. A 2% sample was then prepared by diluting 10 μL of the 20% sample into 90 μL of 1x SB17, 0.05% Tween-20. Next, 6 μL of the resultant 2% sample was diluted into 194 μL of 1x SB17, 0.05% Tween-20 to make a 0.06% sample plate. Dilutions were done on the Beckman Biomek FxP. After each transfer, the solutions were mixed by pipetting up and down. The 3 sample dilution plates were then transferred to their respective aptamer solutions by adding 55 μL of the sample to 55 μL of the appropriate 2x aptamer mix. The sample and aptamer solutions were mixed on the robot by pipetting up and down.
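
The serial dilutions described above can be verified numerically. The sketch below is illustrative only; the function `mix` and the variable names are assumptions introduced for this check. It confirms that the three plates end at the 10%, 1%, and 0.03% plasma concentrations named for the aptamer mixes.

```python
def mix(conc_pct, vol_ul, diluent_ul):
    """Concentration (percent) after adding vol_ul of stock to diluent_ul of diluent."""
    return conc_pct * vol_ul / (vol_ul + diluent_ul)

# Sample dilution series starting from neat (100%) plasma
pct_20 = mix(100.0, 16.0, 64.0)    # 16 uL sample + 64 uL sample diluent
pct_2 = mix(pct_20, 10.0, 90.0)    # 10 uL of the 20% sample + 90 uL buffer
pct_006 = mix(pct_2, 6.0, 194.0)   # 6 uL of the 2% sample + 194 uL buffer

# Each dilution plate is then mixed 1:1 (55 uL + 55 uL) with its 2x aptamer mix
final_pct = [round(mix(p, 55.0, 55.0), 4) for p in (pct_20, pct_2, pct_006)]
print(final_pct)  # [10.0, 1.0, 0.03]
```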

3. Sample Equilibration Binding

The sample/aptamer plates were foil sealed and placed into a 37°C incubator for 3.5 hours before proceeding to the Catch 1 step.

4. Preparation of Catch 2 Bead Plate

A 5.5 mL aliquot of MyOne (Invitrogen Corp., Carlsbad, CA) Streptavidin C1 beads (10 mg/mL) was washed 2 times with equal volumes of 20 mM NaOH (5 minute incubation for each wash), washed 3 times with equal volumes of 1x SB17, 0.05% Tween-20, and resuspended in 5.5 mL of 1x SB17, 0.05% Tween-20. Using a 12-span multichannel pipettor, 50 μL of this solution was manually pipetted into each well of a 96-well Hybaid plate. The plate was then covered with foil and stored at 4°C for use in the assay.

5. Preparation of Catch 1 Bead Plates

Three 0.45 μm Millipore HV plates (Durapore membrane, Cat# MAHVN4550) were equilibrated with 100 μL of 1x SB17, 0.05% Tween-20 for at least 10 minutes. The equilibration buffer was then filtered through the plate, and 133.3 μL of a 7.5% streptavidin-agarose bead slurry (in 1x SB17, 0.05% Tween-20) was added to each well. To keep the streptavidin-agarose beads suspended while transferring them into the filter plate, the bead solution was manually mixed 15 times with a 200 μL, 12-channel pipettor. After the beads were distributed across the 3 filter plates, a vacuum was applied to remove the bead supernatant. Finally, the beads were washed in the filter plates with 200 μL of 1x SB17, 0.05% Tween-20 and then resuspended in 200 μL of 1x SB17, 0.05% Tween-20. The bottoms of the filter plates were blotted, and the plates were stored for use in the assay.

6. Loading the Cytomat

The cytomat was loaded with all tips, plates, all reagents in troughs (except the NHS-biotin reagent, which was prepared fresh right before addition to the plates), 3 prepared Catch 1 filter plates, and 1 prepared MyOne plate.

7. Catch 1

After the 3.5 hour equilibration time, the sample/aptamer plates were removed from the incubator, centrifuged for about 1 minute, had their foil removed, and were placed on the deck of the Beckman Biomek FxP. The Beckman Biomek FxP program was initiated. All subsequent steps in Catch 1 were performed by the Beckman Biomek FxP robot unless otherwise noted. Within the program, vacuum was applied to the Catch 1 filter plates to remove the bead supernatant. One hundred microliters of each of the 10%, 1%, and 0.03% equilibration binding reactions was added to its respective Catch 1 filter plate, and each plate was mixed using an on-deck orbital shaker at 800 rpm for 10 minutes.

Unbound solution was removed via vacuum filtration. The Catch 1 beads were washed with 190 μL of 100 μM biotin in 1x SB17, 0.05% Tween-20, followed by 190 μL of 1x SB17, 0.05% Tween-20, by dispensing the solution and immediately drawing a vacuum to filter the solution through the plate.

Next, 190 μL of 1x SB17, 0.05% Tween-20 was added to the Catch 1 plates. The plates were blotted to remove droplets using an on-deck blot station and then incubated with orbital shakers at 800 rpm for 10 minutes at 25°C.

The robot removed this wash via vacuum filtration and blotted the bottom of the filter plate to remove droplets using the on-deck blot station.

8. Tagging

An NHS-PEO4-biotin aliquot was thawed at 37°C for 6 minutes and then diluted 1:100 with tagging buffer (SB17 at pH 7.25, 0.05% Tween-20). The NHS-PEO4-biotin reagent had been dissolved at 100 mM concentration in anhydrous DMSO and stored frozen at -20°C. Upon a robot prompt, the diluted NHS-PEO4-biotin reagent was manually added to an on-deck trough, and the robot program was manually re-initiated to dispense 100 μL of the NHS-PEO4-biotin into each well of each Catch 1 filter plate. This solution was allowed to incubate with the Catch 1 beads, shaking at 800 rpm for 5 minutes on the orbital shakers.

9. Kinetic Challenge and Photo-cleavage

The tagging reaction was quenched by the addition of 150 μL of 20 mM glycine in 1x SB17, 0.05% Tween-20 to the Catch 1 plates while they still contained the NHS tag. The plates were then incubated for 1 minute on orbital shakers at 800 rpm. The NHS-tag/glycine solution was removed via vacuum filtration. Next, 190 μL of 20 mM glycine (in 1x SB17, 0.05% Tween-20) was added to each plate and incubated for 1 minute on orbital shakers at 800 rpm before removal by vacuum filtration.

190 μL of 1x SB17, 0.05% Tween-20 was added to each plate and removed by vacuum filtration.

The wells of the Catch 1 plates were subsequently washed three times by adding 190 μL of 1x SB17, 0.05% Tween-20, placing the plates on orbital shakers for 1 minute at 800 rpm, and then vacuum filtering. After the last wash, the plates were placed on top of a 1 mL deep-well plate and removed from the deck. The Catch 1 plates were centrifuged at 1000 rpm for 1 minute to remove as much extraneous volume from the agarose beads as possible before elution.

The plates were placed back onto the Beckman Biomek FxP, and 85 μL of 10 mM DxSO4 in 1x SB17, 0.05% Tween-20 was added to each well of the filter plates.

The filter plates were removed from the deck, placed onto a Variomag Thermoshaker (Thermo Fisher Scientific, Inc., Waltham, MA) under the BlackRay (Ted Pella, Inc., Redding, CA) light sources, and irradiated for 10 minutes while shaking at 800 rpm.

The photo-cleaved solutions were sequentially eluted from each Catch 1 plate into a common deep-well plate by first placing the 10% Catch 1 filter plate on top of a 1 mL deep-well plate and centrifuging at 1000 rpm for 1 minute. The 1% and 0.03% Catch 1 plates were then sequentially centrifuged into the same deep-well plate.

10. Catch 2 Bead Capture

The 1 mL deep-well block containing the combined Catch 1 eluates was placed on the deck of the Beckman Biomek FxP for Catch 2.

The robot transferred all of the photo-cleaved eluate from the 1 mL deep-well plate onto the Hybaid plate containing the previously prepared Catch 2 MyOne magnetic beads (after removal of the MyOne buffer via magnetic separation).

The solution was incubated while shaking at 1350 rpm for 5 minutes at 25°C on a Variomag Thermoshaker (Thermo Fisher Scientific, Inc., Waltham, MA).

The robot transferred the plate to the on-deck magnetic separator station. The plate was incubated on the magnet for 90 seconds before removal and discarding of the supernatant.

11. 37°C 30% Glycerol Washes

The Catch 2 plate was moved to the on-deck thermal shaker, and 75 μL of 1x SB17, 0.05% Tween-20 was transferred to each well. The plate was mixed for 1 minute at 1350 rpm and 37°C to resuspend and warm the beads. To each well of the Catch 2 plate, 75 μL of 60% glycerol at 37°C was transferred, and the plate continued to mix for another minute at 1350 rpm and 37°C. The robot transferred the plate to the 37°C magnetic separator, where it was incubated on the magnet for 2 minutes, and then the robot removed and discarded the supernatant. These washes were repeated two more times.

After removal of the third 30% glycerol wash from the Catch 2 beads, 150 μL of 1x SB17, 0.05% Tween-20 was added to each well and incubated at 37°C, shaking at 1350 rpm for 1 minute, before removal by magnetic separation on the 37°C magnet.

The Catch 2 beads were washed a final time using 150 μL of 1x SB19, 0.05% Tween-20 with incubation for 1 minute at 25°C while shaking at 1350 rpm, prior to magnetic separation.

12. Catch 2 Bead Elution and Neutralization

The aptamers were eluted from the Catch 2 beads by adding 105 μL of 100 mM CAPSO with 1 M NaCl, 0.05% Tween-20 to each well. The beads were incubated with this solution with shaking at 1350 rpm for 5 minutes.

The Catch 2 plate was then placed onto the magnetic separator for 90 seconds prior to transferring 90 μL of the eluate to a new 96-well plate containing 10 μL of 500 mM HCl, 500 mM HEPES, 0.05% Tween-20 in each well. After transfer, the solution was mixed robotically by pipetting 90 μL up and down five times.

13. Hybridization

The Beckman Biomek FxP transferred 20 μL of the neutralized Catch 2 eluate to a fresh Hybaid plate, and 5 μL of 10x Agilent Block, containing a 10x spike of hybridization controls, was added to each well. Next, 25 μL of 2x Agilent Hybridization buffer was manually pipetted into each well of the plate containing the neutralized samples and blocking buffer, and the solution was mixed by manually pipetting 25 μL up and down 15 times, slowly, to avoid extensive bubble formation. The plate was centrifuged at 1000 rpm for 1 minute.

A gasket slide was placed into an Agilent hybridization chamber, and 40 μL of each of the samples containing hybridization and blocking solution was manually pipetted into each gasket. An 8-channel variable spanning pipettor was used in a manner intended to minimize bubble formation. Custom Agilent microarray slides (Agilent Technologies, Inc., Santa Clara, CA), with their barcode facing up, were then slowly lowered onto the gasket slides (see the Agilent manual for a detailed description).

The top of the hybridization chamber was placed onto the slide/backing sandwich, and the clamping bracket was slid over the whole assembly. The assemblies were tightly clamped by turning the screws securely.

Each slide/backing slide sandwich was visually inspected to ensure that the solution bubble could move freely within the sample. If the bubble did not move freely, the hybridization chamber assembly was gently tapped to disengage bubbles lodged near the gasket.

The assembled hybridization chambers were incubated in an Agilent hybridization oven for 19 hours at 60°C, rotating at 20 rpm.

14. Post-Hybridization Washing

Approximately 400 mL of Agilent Wash Buffer 1 was placed into each of two separate glass staining dishes. One of the staining dishes was placed on a magnetic stir plate, and a slide rack and stir bar were placed into the buffer.

A staining dish for Agilent Wash 2 was prepared by placing a stir bar into an empty glass staining dish.

A fourth glass staining dish was set aside for the final acetonitrile wash.

Each of the six hybridization chambers was disassembled. One by one, the slide/backing sandwich was removed from its hybridization chamber and submerged in the staining dish containing Wash 1. The slide/backing sandwich was pried apart using a pair of tweezers while the microarray slide was kept submerged. The slide was quickly transferred into the slide rack in the Wash 1 staining dish on the magnetic stir plate.

The slide rack was gently raised and lowered 5 times. The magnetic stirrer was turned on at a low setting, and the slides were incubated for 5 minutes.

When 1 minute remained for Wash 1, Wash Buffer 2, pre-warmed to 37°C in an incubator, was added to the second prepared staining dish. The slide rack was quickly transferred to Wash Buffer 2, and any excess buffer on the bottom of the rack was removed by scraping it on the top of the staining dish. The slide rack was gently raised and lowered 5 times. The magnetic stirrer was turned on at a low setting, and the slides were incubated for 5 minutes.

The slide rack was slowly pulled out of Wash 2, taking approximately 15 seconds to remove the slides from the solution.

With 1 minute remaining in Wash 2, acetonitrile (ACN) was added to the fourth staining dish. The slide rack was transferred to the acetonitrile staining dish. The slide rack was gently raised and lowered 5 times. The magnetic stirrer was turned on at a low setting, and the slides were incubated for 5 minutes.

The slide rack was slowly pulled out of the ACN staining dish and placed on an absorbent towel. The bottom edges of the slides were quickly dried, and the slides were placed into a clean slide box.

15. Microarray Imaging

The microarray slides were placed into Agilent scanner slide holders and loaded into the Agilent microarray scanner according to the manufacturer's instructions.

The slides were imaged in the Cy3 channel at 5 μm resolution at the 100% PMT setting, with the XRD option enabled at 0.05. The resulting tiff images were processed using Agilent Feature Extraction software version 10.5.

Example 2. Biomarker Identification

The identification of potential pancreatic cancer biomarkers was performed for the diagnosis of pancreatic cancer in asymptomatic individuals and in symptomatic individuals with acute or chronic pancreatitis (or both), pancreatic obstruction, GERD, gallstones, or abnormal imaging later found to be benign (collectively referred to as GI and normal controls). Enrollment criteria for this study were age 18 years or older, ability to give informed consent, and provision of a plasma sample together with a documented diagnosis of pancreatic cancer or benign findings. For cases, blood samples were collected prior to treatment or surgery, and the individuals were subsequently diagnosed with pancreatic cancer. Exclusion criteria included prior diagnosis or treatment of cancer (excluding squamous cell carcinoma of the skin) within 5 years of the blood draw. Plasma samples were collected from 2 different sites and included 143 pancreatic cancer samples and 115 control group samples. The multiplexed aptamer affinity assay described in Example 1 was used to measure and report the RFU value for 823 analytes in each of these 258 samples. Because the plasma samples were obtained from 2 independent studies and sites under similar protocols, an examination of site differences was performed prior to the biomarker discovery analysis.

The case and control populations were compared by generating class-dependent cumulative distribution functions (cdfs) for each of the 823 analytes. The KS-distance (Kolmogorov-Smirnov statistic) between values from two sets of samples is a non-parametric measure of the extent to which the empirical distribution of the values from one set (Set A) differs from the distribution of values from the other set (Set B). For any value of a threshold T, some proportion of the values from Set A will be less than T, and some proportion of the values from Set B will be less than T. The KS-distance measures the maximum (unsigned) difference between the proportions of the values from the two sets over all choices of T.
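
As an illustration of the definition above, the KS-distance between two sets of values can be computed directly from their empirical cdfs. This is a minimal sketch, not the analysis code used in the study; the function name `ks_distance` is an assumption introduced here. It evaluates the proportion of values at or below each observed data point, which gives the same maximum as the strict "less than T" wording in the text.

```python
from bisect import bisect_right

def ks_distance(set_a, set_b):
    """Maximum unsigned difference between the empirical cdfs of two samples."""
    a_sorted, b_sorted = sorted(set_a), sorted(set_b)
    largest = 0.0
    # The maximum difference is always attained at one of the observed values,
    # so only those thresholds T need to be examined.
    for t in a_sorted + b_sorted:
        frac_a = bisect_right(a_sorted, t) / len(a_sorted)  # proportion of A <= T
        frac_b = bisect_right(b_sorted, t) / len(b_sorted)  # proportion of B <= T
        largest = max(largest, abs(frac_a - frac_b))
    return largest

print(ks_distance([1, 2, 3], [4, 5, 6]))        # 1.0 (no overlap)
print(ks_distance([1, 2, 3, 4], [3, 4, 5, 6]))  # 0.5 (partial overlap)
```

A distance near 1 indicates an analyte whose values separate the two groups almost completely, while a distance near 0 indicates nearly identical distributions.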

This set of potential biomarkers can be used to build classifiers that assign samples to either a control or a disease group. In fact, many such classifiers were produced from these sets of biomarkers, and the frequency with which any biomarker was used in good scoring classifiers was determined. Those biomarkers that occurred most frequently among the top scoring classifiers were the most useful for creating a diagnostic test. In this example, Bayesian classifiers were used to explore the classification space, but many other supervised learning techniques may be employed for this purpose. The scoring fitness of any individual classifier was gauged by the area under the receiver operating characteristic curve (AUC of the ROC) of the classifier at the Bayesian surface, assuming a disease prevalence of 0.5. This scoring metric varies from zero to one, with one being an error-free classifier. The details of constructing a Bayesian classifier from biomarker population measurements are described in Example 3.
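
For readers unfamiliar with the scoring metric, the AUC can be read as the probability that a randomly chosen disease sample receives a higher classifier score than a randomly chosen control sample. The sketch below computes it by direct pairwise comparison; it is an illustration under that standard interpretation, and the function name `auc` and the toy scores are assumptions, not values from the study.

```python
def auc(disease_scores, control_scores):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    wins = 0.0
    for d in disease_scores:
        for c in control_scores:
            if d > c:
                wins += 1.0
            elif d == c:
                wins += 0.5
    return wins / (len(disease_scores) * len(control_scores))

print(auc([3, 4], [1, 2]))  # 1.0, error-free separation
print(auc([1, 3], [2, 4]))  # 0.25, worse than random
```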

Using the 65 analytes in Table 1, a total of 973 10-analyte classifiers were found with an AUC of 0.90 for diagnosing pancreatic cancer from the control group. From this set of classifiers, a total of 11 biomarkers were found to be present in 30% or more of the high scoring classifiers. Table 13 provides a list of these potential biomarkers, and Figure 10 is a frequency plot for the identified biomarkers.
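
The frequency tally described above can be sketched as follows. The function name `frequent_markers`, the 30% default, and the toy panels with marker labels "A" through "E" are hypothetical and serve only to show the counting step; they are not data from Table 13.

```python
from collections import Counter

def frequent_markers(classifiers, min_fraction=0.30):
    """Return markers present in at least min_fraction of the given classifier panels."""
    # set(panel) ensures a marker is counted once per classifier
    counts = Counter(marker for panel in classifiers for marker in set(panel))
    total = len(classifiers)
    return sorted(m for m, k in counts.items() if k / total >= min_fraction)

# Hypothetical high-scoring panels (two markers each, for brevity)
panels = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "E")]
print(frequent_markers(panels))  # ['A', 'B']
```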

Example 3. Naive Bayesian Classification for Pancreatic Cancer

From the list of biomarkers identified as useful for discriminating between pancreatic cancer and controls, a panel of ten biomarkers was selected and a naive Bayes classifier was constructed; see Table 16. The class-dependent probability density functions (pdfs), $p(x_i \mid c)$ and $p(x_i \mid d)$, where $x_i$ is the log of the measured RFU value for biomarker $i$, and $c$ and $d$ refer to the control and disease populations, were modeled as log-normal distribution functions characterized by a mean $\mu$ and variance $\sigma^2$. The parameters for the pdfs of the ten biomarkers are listed in Table 16, and an example of the raw data along with the model fit to a normal pdf is displayed in Figure 5. The underlying assumption appears to fit the data quite well, as evidenced by Figure 5.
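
A minimal sketch of how such a classifier can be evaluated is given below, assuming each class-dependent pdf is a normal density in log(RFU). The function names and the example parameters are hypothetical (the actual parameters are those of Table 16, which are not reproduced here), and the functions take the standard deviation σ rather than the variance σ².

```python
import math

def log_normal_pdf(x, mu, sigma):
    """Natural log of a normal density at x, where x is already log(RFU)."""
    return -math.log(sigma * math.sqrt(2.0 * math.pi)) - 0.5 * ((x - mu) / sigma) ** 2

def log_likelihood_ratio(x, disease_params, control_params):
    """Sum over markers of ln p(x_i|d) - ln p(x_i|c); params are (mu, sigma) pairs."""
    return sum(
        log_normal_pdf(xi, *d) - log_normal_pdf(xi, *c)
        for xi, d, c in zip(x, disease_params, control_params)
    )

# Hypothetical single-marker example: the sample sits at the disease mean
llr = log_likelihood_ratio([10.0], [(10.0, 1.0)], [(8.0, 1.0)])
print(round(llr, 6))  # 2.0, a positive value favors the disease class
```

Under this sign convention a positive total favors the disease class and a negative total favors the control class; the naive Bayes assumption is what allows the per-marker terms to be summed.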

The naive Bayes classification for such a model is given by a log-likelihood-ratio equation in which P(d) is the prevalence of the disease in the population appropriate to the test, and n = 10. Each of the terms in the summation is a log-likelihood ratio for an individual marker, and the total log-likelihood ratio of a sample being free from the disease of interest (i.e., in this case, pancreatic cancer) versus having the disease is simply the sum of these individual terms plus a term that accounts for the prevalence of the disease. For simplicity, we assume P(d) = 0.5, so that the prevalence term is zero.

Given an unknown sample measurement in log(RFU) for each of the ten biomarkers of 6.3, 9.3, 8.7, 10.8, 7.4, 11.4, 11.7, 9.0, 8.0, and 7.3, the calculation of the classification is detailed in Table 16. The individual components comprising the log likelihood ratio for the disease versus control class are tabulated and can be computed from the parameters in Table 16 and the measured values. The sum of the individual log likelihood ratios is -3.044, or a likelihood of being free from the disease versus having the disease of 21, where likelihood e^3.044 = 21. The first 3 biomarker values have likelihoods more consistent with the disease group (log likelihood > 0), but the remaining 7 biomarkers are all consistently found to favor the control group. Multiplying the likelihoods together gives the same result as that shown above: a likelihood of 21 that the unknown sample is free from the disease. In fact, this sample came from the control population in the training set.
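
The classification equation itself is not reproduced in this text. A standard form that is consistent with the description and with the sign convention of the worked example (positive values favor disease) is sketched below; it is a reconstruction under the assumption of normal pdfs in log(RFU), and the symbols $\mu_{c,i}$, $\sigma_{c,i}$, $\mu_{d,i}$, $\sigma_{d,i}$ are introduced here for the per-marker control and disease parameters.

```latex
\ln\frac{p(d \mid \tilde{x})}{p(c \mid \tilde{x})}
  = \sum_{i=1}^{n} \ln\frac{p(x_i \mid d)}{p(x_i \mid c)}
  + \ln\frac{P(d)}{1 - P(d)},
\qquad
\ln\frac{p(x_i \mid d)}{p(x_i \mid c)}
  = \ln\frac{\sigma_{c,i}}{\sigma_{d,i}}
  - \frac{1}{2}\left[
      \left(\frac{x_i - \mu_{d,i}}{\sigma_{d,i}}\right)^{2}
    - \left(\frac{x_i - \mu_{c,i}}{\sigma_{c,i}}\right)^{2}
    \right]
```

With $P(d) = 0.5$ the prevalence term is $\ln 1 = 0$. The log likelihood ratio for being free from the disease versus having it is the negative of the left-hand side, so a sum of $-3.044$ corresponds to a likelihood ratio of $e^{3.044} \approx 21$ in favor of the control class.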

Example 4. Greedy Algorithm for Selecting Biomarker Panels for Classifiers

This example describes the selection of biomarkers from Table 1 to form panels that can be used as classifiers in any of the methods described herein. Subsets of the biomarkers in Table 1 were selected to construct classifiers with good performance. This method was also used to determine which potential markers were included as biomarkers in Example 2.

The measure of classifier performance used here is the AUC. A performance of 0.5 is the baseline expectation for a random (coin toss) classifier; a classifier worse than random scores between 0.0 and 0.5, and a classifier better than random scores between 0.5 and 1.0. A perfect classifier with no errors has a sensitivity of 1.0 and a specificity of 1.0. The methods of Example 4 can be applied to other common measures of performance, such as the F-measure, the sum of sensitivity and specificity, or the product of sensitivity and specificity. In particular, one might want to weight sensitivity and specificity differently, so as to select classifiers that achieve higher specificity at the expense of some sensitivity, or classifiers that achieve higher sensitivity at the expense of some specificity. Because the method described here involves only a single measure of "performance," any weighting scheme that results in a single performance measure can be used. Different applications will have different benefits for true positive and true negative findings, and different costs associated with false positive and false negative findings. For example, screening asymptomatic high-risk individuals and the differential diagnosis of pancreatic cancer from benign GI symptoms will not, in general, have the same optimal trade-off between specificity and sensitivity. The different demands of the two tests will generally require different weightings for positive and negative misclassifications, reflected in the performance measure. Changing the performance measure will generally change the exact subset of markers selected from column 2 of Table 1 for a given data set.
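The performance measures named above can be illustrated with a short sketch. This is not the implementation used in the study; it assumes that each sample has already been reduced to a single classifier score (higher meaning more disease-like), and the function names and the weight `w_sens` are hypothetical.

```python
def auc(scores_disease, scores_control):
    """AUC as the probability that a random disease sample scores
    above a random control sample (ties count as one half)."""
    wins = 0.0
    for d in scores_disease:
        for c in scores_control:
            if d > c:
                wins += 1.0
            elif d == c:
                wins += 0.5
    return wins / (len(scores_disease) * len(scores_control))


def sens_spec(scores_disease, scores_control, cutoff):
    """Sensitivity and specificity when scores >= cutoff are called positive."""
    sens = sum(s >= cutoff for s in scores_disease) / len(scores_disease)
    spec = sum(s < cutoff for s in scores_control) / len(scores_control)
    return sens, spec


def weighted_performance(sens, spec, w_sens=0.5):
    """One possible single-number measure: a weighted sum of sensitivity
    and specificity. w_sens is a hypothetical weight in [0, 1]."""
    return w_sens * sens + (1.0 - w_sens) * spec
```

Lowering `w_sens` favors classifiers with higher specificity, which is the kind of re-weighting the paragraph above describes for screening versus differential diagnosis.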

For the Bayesian approach to discriminating pancreatic cancer samples from control samples described in Example 3, the classifier was completely parameterized by the distributions of the biomarkers in the disease and benign training samples, and the list of biomarkers was chosen from Table 1. That is, given a set of training data, the subset of markers chosen for inclusion determined the classifier in a one-to-one manner.

The greedy method employed here was used to search for the optimal subset of markers from Table 1. For small numbers of markers, or for classifiers with relatively few markers, every possible subset of markers was enumerated and evaluated in terms of the performance of the classifier constructed with that particular set of markers (see Example 4, Part 2). (This approach is well known in the field of statistics as "best subset selection"; see, e.g., Hastie et al.) However, for the classifiers described herein, the number of combinations of multiple markers can be very large, and it was not feasible to evaluate every possible set of 10 markers, since 30,045,015 possible combinations can be generated from a list of only 30 total analytes. Because searching through every subset of markers is impractical, the single optimal subset may not be found; however, by using this approach, many excellent subsets were found, and in many cases any of these subsets may represent an optimal one.
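The combinatorial figures behind this argument can be checked directly. The snippet below is only an arithmetic illustration; the value 65 is the number of markers the text attributes to Table 1, and the variable names are ours.

```python
from math import comb

# Number of distinct 10-marker panels that can be drawn from 30 analytes.
print(comb(30, 10))  # 30045015

# Exhaustive enumeration stays tractable for small panels drawn from 65 markers.
for size in (1, 2, 3):
    print(size, comb(65, size))  # 65, 2080, 43680
```

This is why one-, two-, and three-marker classifiers can be enumerated exhaustively while larger panels call for a heuristic search.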

Instead of evaluating every possible set of markers, a "greedy" forward stepwise approach may be followed (see, e.g., Dabney AR, Storey JD (2007) Optimality Driven Nearest Centroid Classification from Genomic Data. PLoS ONE 2(10): e1002. doi:10.1371/journal.pone.0001002). Using this method, a classifier is started with the best single marker (based on the KS-distance of the individual markers) and is grown at each step by trying, in turn, each member of the marker list that is not currently in the classifier's set of markers. The one marker that scores best in combination with the existing classifier is added to the classifier. This is repeated until no further improvement in performance is achieved. Unfortunately, this approach may miss valuable combinations of markers in which some of the individual markers are not all chosen before the process stops.
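A minimal sketch of forward stepwise selection follows. It is a simplification under stated assumptions: `score` is a hypothetical callable that returns a performance value (for example an AUC) for any tuple of markers, and the first marker is chosen by that same score rather than by KS-distance.

```python
def forward_stepwise(markers, score):
    """Greedy forward selection.

    markers: list of marker names.
    score:   hypothetical callable mapping a tuple of markers to a
             performance value (higher is better).
    """
    selected = []
    best = float("-inf")
    while True:
        candidates = [m for m in markers if m not in selected]
        if not candidates:
            break
        # Score every one-marker extension of the current classifier.
        scored = [(score(tuple(selected + [m])), m) for m in candidates]
        top_score, top_marker = max(scored)
        if top_score <= best:
            break  # no further improvement, so stop
        selected.append(top_marker)
        best = top_score
    return selected, best
```

Because only one candidate classifier survives each step, a pair of markers that is strong together but weak individually can be missed, which is the limitation noted above.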

The greedy method used here was an elaboration of the preceding forward stepwise approach: to broaden the search, rather than keeping just a single candidate classifier (marker subset) at each step, a list of candidate classifiers was kept. The list was seeded with every single-marker subset (using every marker in the table on its own). The list was expanded in steps by deriving new classifiers (marker subsets) from the ones currently on the list and adding them to the list. Each marker subset currently on the list was extended by adding any marker from Table 1 that was not already part of that classifier and whose addition would not duplicate an existing subset (these are termed "permissible markers"). Every existing marker subset was extended by every permissible marker from the list. Clearly, such a process would eventually generate every possible subset, and the list would run out of space. Therefore, all generated classifiers were kept only while the list was smaller than some predetermined size (often large enough to hold all three-marker subsets). Once the list reached the predetermined size limit, it became elitist; that is, only those classifiers that showed a certain level of performance were kept on the list, and the others fell off the end of the list and were discarded. This was achieved by keeping the list sorted in order of classifier performance; new classifiers that were at least as good as the worst classifier currently on the list were inserted, forcing the expulsion of the current bottom underachiever. One further implementation detail is that the list was completely replaced at each generational step; therefore, every classifier on the list had the same number of markers, and at each step the number of markers per classifier grew by one.
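The list-based search described above can be sketched as follows. This is an illustrative simplification, not the procedure as run in the study: the list is truncated to `list_limit` entries at every generation, and `score`, `max_size`, and `list_limit` are hypothetical names.

```python
def elitist_search(markers, score, max_size, list_limit):
    """Keep the best `list_limit` marker subsets at each size,
    growing every kept subset by one marker per generation."""
    # Seed the list with every single-marker subset.
    current = {frozenset([m]) for m in markers}
    history = {}
    for size in range(1, max_size + 1):
        # Sort by performance (best first); names break ties deterministically.
        ranked = sorted(current, key=lambda s: (-score(s), sorted(s)))
        ranked = ranked[:list_limit]  # elitism: the worst fall off the end
        history[size] = [(sorted(s), score(s)) for s in ranked]
        if size == max_size:
            break
        # Replace the list: extend each kept subset by each permissible marker.
        # Using a set of frozensets removes duplicate subsets automatically.
        current = {s | {m} for s in ranked for m in markers if m not in s}
    return history
```

Keeping many candidates per generation lets the search recover combinations that a single-path forward selection would discard early.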

Since this method produces a list of candidate classifiers using different combinations of markers, one may ask whether the classifiers can be combined to avoid errors that might be made by the best single classifier or by minority groups of the best classifiers. Such "ensemble" and "committee of experts" methods are well known in the fields of statistics and machine learning and include, for example, "averaging," "voting," "stacking," "bagging," and "boosting" (see, e.g., Hastie et al.). These combinations of simple classifiers provide a way to reduce the variance in classification caused by noise in any particular set of markers, by including several different classifiers and therefore information from a larger set of markers from the biomarker table, effectively averaging across classifiers. An example of the usefulness of this approach is that it can prevent outliers in a single marker from adversely affecting the classification of a single sample. The requirement to measure a larger number of signals may be impractical in conventional "one marker at a time" antibody assays, but it has no downside for a fully multiplexed aptamer assay. These techniques benefit from a more extensive table of biomarkers and use multiple sources of information about the disease process to provide a more robust classification.

The biomarkers selected in Table 1 gave rise to classifiers that performed better than classifiers built with "non-markers" (i.e., proteins whose signals did not meet the criteria for inclusion in Table 1, as described in Example 2).

For classifiers containing only one, two, and three markers, all possible classifiers obtained using the biomarkers of Table 1 were enumerated, and the distribution of their performance was examined in comparison with classifiers built from a similar table of randomly selected non-marker signals.

In Figure 11, the AUC was used as the measure of performance; a performance of 0.5 is the baseline expectation for a random (coin toss) classifier. The histogram of classifier performance was compared with the histogram of performance from a similar exhaustive enumeration of classifiers built from a "non-marker" table of 65 non-marker signals; the 65 signals were randomly chosen from aptamers that did not demonstrate differential signaling between the control and disease populations.

Figure 11 shows histograms of the performance of all possible one-, two-, and three-marker classifiers built from the biomarker parameters in Table 14 for biomarkers that can discriminate between the control group and pancreatic cancer, and compares these classifiers with all possible one-, two-, and three-marker classifiers built using the 65 "non-marker" aptamer RFU signals. Figure 11A shows the histograms of single-marker classifier performance, Figure 11B shows the histograms of two-marker classifier performance, and Figure 11C shows the histograms of three-marker classifier performance.

In Figure 11, the solid lines represent the histograms of classifier performance for all one-, two-, and three-marker classifiers using the biomarker data in Table 14 for GI and normal controls and pancreatic cancer. The dashed lines are the histograms of classifier performance for all one-, two-, and three-marker classifiers using the data for controls and pancreatic cancer but using the set of random non-marker signals.

The classifiers built from the markers listed in Table 1 formed a distinct histogram, well separated from the classifiers built with signals from the "non-markers," for all one-, two-, and three-marker comparisons. The performance and AUC score of the classifiers built from the biomarkers in Table 1 also increased faster with the number of markers than those of the classifiers built from non-markers, and the separation between marker and non-marker classifiers increased as the number of markers per classifier increased. All classifiers built using the biomarkers listed in Table 14 performed better than classifiers built using the "non-markers."

The distributions of classifier performance show that many possible multi-marker classifiers can be derived from the set of analytes in Table 1. Although some biomarkers are better than others on their own, as evidenced by the distribution of classifier scores and AUCs for single analytes, it was desirable to determine whether such biomarkers are required to construct high-performing classifiers. To make this determination, the behavior of classifier performance was examined by leaving out some number of the best biomarkers. Figure 12 compares the performance of classifiers built with the full list of biomarkers in Table 1 with the performance of classifiers built with subsets of biomarkers from Table 1 that excluded top-ranked markers.

Figure 12 demonstrates that classifiers constructed without the best markers perform well, implying that classifier performance was not determined by some small core group of markers and that the changes in the underlying processes associated with disease are reflected in the activities of many proteins. Many subsets of the biomarkers in Table 1 performed close to optimally, even after removal of the top 15 of the 65 markers in Table 1. After the 15 top-ranked markers (ranked by KS-distance) were dropped from Table 1, classifier performance increased with the number of markers selected from the table to reach an AUC of almost 0.87, close to the optimal classifier score of 0.91 obtained when selecting from the full list of biomarkers.

Finally, Figure 16 shows the ROC performance of a typical classifier constructed from the list of parameters in Table 14 according to Example 3. A five-analyte classifier was constructed with CTSB, C5a, C5, CCL18, and CSF1R. Figure 16A shows the performance of the model, assuming independence of these markers as in Example 3, and Figure 16B shows the empirical ROC curve generated from the study data set used to define the parameters in Table 14. It can be seen that the performance for a given number of selected markers was in qualitative agreement, and that quantitative agreement was generally quite good, as evidenced by the AUCs, although the model calculation tends to overestimate classifier performance. This is consistent with the notion that the information contributed by any particular biomarker about the disease process is redundant with the information contributed by the other biomarkers provided in Table 1, whereas the model calculation assumes complete independence. Figure 16 thus demonstrates that Table 1, in combination with the methods described in Example 3, enables the construction and evaluation of a great many classifiers useful for discriminating pancreatic cancer from the control group.

Example 5. Incorporation of CA19-9

Cancer-associated antigen 19-9 (CA19-9) is a known serum marker for pancreatic cancer. The sensitivity and specificity of CA19-9 for pancreatic cancer are each reported to be 80-90%. However, the accuracy of CA19-9 in identifying patients with small, surgically resectable cancers is limited. The specificity of CA19-9 is also limited; CA19-9 is frequently elevated in patients with various benign pancreaticobiliary disorders.

The degree of CA19-9 elevation in pancreatic cancer is associated with long-term prognosis. Furthermore, in patients who appear to have potentially resectable disease, the magnitude of the CA19-9 level can also help predict the presence of radiographically occult metastatic disease. Serial monitoring of CA19-9 levels is useful for following patients after potentially curative surgery and patients receiving chemotherapy for advanced disease. Rising CA19-9 levels usually precede the radiographic appearance of recurrent disease, but confirmation of disease progression should be pursued with imaging studies and/or biopsy. Detection of biomarker levels in combination with CA19-9 may improve the sensitivity, specificity, and/or AUC for detecting pancreatic cancer (or for other pancreatic cancer-related uses) compared with CA19-9 alone.

An elevated CA19-9 level is considered to be 35-40 U/ml in serum.

We received clinical CA19-9 measurements for a subset of the training samples. Of the original 100 cases and 69 controls, we had CA19-9 measurements for 99 cases and 52 controls. We therefore trained a new set of random forest models on this subset of samples using a subset of the SOMAmers in Table 1. We also trained a new classifier that incorporated the CA19-9 measurements together with our SOMAmer panel (the combined panel).

The classifier performance of the three different approaches (SOMAmer panel, CA19-9, and combined panel) is shown in Figure 13. The SOMAmer panel and CA19-9 performed similarly; however, when the two were combined into a single classifier, performance improved markedly. At a specificity of 100%, the SOMAmer panel and CA19-9 each had a sensitivity just under 50%, while the combined classifier had a sensitivity of approximately 75%.

Further analysis showed that when CA19-9 was included in the classifier, the number of SOMAmers required for the same relative performance was reduced. Figure 14 shows the performance of random forest classifiers using CA19-9 and one or two additional SOMAmers. The left panel shows the performance of a model trained with CA19-9 and HAMP, while the right panel shows the performance of CA19-9, HAMP, and CTSB.

Example 6. Clinical Biomarker Panel

A random forest classifier was built from a panel of selected biomarkers that may be the most suitable for use in a clinical diagnostic test. Unlike the models selected by the naive Bayes greedy forward algorithm, the random forest classifier does not assume that the biomarkers are randomly distributed. This model can therefore make use of biomarkers from Table 1 that are not effective in the naive Bayes classifier.

The panel was selected by a backward elimination procedure that used the Gini importance measure provided by the random forest classifier. Gini importance is a measure of how effective a biomarker is at correctly classifying the samples in the training set. This measure of biomarker importance can be used to eliminate markers that are less important to the performance of the classifier. The backward elimination procedure began by building a random forest classifier that included all 65 biomarkers in Table 1. The least important biomarker was then eliminated, and a new model was built with the remaining biomarkers. This process continued until only a single biomarker remained.
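The backward elimination loop can be sketched generically. In this sketch, `fit` is a hypothetical stand-in for training a random forest on the given markers and returning its performance together with a per-marker importance (for example, Gini importance); no particular library is assumed.

```python
def backward_elimination(markers, fit):
    """Drop the least important marker one at a time.

    fit: hypothetical callable taking a list of markers and returning
         (performance, {marker: importance}) for a model trained on them.
    Returns the elimination path as (markers_used, performance) pairs.
    """
    remaining = list(markers)
    path = []
    while remaining:
        performance, importance = fit(remaining)
        path.append((list(remaining), performance))
        if len(remaining) == 1:
            break  # stop once a single biomarker is left
        least = min(remaining, key=lambda m: importance[m])
        remaining.remove(least)
    return path
```

The returned path records performance at every panel size, so a panel can then be chosen where performance is still near its maximum and the marker count is small.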

The final panel selected provided the best balance between the greatest AUC and the lowest number of markers in the model. The panel of 10 biomarkers that satisfied these criteria is composed of the following analytes: APOA1, CTSB, C2, MMP7, HAMP, TFPI, C5, C5a, SFRP1, and ETHE1. A plot of the ROC curve for this biomarker panel is shown in Figure 15. The plot shows two possible decision cutoffs, indicated by arrows: a symptomatic cutoff, at which a sensitivity of 84% or more can be obtained together with a specificity of at least 80%; and an asymptomatic cutoff, at which a specificity of 97.5% can be obtained together with a sensitivity of at least 60%.
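Choosing a decision cutoff for a given use case amounts to picking a point on the ROC curve subject to a constraint. The sketch below assumes each sample has a single classifier score (higher meaning more disease-like); the function names are hypothetical and the scores in the usage note are toy values, not study data.

```python
def roc_points(scores_disease, scores_control):
    """(cutoff, sensitivity, specificity) for each distinct score used as a
    cutoff, calling a sample positive when its score >= cutoff."""
    points = []
    for cutoff in sorted(set(scores_disease) | set(scores_control)):
        sens = sum(s >= cutoff for s in scores_disease) / len(scores_disease)
        spec = sum(s < cutoff for s in scores_control) / len(scores_control)
        points.append((cutoff, sens, spec))
    return points


def best_cutoff(points, min_specificity):
    """Highest-sensitivity cutoff that meets a specificity floor, or None."""
    eligible = [p for p in points if p[2] >= min_specificity]
    if not eligible:
        return None
    return max(eligible, key=lambda p: (p[1], p[2]))
```

For example, `best_cutoff(points, 0.80)` would correspond to a symptomatic-style operating point and `best_cutoff(points, 0.975)` to an asymptomatic-style one, each evaluated on whatever scores are supplied.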

Example 7. Biomarkers for Diagnosing Cancer

Potential biomarkers for the general diagnosis of cancer were identified. Case and control samples from three different types of cancer (pancreatic cancer, lung cancer, and mesothelioma) were evaluated. Across the collection sites, the inclusion criteria were an age of at least 18 years and signed informed consent. For both cases and controls, known malignancies other than the cancer in question were excluded.

Pancreatic cancer. Case and control samples were obtained as described in Example 2.

Lung cancer. Case and control samples were obtained from three academic cancer center biorepositories and one commercial biorepository to identify potential markers for the differential diagnosis of non-small cell lung cancer (NSCLC) from a control group of high-risk smokers and individuals with benign pulmonary nodules. The study consisted of 978 samples collected from smokers and patients with benign nodules and from 320 individuals diagnosed with NSCLC.

Pleural mesothelioma. Case and control samples were obtained from an academic cancer center biorepository to identify potential markers for the differential diagnosis of malignant pleural mesothelioma from patients with a history of asbestos exposure or benign lung disease, including suspicious radiological findings that were later diagnosed as non-malignant. The study consisted of 30 samples collected from asbestos-exposed individuals and 41 samples collected from mesothelioma patients.

The final list of cancer biomarkers was identified by combining the sets of biomarkers considered for each of the three different cancer studies. Bayesian classifiers that used biomarker sets of increasing size were successfully constructed using a greedy algorithm (as described in greater detail in Section 7.2 of this Example). The sets (or panels) of biomarkers useful for diagnosing cancer in general across the different cancer types were compiled as a function of set (or panel) size and analyzed for their performance. This analysis resulted in the list of 10 cancer biomarkers shown in Table 19, each of which was present in at least one of these successive marker sets, which ranged in size from 3 to 10 markers. As an illustrative example, we describe the generation of a specific panel composed of 10 cancer biomarkers, as shown in Table 32.

7.1 Naive Bayes Classification of Cancer

From the list of biomarkers in Table 1, a panel of 10 potential cancer biomarkers was selected using a greedy algorithm for biomarker selection, as described in Section 7.2 of this Example. A distinct naive Bayes classifier was constructed for each of the three cancer types. The class-dependent probability density functions (pdfs), p(x_i|c) and p(x_i|d), where x_i is the log of the measured RFU value for biomarker i, and c and d refer to the control and disease populations, were modeled as log-normal distribution functions characterized by a mean μ and a variance σ². The parameters of the pdfs for the three models composed of the 10 potential biomarkers are listed in Table 31.

The naive Bayes classification for such a model is given by the following equation [equation not reproduced in this text], where P(d) is the prevalence of the disease in the population appropriate to the test, and n = 10. Each term in the summation is the log likelihood ratio for an individual marker, and the total log likelihood ratio of a sample being free from the disease of interest (i.e., in this case, each particular disease from the three different cancer types) versus having the disease is simply the sum of these individual terms plus a term that accounts for the prevalence of the disease. For simplicity, we assume P(d) = 0.5, so that the prevalence term vanishes [equation not reproduced in this text].

Given unknown sample measurements in log(RFU) for each of the 10 biomarkers of 10.1, 8.9, 8.8, 8.8, 9.1, 7.3, 8.2, 9.5, 6.7, and 7.7, the calculation of the classification is detailed in Table 32. The individual components comprising the log likelihood ratio for the disease versus the control class are tabulated and can be computed from the parameters in Table 31 and the measured values. The sum of the individual log likelihood ratios is -4.568, which corresponds to a likelihood of 96 of being free from the disease versus having the disease, since e^4.568 ≈ 96. Only one of the biomarker values has a likelihood more consistent with the disease group (log likelihood > 0), while the remaining nine biomarkers are all found to consistently favor the control group. Multiplying the likelihoods together gives the same result as shown above: a likelihood of 96 that the unknown sample is free from the disease. In fact, this sample came from the control population in the NSCLC training set.
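The calculation described in this section can be sketched as below. This is a minimal illustration under stated assumptions, not a reproduction of the patent's equation: each x is log(RFU) and is modeled as normally distributed within each class, and the ratio is oriented as disease over control so that a negative total favors the control class, matching the sign convention of the worked example. All function and parameter names are hypothetical, and no values from Table 31 are used.

```python
import math


def marker_log_lr(x, mu_d, sigma_d, mu_c, sigma_c):
    """ln[p(x|d) / p(x|c)] for one marker, where x = log(RFU) is assumed
    normal with mean mu and standard deviation sigma in each class."""
    return (math.log(sigma_c / sigma_d)
            - 0.5 * (((x - mu_d) / sigma_d) ** 2
                     - ((x - mu_c) / sigma_c) ** 2))


def total_log_lr(xs, params, prevalence=0.5):
    """Sum of per-marker terms plus the prior log odds ln[P(d) / (1 - P(d))].

    params: one (mu_d, sigma_d, mu_c, sigma_c) tuple per marker.
    With prevalence = 0.5 the prior term is zero.
    """
    total = sum(marker_log_lr(x, *p) for x, p in zip(xs, params))
    return total + math.log(prevalence / (1.0 - prevalence))


# A total of -4.568 corresponds to odds of about 96 to 1 in favor of control.
print(round(math.exp(4.568)))  # 96
```

Because the per-marker terms simply add, a single marker pointing toward disease can be outweighed by several markers pointing toward control, as in the worked example above.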

7.2 Greedy Algorithm for Selecting Cancer Biomarker Panels for Classifiers

Part 1

Subsets of the biomarkers in Table 1 were selected to construct potential classifiers that could be used to determine which markers can serve as general cancer biomarkers for detecting cancer.

Given a set of markers, a distinct model was trained for each of the three cancer studies, so a global measure of performance was required to select a set of biomarkers able to classify many different types of cancer simultaneously. The measure of classifier performance used here was the mean of the area under the ROC curve across all naive Bayes classifiers. The ROC curve is a plot of a single classifier's true positive rate (sensitivity) versus its false positive rate (1 - specificity). The area under the ROC curve (AUC) ranges from 0 to 1.0, where an AUC of 1.0 corresponds to perfect classification and an AUC of 0.5 corresponds to a random (coin toss) classifier. Other common measures of performance, such as the F-measure or the sum or product of sensitivity and specificity, can be applied. In particular, one might want to weight sensitivity and specificity differently, so as to select classifiers that achieve higher specificity at the expense of some sensitivity, or classifiers that achieve higher sensitivity at the expense of specificity. We chose to use the AUC because it encompasses all combinations of sensitivity and specificity in a single measure. Different applications will have different benefits for true positive and true negative findings, and will have different costs associated with false positive and false negative findings. Changing the performance measure may change the exact subset of markers selected for a given set of data.

For the Bayesian approach to discriminating cancer samples from control samples described in Section 7.1 of this Example, the classifier was completely parameterized by the distributions of the biomarkers in each of the three cancer studies, and the list of biomarkers was chosen from Table 19. That is, given a set of training data, the subset of markers chosen for inclusion determined the classifier in a one-to-one manner.

The greedy method employed here was used to search for the optimal subset of markers from Table 1. For small numbers of markers, or for classifiers with relatively few markers, every possible subset of markers was enumerated and evaluated in terms of the performance of the classifier constructed with that particular set of markers (see Example 4, Part 2). (This approach is well known in the field of statistics as "best subset selection"; see, e.g., Hastie et al.) However, for the classifiers described herein, the number of combinations of multiple markers can be very large, and it was not feasible to evaluate every possible set of 10 markers, since 30,045,015 possible combinations can be generated from a list of only 30 total analytes. Because searching through every subset of markers is impractical, the single optimal subset may not be found; however, by using this approach, many excellent subsets were found, and in many cases any of these subsets may represent an optimal one.

Instead of evaluating every possible set of markers, a "greedy" forward stepwise approach may be followed (see, e.g., Dabney AR, Storey JD (2007) Optimality Driven Nearest Centroid Classification from Genomic Data. PLoS ONE 2(10): e1002. doi:10.1371/journal.pone.0001002). Using this method, a classifier is started with the best single marker (based on the KS-distance of the individual markers) and is grown at each step by trying, in turn, each member of the marker list that is not currently in the classifier's set of markers. The one marker that scores best in combination with the existing classifier is added to the classifier. This is repeated until no further improvement in performance is achieved. Unfortunately, this approach may miss valuable combinations of markers in which some of the individual markers are not all chosen before the process stops.

The greedy procedure used here is an elaboration of the preceding forward stepwise approach: to broaden the search, rather than keeping just a single marker subset at each step, a list of candidate marker sets is kept. The list is seeded with a list of single markers. The list is expanded in steps by deriving new marker subsets from the ones currently on the list and adding them to the list. Each marker subset currently on the list is extended by adding any marker from Table 1 that is not already part of that classifier and whose addition to the subset would not duplicate an existing subset (these are termed "permissible markers"). Each time a new set of markers is defined, a set of classifiers, one for each cancer study, is trained using these markers, and the global performance is measured as the mean AUC across all three studies. To avoid potential overfitting, the AUC for each cancer study model is calculated by a 10-fold cross-validation procedure. Every existing marker subset is extended by every permissible marker from the list. Clearly, such a process would eventually generate every possible subset, and the list would run out of space. Therefore, all of the generated marker sets are kept only while the list is smaller than some predetermined size. Once the list reaches the predetermined size limit, it becomes elitist; that is, only those classifier sets that show a certain level of performance are kept on the list, and the others fall off the end of the list and are lost. This is achieved by keeping the list sorted in order of classifier set performance; new marker sets whose classifiers are globally at least as good as the worst set of classifiers currently on the list are inserted, forcing the expulsion of the current bottom underachieving classifier sets. One further implementation detail is that the list is completely replaced at each generational step; therefore, every marker set on the list has the same number of markers, and at each step the number of markers per classifier grows by one.
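The list-based search described above might be sketched as follows. Several things here are assumptions made for illustration: `score` is a placeholder for the mean cross-validated AUC across the three studies, `list_limit` stands for the predetermined list size, and the sketch generates a full generation and then truncates it, which yields the same surviving sets as inserting into a sorted list and evicting the worst entries. The scores in `TOY_AUC` are invented.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List, Tuple

Panel = FrozenSet[str]


def keep_best(scored: Dict[Panel, float], list_limit: int) -> Dict[Panel, float]:
    """Keep the list sorted by performance and drop whatever falls off the end."""
    ranked = sorted(scored.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
    return dict(ranked[:list_limit])


def greedy_panel_search(
    markers: Iterable[str],
    score: Callable[[Panel], float],
    panel_size: int,
    list_limit: int,
) -> List[Tuple[Panel, float]]:
    """Return the surviving panels of `panel_size` markers, best first."""
    markers = list(markers)
    # Seed the list with single markers.
    seeds = {frozenset([m]): score(frozenset([m])) for m in markers}
    current = keep_best(seeds, list_limit)
    for _ in range(panel_size - 1):
        next_generation: Dict[Panel, float] = {}
        for subset in current:
            for marker in markers:
                if marker in subset:
                    continue  # already part of this classifier
                candidate = subset | {marker}
                if candidate not in next_generation:  # never duplicate a subset
                    next_generation[candidate] = score(candidate)
        # The list is completely replaced at each generational step.
        current = keep_best(next_generation, list_limit)
    return list(current.items())


# Hypothetical scores (not from the source).
TOY_AUC: Dict[Panel, float] = {
    frozenset("A"): 0.70,
    frozenset("B"): 0.60,
    frozenset("C"): 0.60,
    frozenset("AB"): 0.72,
    frozenset("AC"): 0.72,
    frozenset("BC"): 0.95,
}


def toy_score(panel: Panel) -> float:
    return TOY_AUC.get(panel, 0.5)


if __name__ == "__main__":
    print(greedy_panel_search(["A", "B", "C"], toy_score, panel_size=2, list_limit=3))
    print(greedy_panel_search(["A", "B", "C"], toy_score, panel_size=2, list_limit=1))
```

With a list limit of 1 the search reduces to ordinary forward stepwise selection and ends on the pair A and B. With a limit of 3 the weaker single markers survive the first generation, so the pair B and C is found.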

In one embodiment, the set (or panel) of biomarkers useful for constructing classifiers for diagnosing general cancer versus non-cancer is based on the mean AUC for the particular combination of biomarkers used in the classification scheme. We identified many combinations of biomarkers, derived from the markers in Table 19, that were able to effectively classify different cancer samples from controls. Representative panels are set forth in Tables 22-29, which show a series of 100 different panels of 3-10 biomarkers, with the mean cross-validation (CV) AUC indicated for each panel. The total number of occurrences of each marker in each of these panels is indicated at the bottom of each table.

The biomarkers selected in Table 19 gave rise to classifiers that perform better than classifiers built with "non-markers." In Figure 17, we show the performance of our ten-biomarker classifiers compared with the performance of other possible classifiers.

Figure 17A shows the distribution of mean AUCs for classifiers built from randomly sampled sets of ten "non-markers" drawn from the entire set of analytes present in all three studies, excluding the ten markers in Table 19. The performance of the ten potential cancer biomarkers is displayed as a vertical dashed line. This plot clearly shows that the performance of the ten potential biomarkers is well beyond the distribution of other marker combinations.

Figure 17B shows a distribution similar to that of Figure 17A; however, the randomly sampled sets were restricted to the 55 biomarkers from Table 1 that were not selected by the greedy biomarker selection procedure for ten-analyte classifiers. This plot demonstrates that the ten markers chosen by the greedy algorithm represent a subset of biomarkers that generalizes to other types of cancer far better than classifiers built with the remaining 55 biomarkers.
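The comparison behind Figures 17A and 17B can be illustrated with a small sketch: draw many random panels from a pool of unselected analytes, score each one, and see where the selected panel falls relative to that distribution. The function names, the pool of 55 placeholder analyte names, the toy scoring rule, and the observed value of 0.90 are all hypothetical; in practice `score` would be the mean cross-validated AUC of classifiers trained on the sampled panel.

```python
import random
from typing import Callable, FrozenSet, Iterable, List


def random_panel_scores(
    pool: Iterable[str],
    panel_size: int,
    n_draws: int,
    score: Callable[[FrozenSet[str]], float],
    seed: int = 0,
) -> List[float]:
    """Score `n_draws` panels sampled without replacement from `pool`."""
    rng = random.Random(seed)
    candidates = sorted(pool)  # a sequence is required by Random.sample
    return [
        score(frozenset(rng.sample(candidates, panel_size)))
        for _ in range(n_draws)
    ]


def fraction_at_least(null_scores: List[float], observed: float) -> float:
    """Fraction of random panels that perform at least as well as `observed`."""
    return sum(1 for s in null_scores if s >= observed) / len(null_scores)


# Hypothetical pool of 55 unselected analytes and a toy scoring rule.
POOL = [f"N{i}" for i in range(55)]
WEAKLY_INFORMATIVE = {"N0", "N1", "N2"}


def toy_score(panel: FrozenSet[str]) -> float:
    return 0.5 + 0.01 * len(panel & WEAKLY_INFORMATIVE)


if __name__ == "__main__":
    null = random_panel_scores(POOL, panel_size=10, n_draws=200, score=toy_score)
    print(max(null), fraction_at_least(null, observed=0.90))
```

A selected panel whose score lies beyond every random draw corresponds to the vertical dashed line sitting to the right of the histogram in the figures.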

Finally, Figure 18 shows the classifier ROC curve for each of the three cancer study classifiers.

The foregoing embodiments and examples are intended only as examples. No particular embodiment, example, or element of a particular embodiment or example is to be construed as a critical, required, or essential element or feature of any of the claims. Further, no element described herein is required for the practice of the appended claims unless expressly described as "essential" or "critical." Various alterations, modifications, substitutions, and other variations can be made to the disclosed embodiments without departing from the scope of the present application, which is defined by the appended claims. The specification, including the figures and examples, is to be regarded as illustrative rather than restrictive, and all such modifications and substitutions are intended to be included within the scope of the application. Accordingly, the scope of the application should be determined by the appended claims and their legal equivalents, rather than by the examples given above. For example, the steps recited in any of the method or process claims may be executed in any feasible order and are not limited to the order presented in any of the embodiments, the examples, or the claims. Further, in any of the aforementioned methods, one or more biomarkers of Table 1 or Table 19 can be specifically excluded, either as an individual biomarker or as a biomarker from any panel.

Table 1: Cancer biomarkers

Table 2: Panels of one biomarker

Table 3: Panels of two biomarkers

Table 4: Panels of three biomarkers

Table 5: Panels of four biomarkers

Table 6: Panels of five biomarkers

Table 7: Panels of six biomarkers

Table 8: Panels of seven biomarkers

Table 9: Panels of eight biomarkers

Table 10: Panels of nine biomarkers

Table 11: Panels of ten biomarkers

Table 12: Counts of markers in biomarker panels

Biomarker  3    4    5    6    7    8    9    10
NRP1       44   20   12   5    2    2    2    1
PLAT       48   54   92   123  143  145  165  177
SERPINA5   5    0    0    0    0    0    0    0
SERPINF2   5    3    3    0    0    0    0    0
SGTA       3    2    1    0    0    0    0    0
TFPI       100  60   51   46   57   70   91   111
THBS2      4    0    0    0    0    0    0    0
THBS4      66   110  146  193  243  276  334  354
TIMP1      22   2    0    0    0    0    0    0
TNFRSF18   8    9    3    0    1    3    2    2
TNFRSF1B   20   12   8    6    4    1    0    0
TOP1       6    3    0    0    0    0    0    0
VEGFA      16   33   47   51   75   142  268  455
VEGFC      5    4    2    1    0    0    0    0

Table 13: Analytes in the ten-marker classifier

CTSB, C5a, ETHE1, CSF1R, CCL18, C5, KLK7, VEGFA, KIT, THBS4, LTF

Table 14: Parameters derived from the training set for the naive Bayes classifier

Biomarker
CSF1R          10.712  10.995  0.398   0.399
CTSB           8.836   9.398   0.287   0.621
IL1RL1         9.702   10.189  0.533   0.780
GDF11          8.889   8.578   0.291   0.379
ETHE1          7.373   7.443   0.119   0.121
CCL23          8.795   8.975   0.312   0.329
FGFR3          6.992   7.166   0.178   0.225
KIT            9.770   9.623   0.287   0.318
FSTL3          8.787   9.029   0.290   0.374
THBS2          7.481   7.922   0.270   0.633
SERPINF2       9.264   9.175   0.115   0.162
TNFRSF1B       10.748  11.028  0.380   0.452
TNFRSF18       12.308  12.279  0.139   0.168
BMP6           7.958   8.138   0.142   0.239
GFRA1          7.324   7.465   0.182   0.200
CRP            11.965  12.304  0.735   0.233
SERPINA5       10.309  10.101  0.3000  0.419
KLKB1          11.802  11.666  0.159   0.211
APOE           8.081   8.314   0.406   0.656
SFRP1          7.096   7.219   0.221   0.309
C2             11.506  11.611  0.100   0.132
CKM            7.313   7.192   0.154   0.116
TFPI           10.179  10.490  0.261   0.352
INSR           8.480   8.633   0.224   0.255
NID2           8.595   8.806   0.213   0.384
HAMP           10.424  11.079  0.788   0.617
MDK            8.034   8.495   0.570   0.578
CDK5-CDK5R1    6.937   6.994   0.108   0.111
NID1           9.771   9.941   0.213   0.357
VEGFC          7.454   7.540   0.118   0.126
C9             11.911  12.076  0.234   0.233
LTF            10.120  9.870   0.442   0.419
IL12A-IL12B    7.311   7.273   0.052   0.057
C5             9.485   9.603   0.119   0.143
IL18R1         7.643   7.845   0.186   0.475
CCL18          11.320  11.616  0.477   0.398
VEGFA          8.532   8.601   0.170   0.134
IDUA           8.428   8.694   0.366   0.558
TOP1           6.892   6.842   0.088   0.091
C5-C6          6.506   6.593   0.133   0.144
TIMP1          9.815   10.148  0.264   0.430
C5a            11.354  11.606  0.254   0.246
THBS4          10.013  0.794   0.359   0.400
ENTPD1         7.225   7.299   0.110   0.103
LBP            9.102   9.489   0.439   0.548
KLK3-SERPINA3  9.0134  9.287   01.353  0.422
MCM2           7.794   7.975   0.226   0.359
SGTA           5.920   5.883   0.060   0.079
ESM1           9.715   9.919   0.330   0.476
PLAT           8.517   8.838   0.461   0.502
KLK7           8.322   7.989   0.321   0.391
CCL23          7.909   8.097   0.227   0.267
ACP5           10.198  10.436  0.292   0.343
NRP1           8.832   9.047   0.243   0.256
MMP7           9.084   9.574   0.437   0.706
ACY1           9.898   10.411  0.628   0.919
ALPL           10.577  10.290  0.377   0.417
IL11RA         7.312   7.213   0.110   0.107
APOA1          9.701   9.480   0.171   0.295
CKB-CKM        7.506   7.025   0.653   0.479
KLK8           7.361   7.421   0.100   0.178
AHSG           11.914  11.826  0.133   0.167
HINT1          5.835   5.793   0.086   0.104
MRC1           9.628   9.995   0.370   0.490
FCGR3B         10.920  11.145  0.255   0.269
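Per-marker parameters of this kind are what a Gaussian naive Bayes classifier needs in order to score a sample. The sketch below assumes that each marker has a mean and standard deviation for the control class and for the disease class, in the order (control mean, disease mean, control SD, disease SD). That ordering is an assumption, because the column headers of the table are not shown here. The marker names `M1` and `M2` and their values are hypothetical and are not taken from the table.

```python
import math
from typing import Dict, Tuple

# (control mean, disease mean, control SD, disease SD): assumed ordering.
Params = Tuple[float, float, float, float]


def log_gaussian_pdf(x: float, mean: float, sd: float) -> float:
    """Natural log of the normal density with the given mean and SD."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - (x - mean) ** 2 / (2.0 * sd * sd)


def log_likelihood_ratio(sample: Dict[str, float], params: Dict[str, Params]) -> float:
    """Sum over markers of ln p(x | disease) - ln p(x | control).

    Markers are treated as independent, which is the naive Bayes assumption.
    A positive total favors the disease class; a negative total favors control.
    """
    total = 0.0
    for marker, x in sample.items():
        mean_c, mean_d, sd_c, sd_d = params[marker]
        total += log_gaussian_pdf(x, mean_d, sd_d) - log_gaussian_pdf(x, mean_c, sd_c)
    return total


# Hypothetical parameters for two made-up markers.
TOY_PARAMS: Dict[str, Params] = {
    "M1": (10.0, 11.0, 0.5, 0.5),
    "M2": (8.0, 7.5, 0.3, 0.3),
}

if __name__ == "__main__":
    print(log_likelihood_ratio({"M1": 11.0, "M2": 7.5}, TOY_PARAMS))
    print(log_likelihood_ratio({"M1": 10.0, "M2": 8.0}, TOY_PARAMS))
```

A prior log-odds term can be added to the total if the class prevalences are not taken to be equal.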

Table 15: AUCs for exemplary combinations of biomarkers

#   Biomarkers                                                    AUC
1   CTSB                                                          0.791
2   CTSB, C5a                                                     0.853
3   CTSB, C5a, C5                                                 0.880
4   CTSB, C5a, C5, CCL18                                          0.890
5   CTSB, C5a, C5, CCL18, CSF1R                                   0.895
6   CTSB, C5a, C5, CCL18, CSF1R, KLK7                             0.895
7   CTSB, C5a, C5, CCL18, CSF1R, KLK7, ETHE1                      0.906
8   CTSB, C5a, C5, CCL18, CSF1R, KLK7, ETHE1, C5-C6               0.902
9   CTSB, C5a, C5, CCL18, CSF1R, KLK7, ETHE1, C5-C6, KLK8         0.903
10  CTSB, C5a, C5, CCL18, CSF1R, KLK7, ETHE1, C5-C6, KLK8, VEGFA  0.913

Table 16: Calculations derived from the training set for the naive Bayes classifier

Table 17: Clinical characteristics of the training set

Table 18: Proteins in the ten-biomarker classifier

Table 19: Biomarkers of general cancer

ACY1, APOA1, C5, CCL23, CKB-CKM, CKM, ENTPD1, GDF11, HAMP, HINT1, KIT, KLK3-SERPINA3, LBP, SERPINF2, THBS2, TIMP1, C9, FSTL3, IL12A-IL12B, CDK5-CDK5R1, CCL23

Table 20: Panels of one biomarker

Table 21: Panels of two biomarkers

Table 22: Panels of three biomarkers

Table 23: Panels of four biomarkers

Table 24: Panels of five biomarkers

Table 25: Panels of six biomarkers

Table 26: Panels of seven biomarkers

Table 27: Panels of eight biomarkers

Table 28: Panels of nine biomarkers

Table 29: Panels of ten biomarkers

Table 30: Counts of markers in biomarker panels

Table 31: Parameters derived from the cancer data sets for the naive Bayes classifier

Table 32: Calculations derived from the training set for the naive Bayes classifier

Claims (32)

1. Use of a capture agent in the preparation of a kit for diagnosing pancreatic cancer in an individual, or for determining the likelihood of pancreatic cancer in an individual, by a method comprising:
providing a biomarker panel comprising at least three biomarker proteins listed in Table 1, wherein two of the at least three biomarker proteins are a biomarker combination selected from CTSB and THBS4, CTSB and C5, and CTSB and C2; and
measuring, in a biological sample from the individual, biomarker levels corresponding to the at least three biomarker proteins in the panel, wherein the biomarker levels provide an indication of the likelihood that the individual has or does not have pancreatic cancer,
wherein the capture agent is specific for the at least three biomarker proteins.

2. The use of claim 1, wherein the diagnosing comprises the differential diagnosis of pancreatic cancer from a benign disease condition.

3. The use of claim 2, wherein the benign disease condition is selected from pancreatitis and gastrointestinal disorders.

4. The use of claim 1, wherein the individual has an abdominal mass.

5. Use of a capture agent in the preparation of a kit for screening asymptomatic individuals at high risk of pancreatic cancer by a method comprising:
providing a biomarker panel comprising at least three biomarker proteins listed in Table 1, wherein two of the at least three biomarker proteins are a biomarker combination selected from CTSB and THBS4, CTSB and C5, and CTSB and C2; and
measuring, in a biological sample from the individual, biomarker levels corresponding to the at least three biomarker proteins in the panel, wherein the biomarker protein levels provide an indication of the likelihood that the individual has or does not have pancreatic cancer,
wherein the capture agent is specific for the at least three biomarker proteins.

6. The use of claim 1 or 5, wherein measuring the biomarker protein levels comprises performing an in vitro assay.

7. The use of claim 6, wherein the in vitro assay comprises at least one capture reagent corresponding to each of the biomarkers, and further comprises selecting the at least one capture reagent from aptamers and antibodies.

8. The use of claim 6, wherein the in vitro assay is selected from an immunoassay, an aptamer-based assay, and a histological or cytological assay.

9. The use of claim 1 or 5, wherein the biological sample is selected from whole blood, plasma, serum, and pancreatic juice.

10. The use of claim 9, wherein the biological sample is plasma.

11. The use of any one of claims 1-5, wherein the biological sample is pancreatic tissue, and wherein the biomarker values are derived from a histological or cytological analysis of the pancreatic tissue.

12. An apparatus for indicating the likelihood that an individual has pancreatic cancer, the apparatus comprising:
means for providing a biomarker panel comprising at least three biomarker proteins listed in Table 1, wherein two of the at least three biomarker proteins are a biomarker combination selected from CTSB and THBS4, CTSB and C5, and CTSB and C2;
means for retrieving, on a computer, biomarker information for the individual, wherein the biomarker information comprises biomarker values that each correspond to one of the at least three biomarker proteins, and wherein the biomarker values are obtained by measuring the levels of the at least three biomarker proteins in the panel;
means for classifying each of the biomarker values with the computer; and
means for indicating the likelihood that the individual has pancreatic cancer based on a plurality of classifications.

13. The apparatus of claim 12, wherein indicating the likelihood that the individual has pancreatic cancer comprises displaying the likelihood on a computer display.

14. An apparatus for indicating the likelihood that an individual has pancreatic cancer, the apparatus comprising:
a computer-readable medium containing program code executable by a processor of a computing device or system, the program code comprising:
a module that retrieves data attributed to a biological sample from the individual, wherein the data comprise the levels of at least three biomarker proteins selected from Table 1, wherein two of the at least three biomarker proteins are a biomarker combination selected from CTSB and THBS4, CTSB and C5, and CTSB and C2, and wherein the levels of the biomarker proteins are measured in the biological sample; and
a module that executes a classification method that indicates a pancreatic disease status of the individual as a function of the levels of the at least three biomarker proteins.

15. The apparatus of claim 14, wherein the classification method uses a probability density function.

16. The apparatus of claim 15, wherein the classification method uses two or more classes.

17. The use of any one of claims 1-5, wherein the levels of at least four biomarker proteins are measured.

18. The use of any one of claims 1-5, wherein the levels of at least five biomarker proteins are measured.

19. The use of any one of claims 1-5, wherein the levels of at least six biomarker proteins are measured.

20. The use of any one of claims 1-5, wherein the levels of at least seven biomarker proteins are measured.

21. The use of any one of claims 1-5, wherein the levels of at least eight biomarker proteins are measured.

22. The use of any one of claims 1-5, wherein the levels of at least nine biomarker proteins are measured.

23. The use of any one of claims 1-5, wherein the individual is at high risk of pancreatic cancer due to smoking, alcohol consumption, or a family history of pancreatic cancer.

24. The use of any one of claims 1-5, wherein the biomarkers are selected from Table 18.

25. The use of any one of claims 1-5, further comprising the biomarker CA19-9.

26. The use of any one of claims 1-5, wherein:
(i) when the biomarker panel comprises the combination of biomarkers CTSB and THBS4, the biomarker panel further comprises C5, GDF11, CCL23, KLK3.SERPINA3, MMP7, C2, and/or CRP;
(ii) when the biomarker panel comprises the combination of biomarkers CTSB and C5, the biomarker panel further comprises THBS4, GDF11, CCL23, KLK3.SERPINA3, MMP7, C2, and/or CRP; or
(iii) when the biomarker panel comprises the combination of biomarkers CTSB and C2, the biomarker panel further comprises C5, THBS4, GDF11, CCL23, KLK3.SERPINA3, MMP7, and/or CRP.

27. The use of any one of claims 1-5, wherein the individual is a human.

28. The use of any one of claims 1-5, wherein the individual is classified as having or not having pancreatic cancer, or the likelihood that the individual has pancreatic cancer is determined, based on the biomarker levels and at least one item of additional biomedical information corresponding to the individual.

29. The use of claim 28, wherein the at least one item of additional biomedical information is independently selected from:
(a) information corresponding to a change in the weight of the individual,
(b) information corresponding to the ethnicity of the individual,
(c) information corresponding to the gender of the individual,
(d) information corresponding to the alcohol use history of the individual,
(e) information corresponding to the occupational history of the individual,
(f) information corresponding to the family history of pancreatic cancer or other cancer of the individual,
(g) information corresponding to clinical symptoms of the individual,
(h) information corresponding to other laboratory tests, and
(i) information corresponding to the exposure of the individual to known carcinogens.

30. The use of claim 28, wherein the at least one item of additional biomedical information is independently selected from:
- information corresponding to the presence or absence of a pancreatic mass or other abdominal mass,
- information corresponding to the presence or absence in the individual of at least one genetic marker correlating with a higher risk of cancer in the individual or a family member of the individual, and
- information corresponding to gene expression values of the individual.

31. The use of claim 28, wherein the at least one item of additional biomedical information is selected from information corresponding to the presence or absence in the individual of at least one genetic marker correlating with a higher risk of pancreatic cancer in the individual or a family member of the individual.

32. The use of claim 28, wherein the at least one item of additional biomedical information is selected from information corresponding to the smoking history of the individual.

HK17102552.7A 2010-08-13 2017-03-13 Pancreatic cancer biomarkers and uses thereof HK1229003B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US61/373,687 2010-08-13
US61/418,689 2010-12-01
US61/482,480 2011-05-04
US61/482,347 2011-05-04

Publications (2)

Publication Number Publication Date
HK1229003A1 HK1229003A1 (en) 2017-11-10
HK1229003B HK1229003B (en) 2019-10-25
