WO2004097030A2 - Prognostic breast cancer biomarkers - Google Patents

Publication number
WO2004097030A2
Authority
WO
WIPO (PCT)
Prior art keywords
breast cancer
patients
biological sample
mammal
genes
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2004/013076
Other languages
French (fr)
Other versions
WO2004097030A3 (en)
Inventor
Jonas Bergh
Yudi Pawitan
Per Hall
Lukas C. Amler
Xia Han
Fei Huang
Peter Shaw
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Karolinska Innovations AB
Bristol Myers Squibb Co
Original Assignee
Karolinska Innovations AB
Bristol Myers Squibb Co
Application filed by Karolinska Innovations AB, Bristol Myers Squibb Co
Publication of WO2004097030A2
Publication of WO2004097030A3

Classifications

    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q - MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q - MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 - Oligonucleotides characterized by their use
    • C12Q2600/106 - Pharmacogenomics, i.e. genetic variability in individual responses to drugs and drug metabolism
    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q - MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 - Oligonucleotides characterized by their use
    • C12Q2600/118 - Prognosis of disease development

Definitions

  • the patient cohort differs from that of most other studies of breast cancer prognosis using microarray gene expression data, since a population-based cohort of patients from a predefined geographical area was used. Patients with both primary lymph node negative and positive disease were included, and the patients were not restricted to premenopausal or postmenopausal women.
  • the breast cancer material was probably genetically very homogeneous, since it was derived from patients with a very similar Caucasian genetic background. This type of information has not been presented in other studies.
  • cyclin dependent kinase inhibitor 1C is likely to be a tumour suppressor gene that regulates cell proliferation.
  • tumours analysed came from different patient cohorts, which may result in different genes being identified in optimal prognostic gene sets.
  • the population-based derivation and selection criteria are provided herein, but not in the Dutch papers [15,16].
  • different gene expression platforms were used in the two studies, likely resulting in both different initial gene sets being quantified and examined and different relative quantification values for a given gene.
  • different methodologies may have been used in tumour archiving and RNA preparation.
  • different statistical and filtering approaches were used to obtain a subset of genes that make up the best prognostic gene sets.
  • the bad prognosis indicator had an odds ratio of 3.25 (95% CI 0.73 to 14.28), which was in the expected direction but not significant because of the small sample size.
  • the present population-derived patient material is likely to be more representative than at least some of the previously published reports.
  • the microarray expression data were analysed with statistical strategies aimed at minimizing the risk for overfitting the model.
  • the full leave-one-out cross validation procedure was a key analytical tool to provide unbiased estimates of error rates and unbiased prognostic scores for further multivariate analysis.
  • the cross-validated error rate of 31% was comparable with the rate found by van't Veer et al., who reported a 41% error rate in the good prognosis group [15].
  • the performance of the class prediction in the testing set was as expected (25% error rate) for the bad prognosis group, but in the good prognosis group the error rate was higher (47%). However, this figure was based on only 17 cases, so there was a large sampling variability.
  • FIG. 1 shows that of 280 patient tumours available, 159 were used for further analyses.
  • tamoxifen and/or goserelin were normally used for hormonal treatment, while cyclophosphamide, methotrexate, and 5-fluorouracil (CMF), mostly given intravenously on days 1 and 8, was used as adjuvant chemotherapy, except for high-risk patients, who were offered inclusion in the SBG 9401 study.
  • CMF cyclophosphamide, methotrexate, and 5-fluorouracil
  • RNA preparation:
  • RNA extraction was performed according to the RNeasy mini protocol (Qiagen, Germany). In brief, a portion of the deep frozen tumour was cut into minute pieces and transferred into test tubes (maximum 40 mg/tube) with RLT buffer (RNeasy lysis buffer), followed by homogenization for around 30-40 seconds. Proteinase K was then added and the samples were treated for 10 minutes at 55 °C. This step was introduced during the project, because most initial preparations without it resulted in poor RNA yield, poor RNA quality, or both. Total RNA was then isolated using Qiagen's microspin technology. DNase treatment was also added to some samples to further increase the RNA quality.
  • RNA quality was assessed by measuring the 28S:18S ribosomal RNA ratio using an Agilent 2100 bioanalyzer (Agilent Technologies, Rockville, Maryland, USA). All samples with RNA of high quality were then stored at -70 °C until microarray analysis.
  • Microarray profiling:
  • IVT in vitro transcription
  • oligonucleotide array hybridization and scanning were performed according to the Affymetrix protocol (Santa Clara, California, USA). In brief, the amount of starting total RNA for each probe preparation varied between 2 and 5 μg.
  • First-strand cDNA was generated by using a T7-linked oligo-dT primer, followed by second strand synthesis. IVT reactions were performed in batches to generate biotinylated cRNA targets, which were subsequently chemically fragmented at 95 °C for 35 min.
  • the scanned images were inspected for the presence of obvious defects (artifacts or scratches) on the array.
  • the raw expression data were scaled using Affymetrix® Microarray Suite 5.0 software.
  • the trimmed mean signal of 100 selected housekeeping genes on the HG-U133 A and B chips was adjusted to a user-specified target signal value for each array, so that a scale factor was derived for each array and used to scale and standardize the overall signal of that array.
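The scaling step described above can be sketched in a few lines. This is a minimal illustration rather than the Microarray Suite 5.0 implementation: the 2% trim fraction, the function names, and the target value used in any call are assumptions, since the text only states that a trimmed mean of the housekeeping-gene signals was adjusted to a user-specified target.

```python
def trimmed_mean(values, trim_fraction=0.02):
    """Mean after dropping the lowest and highest trim_fraction of values.

    The 2% default is an assumption for illustration; the source does not
    state which trim fraction was used.
    """
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def scale_factor(housekeeping_signals, target_signal, trim_fraction=0.02):
    """Factor that moves the trimmed mean of the housekeeping signals onto the target."""
    return target_signal / trimmed_mean(housekeeping_signals, trim_fraction)


def scale_array(signals, factor):
    """Apply one array's scale factor to every signal on that array."""
    return [s * factor for s in signals]
```

Each array gets its own factor, so arrays hybridized with different overall brightness end up on a common scale before any gene-level comparison.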
  • Tumour samples that generated expression data meeting any of the following criteria were either re-processed or excluded from further analysis: (a) a scaling factor > 4, (b) "Present" calls < 30%, and (c) an R-squared value of the Pearson product-moment correlation coefficient, for the expression data on one array compared with all other arrays, of < 0.6. In case of visible microarray artifacts, the sample was rehybridized and rescanned on new chips using the same fragmented probe.
  • Data Analysis:
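The array quality criteria listed for tumour samples can be expressed as a simple filter. The sketch below is hypothetical: the record type, field names, and the reading of the Present-call and R-squared criteria as lower bounds (below 30% and below 0.6) are assumptions, and the source does not say how the correlation against all other arrays was summarised into one value.

```python
from dataclasses import dataclass


@dataclass
class ArrayQC:
    """Per-array quality metrics (hypothetical record; field names are illustrative)."""
    sample_id: str
    scaling_factor: float
    present_call_pct: float
    r_squared_vs_others: float  # assumed to be a single summary value per array


def qc_failures(qc, max_scaling=4.0, min_present_pct=30.0, min_r_squared=0.6):
    """Return the list of criteria that an array fails; an empty list means it passes."""
    reasons = []
    if qc.scaling_factor > max_scaling:
        reasons.append("scaling factor above limit")
    if qc.present_call_pct < min_present_pct:
        reasons.append("too few Present calls")
    if qc.r_squared_vs_others < min_r_squared:
        reasons.append("low correlation with other arrays")
    return reasons
```

An array with any entry in the returned list would be re-processed or excluded, matching the "any of the following criteria" wording.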
  • the primary statistical analysis was based on the comparison between bad versus good prognosis groups, where occurrence of distant relapse or death from any cause by five years was defined as bad prognosis.
  • a secondary analysis was performed limiting the definition of bad prognosis to distant relapse and death due to breast cancer.
  • the expression data from 134 patients was used as a training set, and additional expression data from 25 patients was used as a testing set.
  • An optimal set of predictors was chosen using a leave-one-out cross validation procedure performed on the training set. Briefly, this was done as follows: (i) Remove one patient from the training set;
  • Class prediction using k genes was done using a diagonal linear discriminant analysis method [19], which is a variant of the standard maximum likelihood discrimination rule.
  • $x$ is a vector of the (log-) gene expression values from a tumour to be classified
  • $x_g$ is the expression value of gene $g$
  • $m_{1g}$ and $m_{0g}$ are the means of the bad and good prognosis groups from the training set
  • $v_g$ is the variance
  • $a_g = (m_{1g} - m_{0g}) / v_g$
  • $b_g = (m_{1g} + m_{0g}) / 2$
  • A tumour with $S > 0$ was assigned to the bad prognosis group, and otherwise to the good prognosis group.
  • $S$ was referred to as the bad prognostic score.
  • This class prediction method is in fact similar to the signal-to-noise method and weighted voting algorithm in Golub et al.
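The per-gene coefficients defined above suggest a score of the usual diagonal linear discriminant form, S = sum over genes of a_g (x_g - b_g). That summation is not spelled out in the extracted text, so it is an assumption here, as is the use of a pooled within-group variance for v_g. The function names are illustrative only.

```python
from statistics import mean


def fit_dlda(bad_tumours, good_tumours):
    """Compute (a_g, b_g) per gene from training expression vectors.

    Each argument is a list of log-expression vectors with genes in the same order.
    v_g is taken as the pooled within-group variance (an assumption).
    """
    coeffs = []
    for g in range(len(bad_tumours[0])):
        bad = [t[g] for t in bad_tumours]
        good = [t[g] for t in good_tumours]
        m1, m0 = mean(bad), mean(good)
        ss = sum((v - m1) ** 2 for v in bad) + sum((v - m0) ** 2 for v in good)
        v_g = ss / (len(bad) + len(good) - 2)
        coeffs.append(((m1 - m0) / v_g, (m1 + m0) / 2))
    return coeffs


def bad_prognosis_score(x, coeffs):
    """Assumed score form: S = sum_g a_g * (x_g - b_g)."""
    return sum(a * (x_g - b) for x_g, (a, b) in zip(x, coeffs))


def classify(x, coeffs):
    """S > 0 maps to the bad prognosis group, otherwise good."""
    return "bad" if bad_prognosis_score(x, coeffs) > 0 else "good"
```

Because each gene contributes independently, a gene with a large mean difference and small variance dominates the score, which is the sense in which the method resembles a weighted vote.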
  • the bad prognosis score S (high-low, with 'high' defined as S>0) was then included in the multivariate logistic regression analysis of the five-year status to see if it had an additional predictive value over the standard clinical variables.
  • the scores for patients in the training set were computed from the leave-one-out procedure, i.e., the score for a patient was computed by first removing the patient prior to computing the coefficients $a_g$ and $b_g$ from the optimal set of genes.
  • the scores for patients in the testing set were computed using the full training set to compute the class predictor. Hence, these scores were unbiased prognostic scores.
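The leave-one-out idea behind these unbiased scores can be sketched generically: each patient is predicted by a model fitted without that patient. The source lists only the first step of its procedure, so the remaining steps, including any gene selection inside the loop, are not reproduced here. The midpoint classifier below is a toy stand-in chosen to keep the example short, not the classifier used in the study.

```python
def leave_one_out_predictions(samples, labels, fit, predict):
    """Predict each sample with a model trained on all the other samples."""
    predictions = []
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        predictions.append(predict(model, samples[i]))
    return predictions


def error_rate(predictions, labels):
    """Fraction of samples whose held-out prediction disagrees with the label."""
    wrong = sum(p != y for p, y in zip(predictions, labels))
    return wrong / len(labels)


def fit_midpoint(xs, ys):
    """Toy one-feature model: threshold halfway between the two group means."""
    bad = [x for x, y in zip(xs, ys) if y == "bad"]
    good = [x for x, y in zip(xs, ys) if y == "good"]
    return (sum(bad) / len(bad) + sum(good) / len(good)) / 2


def predict_midpoint(threshold, x):
    return "bad" if x > threshold else "good"
```

Any step that looks at the outcome labels, such as choosing which genes to use, would have to sit inside `fit` for the resulting error estimate to stay unbiased.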
  • the clinical variables were age, tumour grade, tumour size and lymph node metastasis, estrogen receptor (ER) status (positive-negative), and progesterone receptor (PGR) status (positive-negative). Tumour size and lymph node metastases were entered into the model in terms of a stage variable. These clinical predictors were initially compared between the good and bad prognosis groups.
  • a cross-validated error rate of 31% was obtained on the training set; 71 (68%) out of the 104 good prognosis patients and 22 (73%) out of 30 bad prognosis patients were correctly classified (Table 3).
  • Of the 25 patients in the testing set, 10 (40%) were incorrectly classified.
  • the prediction was somewhat better in the bad prognosis group, with 2 (25%) out of 8 incorrectly classified (Table 3).
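The reported percentages can be checked against the counts. In the sketch below, the training counts and the testing totals come from the text; the figure of 8 misclassified good prognosis patients in the testing set is inferred (10 errors overall minus 2 in the bad prognosis group, out of the 17 good prognosis cases) and is consistent with the stated 47% error rate.

```python
def pct(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


# Training set (134 patients), leave-one-out cross-validation.
train_good_total, train_good_correct = 104, 71
train_bad_total, train_bad_correct = 30, 22
train_errors = (train_good_total - train_good_correct) + (train_bad_total - train_bad_correct)
train_error_pct = pct(train_errors, train_good_total + train_bad_total)

# Testing set (25 patients).
test_bad_total, test_bad_errors = 8, 2
test_good_total = 25 - test_bad_total
test_good_errors = 10 - test_bad_errors  # inferred from the overall count of 10 errors
test_error_pct = pct(test_bad_errors + test_good_errors, 25)
test_bad_error_pct = pct(test_bad_errors, test_bad_total)
test_good_error_pct = pct(test_good_errors, test_good_total)
```

The overall training error of 41 out of 134 patients rounds to the 31% quoted for the cross-validated error rate.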
  • FIG. 2A and 2B show the expression pattern of the 36-gene set and the separation between the good and bad prognosis groups in both the training and testing set. The association between gene expression and prognosis is clearly visible.
  • FIG. 2A provides a pseudocolour plot of the 36 predictive genes with their accession numbers on the 134 patients in the training set. Bright red indicates a high value of gene expression, and bright green indicates a low value. The list of genes is given in Table 4, in the same order they appear on the plot. The bar on the right-hand side shows the 5-year status, with a black line indicating a patient with bad prognosis.
  • FIG. 2B provides a similar colour plot of the 36 predictive genes on the 25 patients in the testing set. The genes are in the same order as FIG. 2A.
  • the list of the genes is given in Table 4. Among the genes that have higher expression in good prognosis were cyclin dependent kinase inhibitor 1C, spinal-cord-derived growth factor B, myosin binding protein, homeobox A5, insulin-like growth factor 1, and several uncharacterized genes and ESTs from the U133B chip. The genes associated with poor prognosis were primarily involved in cell metabolism and cell cycle regulation.
  • TABLE 4 - BIOMARKERS SELECTED FOR PREDICTION
  • the multivariate Cox regression analysis of the breast cancer events produced results similar to those of the previous logistic regression analysis.
  • the adjusted HR of the bad prognosis score, after adjusting for the clinical factors, was 6.59 (95% CI 2.54 to 17.12). No other prognostic variable was statistically significant.
  • PGK1 phosphoglycerate kinase 1
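The two interval estimates quoted in this section, the odds ratio of 3.25 (95% CI 0.73 to 14.28) and the adjusted hazard ratio of 6.59 (95% CI 2.54 to 17.12), can be sanity-checked on the log scale. The sketch assumes Wald-type intervals that are symmetric in the logarithm of the ratio, which is the usual construction but is not stated in the source; the function names are illustrative.

```python
import math


def log_scale_se(lower, upper, z=1.96):
    """Standard error of log(ratio) implied by a symmetric 95% interval on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def implied_point_estimate(lower, upper):
    """Geometric midpoint of the interval, which should sit close to the reported ratio."""
    return math.sqrt(lower * upper)


def wald_z(estimate, lower, upper):
    """Approximate z statistic for testing ratio = 1."""
    return math.log(estimate) / log_scale_se(lower, upper)


hr_z = wald_z(6.59, 2.54, 17.12)   # interval excludes 1
or_z = wald_z(3.25, 0.73, 14.28)   # interval includes 1
```

A z value above 1.96 corresponds to an interval that excludes 1, which matches the text: the hazard ratio is significant, while the odds ratio is in the expected direction but not significant.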

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Engineering & Computer Science (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Wood Science & Technology (AREA)
  • Physics & Mathematics (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Hospice & Palliative Care (AREA)
  • Biophysics (AREA)
  • Oncology (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

A method of identifying a mammal at increased risk for developing breast cancer, comprising the steps of obtaining a biological sample from the mammal, measuring in said biological sample the level of at least one biomarker, correlating said level of at least one biomarker with a baseline level, and identifying a mammal at increased risk for developing breast cancer based on said correlation.

Description

PROGNOSTIC BREAST CANCER BIOMARKERS
FIELD OF THE INVENTION:
The present invention relates generally to the field of pharmacogenomics, and more specifically, to methods and procedures for prognosing breast cancer.
BACKGROUND OF THE INVENTION:
Globally, one million women are diagnosed with invasive breast cancer and 373,000 women die of the disease each year [1]. Early diagnosis and adjuvant therapy save a significant number of lives [2-5]. However, many patients are subjected to unnecessarily harsh adjuvant therapies, while others relapse and die despite having been treated according to state-of-the-art clinical guidelines [6,7]. A long list of factors has been described to have prognostic potential. So far, however, only stage, tumour size, and histological grade have general acceptance as prognostic factors. Estrogen receptor status, sometimes accompanied by progesterone receptor status, is the only globally accepted predictive factor for breast cancer therapy [6]. More recently, HER-2/neu (c-erbB-2) has been added for metastatic breast cancer [9]. The present lack of criteria to help individualize breast cancer treatment indicates a need for a novel technology to develop clinically prognostic tools. Detailed molecular fingerprinting of each individual tumour through gene expression analysis may provide new and useful knowledge that could be applied to improve our prognostic abilities in breast cancer.
Microarray technology can simultaneously characterize the RNA expression profile of thousands of genes in a single tumour. Most microarray studies reported so far have utilised highly selected patient populations [10-12]. Microarray-based expression profiling has been used for separating sporadic from hereditary breast cancer, identifying 6 subgroups of breast cancer with discriminative prognosis [14], identifying estrogen receptor related genes, identifying profiles predicting risk for axillary lymph node metastases, and predicting overall prognosis using the expression profile of 70 discriminatory genes [15,16].
Adjuvant hormonal therapy and chemotherapy for breast cancer, globally the most common malignancy in females, save many lives. Presently used prognostic factors lack sensitivity and specificity, resulting in overtreatment or undertreatment of large patient groups. Analysis of the expression profiles of many genes has the potential to identify expression signatures that improve prognostication. The present invention provides gene expression profiles that predict breast cancer prognosis.
SUMMARY OF THE INVENTION:
The invention provides the identification of prognostic biomarkers for breast cancer. The invention also provides a biomarker set that comprises an assembly of two or more biomarkers. The biomarkers and biomarker sets of the invention can be used to determine or predict whether a patient is in need of adjuvant therapy for treatment of breast cancer.
In one aspect, the invention includes a method of identifying a mammal at increased risk for developing breast cancer, comprising the steps of: (a) obtaining a biological sample from the mammal; (b) measuring in said biological sample the level of at least one biomarker selected from the biomarkers of Table 4; (c) correlating said level of at least one biomarker with a baseline level; and (d) identifying a mammal at increased risk for developing breast cancer based on said correlation.
In another aspect, the invention provides a method for prognosing breast cancer in a mammal having breast cancer, comprising the steps of: (a) obtaining a biological sample from the mammal; (b) measuring in said biological sample the level of at least one biomarker selected from the biomarkers of Table 4; (c) correlating said level of at least one biomarker with a baseline level; and (d) prognosing breast cancer in said mammal based on said correlation.
In yet another aspect, the invention includes a method for identifying breast cancer in a mammal, comprising the steps of: (a) obtaining a biological sample from the mammal; (b) measuring in said biological sample the level of at least one biomarker selected from the biomarkers of Table 4; (c) correlating said level of at least one biomarker with a baseline level; and (d) identifying breast cancer in said mammal based on said correlation.
As used herein, the baseline level used for the correlation can be determined by one of skill in the art. In one aspect, the baseline level is from a normal, non-cancer biological sample. In another aspect, the baseline level is from a patient having breast cancer, such as a biological sample removed an established time period prior to present testing, and is used to establish the prognosis of the patient's breast cancer. A difference between the level of at least one biomarker from the biological sample and the baseline level that is statistically significant can be used in the methods of the invention, i.e., to identify a mammal at increased risk for developing breast cancer, to prognose breast cancer in a mammal having breast cancer, or to identify breast cancer in a mammal. A statistically significant difference between the level of at least one biomarker from the biological sample and the baseline level is readily determined by one of skill in the art and can be, for example, at least a two-fold difference, at least a three-fold difference, or at least a four-fold difference in the level of the biomarker.
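The fold-difference comparison against a baseline can be illustrated with a short sketch. It is hypothetical: the function names are not from the source, the difference is treated as direction-independent (either higher or lower than baseline), and a fold threshold on its own does not establish statistical significance, which the text leaves to one of skill in the art.

```python
def fold_difference(sample_level, baseline_level):
    """Direction-independent fold difference (always >= 1) between two positive levels."""
    if sample_level <= 0 or baseline_level <= 0:
        raise ValueError("expression levels must be positive")
    ratio = sample_level / baseline_level
    return ratio if ratio >= 1 else 1 / ratio


def exceeds_fold_threshold(sample_level, baseline_level, min_fold=2.0):
    """True when the sample differs from baseline by at least min_fold."""
    return fold_difference(sample_level, baseline_level) >= min_fold
```

Setting `min_fold` to 2.0, 3.0, or 4.0 corresponds to the two-fold, three-fold, and four-fold examples given in the text.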
The biological sample can be, for example, breast tissue. The baseline level can be measured, for example, from a normal, breast cancer-free biological sample. The normal, breast cancer-free biological sample can be, for example, normal breast tissue.
The level of the at least one biomarker can be, for example, the level of protein and/or mRNA transcript of the at least one biomarker. In one aspect, the level of at least two biomarkers is measured. In another aspect, more than two biomarkers, such as three, four, or five biomarkers, are measured. The invention, in one aspect, includes measuring any combination of the biomarkers provided in Table 4, including, for example, measuring all 36 nucleotide biomarkers provided in Table 4.
The mammal can be, for example, a human, rat, mouse, dog, rabbit, pig, sheep, cow, horse, cat, primate, or monkey.
In one aspect, the invention includes a biomarker selected from the biomarkers provided in Table 4. In another aspect, the invention includes biomarker sets that comprise at least two biomarkers selected from Table 4. The biomarker sets of the invention include any combination of the biomarkers provided in Table 4 including, for example, all 36 nucleotide biomarkers provided in Table 4, as well as fragments thereof. The biomarkers and biomarker sets of the invention are used as prognostic indicators of breast cancer. The invention also provides one or more specialized microarrays, e.g., oligonucleotide microarrays or cDNA microarrays, comprising the biomarkers and biomarker sets of the invention.
The invention also provides a kit for determining or predicting whether a patient is in need of adjuvant therapy for treatment of breast cancer. The kit comprises one or more biomarkers of the invention, or one or more biomarker sets of the invention.
The invention also provides one or more biomarkers that can serve as targets for the development of therapies for disease treatment. Such targets may be particularly applicable to treatment of cancers or tumours.
The invention also provides antibodies, including polyclonal and monoclonal, directed against one or more of the biomarker polypeptides. Such antibodies can be used in a variety of ways, for example, to purify, detect, and target the biomarker polypeptides of the invention, including both in vitro and in vivo diagnostic, detection, screening, and/or therapeutic methods.
Further features of the invention will be better understood upon a reading of the detailed description of the invention when considered in connection with the accompanying figures.
BRIEF DESCRIPTION OF THE FIGURES:
FIG. 1 illustrates the exclusion criteria for all patients operated for primary breast cancer at the Karolinska Hospital, 1994 through 1996.
FIG. 2A illustrates a pseudocolour plot of the 36 predictive genes with their accession numbers on the 134 patients in the training set. FIG. 2B illustrates a pseudocolour plot of the 36 predictive genes on the 25 patients in the testing set. These figures are also provided in pseudocolour as FIG. 2A and FIG. 2B in corresponding U.S. Non-Provisional Application No. , filed April 27, 2004, and entitled "PROGNOSTIC BREAST CANCER BIOMARKERS". FIG. 2A and FIG. 2B of this Non-Provisional Application are hereby incorporated by reference.
FIG. 3A illustrates a comparison of disease-free survival between the groups with good and bad prognosis scores for all patients. FIG. 3B illustrates a comparison of disease-free survival between the groups with good and bad prognosis scores for the testing set only.
DETAILED DESCRIPTION OF THE INVENTION:
The invention includes the biomarkers of Table 4 and the Sequence Listing. These biomarkers include polynucleotide sequences, as well as the polypeptide sequences encoded thereby.
The base of this study consisted of all women operated for breast cancer at the Karolinska Hospital, Stockholm, Sweden, during 1994-96. Tumours from which frozen material was available were analyzed. All ages were included and strict exclusion criteria were used. Since the material was population-based, the results can be generalised to other non-selected populations of early breast cancer patients.
All routine clinical, pathological, and follow-up data were recorded, based on patient charts combined with cross-checking with register data. Selected small pieces of primary breast cancer were deep frozen from a population-derived breast cancer cohort receiving primary therapy from 1994 through 1996. At least 1.8 μg quality-controlled RNA was isolated from each tumour; the biotin-labelled cRNA was hybridized at 45 °C for 16 hours using the Affymetrix Human Genome U133 A and B chips, containing almost 45,000 probe sets derived from approximately 33,000 well-substantiated human genes. Gene expression profiles were then correlated to distant metastases or death within 5 years.
Tumours from 159 patients were analysed and a subset of 36 genes was found to give an optimal prognostic separation, giving significantly better prognostication (P=0.005) compared with histological grading according to Elston-Ellis and stage of the tumour. The gene expression profile of 36 genes has a significantly superior prognostic value compared with established prognostic factors for breast cancer. These results have the potential to identify women in need of adjuvant therapy and thereby save many women with biologically less aggressive tumours from unnecessary additional therapy. The 36 selected genes, also referred to herein as "biomarkers" or "prognostic biomarkers" and provided in Table 4, gave a better prognostic separation than the criteria routinely used for breast cancer management, including histological grade (according to Elston-Ellis 17) and tumour stage. When multivariate models have been applied, almost all previously evaluated prognostic factors have failed to add significant prognostic information to these established factors. The lack of useful prognostic and predictive factors outside tumour size, axillary lymph node status, histological grade, and receptor status has been verified in several consensus documents 6,7. Thus, the invention enables improved early management of breast cancer patients, aiming at an optimized use of adjuvant systemic therapy.
The patient cohort differs from those in most other studies of breast cancer prognosis using microarray gene expression data, since a population-based cohort of patients from a predefined geographical area was used. Patients with both primary lymph node negative and positive disease were included, and the patients were not restricted to premenopausal or postmenopausal women. The breast cancer material was probably genetically very homogeneous, since it was derived from patients with a very similar Caucasian genetic background. This type of information has not been presented in other studies.
Among the genes associated with good prognosis, cyclin-dependent kinase inhibitor 1C is likely to be a tumour suppressor gene that regulates cell proliferation. It was recently found to be down-regulated in metastatic tumours 21. A few of the 231 genes found to have a prognostic value by van't Veer 15 were also identified among the genes associated with bad prognosis. These are CENPF, PGK1 and ANKT. Overexpression of PGK1 is known to be associated with multidrug resistance 22.
The relatively small overlap between the best predictive gene set and van't Veer's prognosis set is likely due to several differences between the two studies. First, the tumours analysed came from different patient cohorts, which may result in different genes being identified in optimal prognostic gene sets. The population-based derivation and selection criteria are provided herein, but not in the Dutch papers 15,16. Second, different gene expression platforms were used in the two studies, likely resulting in both different initial gene sets being quantified and examined and different relative quantification values for a given gene. Third, different methodologies may have been used in tumour archiving and RNA preparation. Fourth, different statistical and filtering approaches were used to obtain a subset of genes that make up the best prognostic gene sets. As expected, some genes were observed that did overlap between the present gene set and that of van't Veer 15. This may point to a subset of genes that are more robust, or at least validated in an independent setting, and that could have general utility for prognosis in breast cancer.
All but 33 out of 159 patients received adjuvant systemic therapy (chemotherapy and/or endocrine therapy). Adjuvant therapy decisions are made at many institutions based on St Gallen or NIH therapy recommendations; accordingly, many patients receive adjuvant therapy. The present gene expression profiles were based on these clinical realities. Although ideally only untreated patients should be included in microarray-based prognostic studies, such a restriction would define a highly selective patient subgroup, likely representing a low-risk group that is not representative of breast cancer patients in general. In the subgroup of 33 patients who did not receive adjuvant systemic therapy, the bad prognosis indicator had an odds ratio of 3.25 (95% CI 0.73 to 14.28), which was in the expected direction but not significant because of the small sample size.
The present population-derived patient material is likely to be more representative than at least some of the previously published reports.
Although the proportions receiving different adjuvant therapies (Table 2) were compared, it is noted that it is not a randomized comparison, as there was no control on the clinical decision leading to therapy assignments. Therefore, in this study there are no conclusions regarding the benefits of any of the adjuvant therapies.
The multivariate analysis (Table 5) seemed to imply that ER positive status was associated with worse prognosis (OR=3.13, 95% CI 0.81 to 12.06), but this was not statistically significant. This may be due to the traditionally used cut-off values, which are lower than those used by other groups. In addition, looking more closely at the data, there was a strong correlation between ER and PGR status, and the stated association was observed only in the PGR negative group, so the result was partly due to an interaction between PGR and ER status.
The microarray expression data were analysed with statistical strategies aimed at minimizing the risk of overfitting the model. The full leave-one-out cross-validation procedure was a key analytical tool to provide unbiased estimates of error rates and unbiased prognostic scores for further multivariate analysis. The cross-validated error rate of 31% was comparable with the rate found by van't Veer et al., who reported a 41% error rate in the good prognosis group 15. The performance of the class prediction in the testing set was as expected (25% error rate) for the bad prognosis group, but poorer in the good prognosis group (47% error rate). However, this figure was based on only 17 cases, so there was a large sampling variability. In the multivariate analysis, had non-cross-validated scores been used, much stronger but biased associations between expression profile and outcome would have been obtained. Using a nearly identical list of clinical variables and distant metastasis as the primary endpoint, van de Vijver et al. 16 obtained a hazard ratio of 4.6 for their bad prognostic score, which is well within the 95% CI from our data (2.99 to 33.37). It is shown herein that gene expression in a set of 36 genes significantly improves prediction of prognosis of breast cancer, compared to the established prognostic factors. The random inclusion of breast cancer patients into conventional studies seems almost too primitive, since breast cancer probably consists of a number of different subgroups with unique prognostic properties, as recently shown 14.
EXAMPLES:
Study population:
Nearly all breast cancer patients in Sweden are treated within the national health care system. A personal identification number is used for all population registers in Sweden and consists of 6 digits for year, month, and day of birth, supplemented with 4 digits representing place of birth, sex, and a check digit. Since 1976, almost all women diagnosed with breast cancer in the Stockholm-Gotland County (approximately 1.9 million inhabitants) have been registered at the Regional Oncological Center at the Karolinska Hospital. Patients in the Stockholm-Gotland region were to be diagnosed and treated according to the predefined written guidelines for the region.
All breast cancer patients who were operated at the Karolinska Hospital from January 1, 1994 through December 31, 1996 (n=524) were included. No frozen tumour material was saved for 231 patients and an additional 13 patients either actively refused to participate (n=6) or had emigrated (n=7), leaving 280 patients with identified tumours available for microarray analyses (FIG. 1 - Description of exclusion criteria for all patients operated for primary breast cancer at the Karolinska Hospital, 1994 through 1996). A sufficient amount and quality of RNA, according to predefined quantity and quality criteria, was obtained for 191 of the 280 patients (FIG. 1). For a number of the patients, microarray expression data were generated from more than one tumour, e.g. bilateral and multifocal tumours, and from time periods before and after the present study period.
FIG. 1 shows that of 280 patient tumours available, 159 were used for further analyses. Expression analysis was not possible on 89 patients due to tumour degradation (n=42), insufficient amount and/or quality of RNA (n=35), and quality control failure (n=12). Expression profiles were analyzed from tumours examined with the Affymetrix U133 A and B chip (n=177). Further exclusions were applied to 12 patients who had received neoadjuvant chemotherapy, 5 patients who had an in situ cancer and 1 patient who had distant metastases at initial diagnosis.
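The exclusion arithmetic can be checked with a short sketch. The counts are those stated in the text and in the footnotes to Table 1 (including the 14 tumours profiled on the U95 chip); the variable names are illustrative only.

```python
# Patient counts as stated in the text and in the footnotes to Table 1.
operated = 524
no_frozen_tissue = 231
refused, emigrated = 6, 7
degraded, insufficient_rna, failed_array_qc = 42, 35, 12
profiled_on_u95 = 14
neoadjuvant, in_situ, stage_iv = 12, 5, 1

# Step-by-step flow from all operated patients to the analysed cohort.
available = operated - no_frozen_tissue - (refused + emigrated)
with_rna = available - (degraded + insufficient_rna + failed_array_qc)
on_u133 = with_rna - profiled_on_u95
analysed = on_u133 - (neoadjuvant + in_situ + stage_iv)

# Everyone who had tissue but was not analysed ("excluded for other reasons").
excluded_other = operated - no_frozen_tissue - analysed

print(available, with_rna, on_u133, analysed, excluded_other)
```

The final figure of 159 equals the sum of the training set (134) and the testing set (25).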
All case records for the included patients were examined for information on tumour size, number of retrieved and metastatic axillary lymph nodes, hormonal receptor status, distant metastases, site and date of relapse, initial therapy, therapy for possible recurrences, date and cause of death.
The different reasons for exclusion were not influenced by age at diagnosis (Table 1). The 231 tumours that were not analyzed using expression profiling had a lower mean diameter, were more often < 21 mm in size, had a lower mean number of affected lymph nodes, and came from a group in which fewer individuals were deceased at the end of the study period (Table 1). For those excluded for other reasons, there did not seem to be a selection based on age or stage of the disease, compared with those patients included in the study (Table 1).
TABLE 1 - CHARACTERISTICS OF PATIENTS OPERATED FOR BREAST CANCER AT THE KAROLINSKA HOSPITAL, 1994-96, IN RELATION TO INCLUSION CRITERIA.
| PATIENT CATEGORIES | ALL PATIENTS (N=524) | NO AVAILABLE TISSUE* (N=231) | EXCLUDED FOR OTHER REASONS** (N=134) | TESTING SET (N=25) | TRAINING SET (N=134) |
| --- | --- | --- | --- | --- | --- |
| MEAN AGE AT BREAST CANCER DIAGNOSIS | 58 | 57 | 58 | 59 | 58 |
| MEAN TUMOUR SIZE, MM | 20 | 16 | 24 | 22 | 22 |
| PROPORTION OF PATIENTS WITH TUMOUR SIZE <21 MM, % | 68 | 77 | 57 | 56 | 62 |
| PROPORTION OF PATIENTS WITH POSITIVE LYMPH NODES, % | 26 | 16 | 32 | 36 | 38 |
| PROPORTION DECEASED, % | 20 | 12 | 26 | 32 | 25 |
* no frozen tumours in the tumour bank (n=231)
** living abroad (n=7), actively refused participation (n=6), degraded tumours (n=42), insufficient amount of RNA (n=35), did not pass the QC for the arrays (n=12), profiled on the U95 chip (n=14), neoadjuvant chemotherapy (n=12), in situ cancer (n=5), stage IV at diagnosis (n=1)
Stained tissue sections from the primary tumours from patients with array profiles were re-examined and classified using Elston-Ellis grading 17, as this was not routine at the time of the diagnosis. The re-examination was performed by an independent pathologist (H.N.), without any knowledge of any clinical data, previous histological diagnosis or array profile characteristics.
In the adjuvant setting, tamoxifen and/or goserelin were normally used for hormonal treatment, while cyclophosphamide, methotrexate, and 5-fluorouracil (CMF), mostly given intravenously on days 1 and 8, was used as adjuvant chemotherapy, except for high-risk patients, who were offered inclusion in the SBG 9401 study 18. Therapies were given in accordance with written guidelines from the Stockholm Breast Cancer Group and the therapy decisions were made in multidisciplinary therapy conferences.
After primary therapy, patients were recommended to have regular clinical examinations and yearly mammograms, in addition to laboratory and X-ray tests guided by clinical signs and symptoms. Patients were followed for at least two years, but normally for 5 years, and some for a planned 10 years, more recently based on predefined risk criteria. Patients with follow-up outside Radiumhemmet were tracked using the unique ten-digit personal identification number in combination with the computerized follow-up system, which covers all inpatient and outpatient visits in the county of Stockholm.
Relapse site and date, together with relapse therapy and date of death, were ascertained in mid-May 2002. The average follow-up was 6.1 years. Cause of death was coded as death due to breast cancer (including those with distant metastases but dying from other causes), death due to other malignancies, and death due to non-malignant disorders. Through the population-based Swedish Cancer Registry, second primary malignancies were identified. Hormone receptor status has been analyzed on all newly diagnosed breast cancers for more than 20 years at the Karolinska Hospital. Tumour material not needed for these analyses was frozen on dry ice or in liquid nitrogen and stored in -70 °C freezers.
RNA preparation:
RNA extraction was performed according to the RNeasy mini protocol (Qiagen, Germany). In brief, a portion of the deep frozen tumour was cut into minute pieces and transferred into test tubes (maximum 40 mg/tube) with RLT buffer (RNeasy lysis buffer), followed by homogenization for around 30-40 seconds. Proteinase K was then added and the samples were treated for 10 minutes at 55 °C. This step was introduced during the project because most initial preparations without it resulted in poor RNA yield, poor RNA quality, or both. Total RNA was then isolated using Qiagen's microspin technology. DNase treatment was also added to some samples to further increase the RNA quality. The quality of the RNA was assessed by measuring the 28S:18S ribosomal RNA ratio using an Agilent 2100 bioanalyzer (Agilent Technologies, Rockville, Maryland, USA). All samples with RNA of high quality were then stored at -70 °C until microarray analysis.
Microarray profiling:
Preparation of in vitro transcription (IVT) products and oligonucleotide array hybridization and scanning were performed according to the Affymetrix protocol (Santa Clara, California, USA). In brief, the amount of starting total RNA for each probe preparation varied between 2 and 5 μg. First-strand cDNA was generated by using a T7-linked oligo-dT primer, followed by second-strand synthesis. IVT reactions were performed in batches to generate biotinylated cRNA targets, which were subsequently chemically fragmented at 95 °C for 35 min. Fragmented and biotinylated cRNA (10 μg) was hybridized at 45 °C for 16 h to Affymetrix high-density oligonucleotide array human HG-U133 set chips, which contain approximately 45,000 probe sets representing more than 39,000 transcripts derived from approximately 33,000 well-substantiated human genes. The arrays were then washed and stained with streptavidin-phycoerythrin (SAPE, final concentration of 10 μg/ml). Signal amplification was performed using a biotinylated anti-streptavidin antibody. The array was then scanned according to the manufacturer's instructions (Affymetrix Genechip® Technical Manual, 2001). The scanned images were inspected for the presence of obvious defects (artifacts or scratches) on the array. To minimize discrepancies due to variables such as sample preparation, hybridization conditions, staining, or array lot, the raw expression data were scaled using Affymetrix® Microarray Suite 5.0 software. The trimmed mean signal of 100 selected housekeeping genes on the HG-U133 A and B chips was adjusted to a user-specified target signal value for each array, so that a scale factor was derived for each array, which was used to scale and standardize the overall signal of that array.
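The scaling step can be illustrated with a minimal sketch. It is not the Microarray Suite 5.0 implementation: the trim fraction (2% from each tail), the target value of 500, and the function and probe names are assumptions made for illustration, since the text only states that the target was user-specified.

```python
def trimmed_mean(values, trim_fraction=0.02):
    """Mean after dropping trim_fraction of the values from each tail (assumed 2%)."""
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)
    kept = ordered[k:len(ordered) - k] if k else ordered
    return sum(kept) / len(kept)


def scale_array(signals, reference_ids, target=500.0, trim_fraction=0.02):
    """Return (scale_factor, scaled_signals) for one array.

    signals       : dict mapping probe-set id -> raw signal
    reference_ids : ids of the housekeeping probe sets used as reference
    target        : hypothetical user-specified target signal
    """
    reference = [signals[p] for p in reference_ids]
    factor = target / trimmed_mean(reference, trim_fraction)
    return factor, {p: s * factor for p, s in signals.items()}


# Toy array with three reference probe sets and one gene of interest.
raw = {"hk1": 100.0, "hk2": 200.0, "hk3": 300.0, "geneA": 50.0}
factor, scaled = scale_array(raw, ["hk1", "hk2", "hk3"])
print(factor, scaled["geneA"])
```

A large scale factor indicates a dim array, which is why the scale factor itself can serve as a quality criterion.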
Tumour samples which generated expression data failing any of the following criteria were either re-processed or excluded from further analysis: (a) a scaling factor > 4, (b) "Present" calls <30%, and (c) an R-squared value, from the Pearson product-moment correlation of the expression data on one array compared with all other arrays, of <0.6. In case of visible microarray artifacts, the sample was rehybridized and rescanned on new chips using the same fragmented probe.
Data Analysis:
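As an illustration of the array-level exclusion criteria (a) to (c) stated above, the following sketch flags an array that fails any of them. The text does not say how the correlation against "all other arrays" was summarised; taking the mean R-squared is an assumption, as are the function names.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fails_qc(scale_factor, present_fraction, profile, other_profiles):
    """True if the array fails criterion (a), (b) or (c)."""
    r_squared = [pearson_r(profile, other) ** 2 for other in other_profiles]
    mean_r_squared = sum(r_squared) / len(r_squared)  # assumed summary
    return (
        scale_factor > 4            # (a) scaling factor > 4
        or present_fraction < 0.30  # (b) "Present" calls < 30%
        or mean_r_squared < 0.6     # (c) R-squared < 0.6
    )


profile = [1.0, 2.0, 3.0, 4.0]
others = [[2.0, 4.0, 6.0, 8.0], [1.0, 2.0, 3.0, 5.0]]
print(fails_qc(2.0, 0.45, profile, others))  # passes all three criteria
print(fails_qc(4.5, 0.45, profile, others))  # fails (a)
```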
The expression data from both U133 A and B chips were compiled, and Affymetrix control sequences and the 100 selected housekeeping genes on the B chip were removed prior to analysis. This resulted in expression data of 44,692 probe sets for 159 primary breast tumours. A statistical data filter was applied to reduce noise and obtain a useful and relevant collection of probe sets for identifying biomarkers that were highly correlated to clinical parameters. Probe sets were excluded from further analysis if (i) they were not detected in more than 10% of the tumours (the detection was performed using the Affymetrix GeneChip® Expression Analysis algorithm, with Absent Call p-value >0.06) and (ii) they did not show sufficient biological variation across samples. This led to a final set of 7,488 probe sets for analysis, consisting of 4,736 from U133A and 2,752 from U133B. All analyses were performed using log-transformed expression values.
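A filter of this kind might look like the sketch below. Two points are assumptions rather than statements from the text: the measure of "sufficient biological variation" (here the standard deviation of the log expression values, with a hypothetical cut-off of 0.5), and the reading that a probe set is retained only if it passes both the detection and the variation requirement.

```python
from statistics import stdev


def keep_probe_set(detection_p, log_expr, p_cut=0.06,
                   min_detected=0.10, min_sd=0.5):
    """Decide whether one probe set is retained.

    detection_p : detection p-value per tumour (absent if p > p_cut)
    log_expr    : log-transformed expression value per tumour
    min_sd      : hypothetical variation threshold (not given in the text)
    """
    detected = sum(p <= p_cut for p in detection_p) / len(detection_p)
    return detected > min_detected and stdev(log_expr) >= min_sd


# Toy data: probe-set id -> (detection p-values, log expression values).
probe_sets = {
    "A": ([0.01, 0.02, 0.50, 0.01], [5.0, 7.0, 6.0, 9.0]),  # detected, variable
    "B": ([0.50, 0.50, 0.50, 0.50], [5.0, 7.0, 6.0, 9.0]),  # never detected
    "C": ([0.01, 0.01, 0.01, 0.01], [6.0, 6.1, 6.0, 6.1]),  # detected, flat
}
kept = [name for name, (p, x) in probe_sets.items() if keep_probe_set(p, x)]
print(kept)
```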
The primary statistical analysis was based on the comparison between bad versus good prognosis groups, where occurrence of distant relapse or death from any cause by five years was defined as bad prognosis. For comparison, a secondary analysis was performed limiting the definition of bad prognosis to distant relapse and death due to breast cancer. Initially, the expression data from 134 patients was used as a training set, and additional expression data from 25 patients was used as a testing set. An optimal set of predictors was chosen using a leave-one-out cross-validation procedure performed on the training set. Briefly, this was done as follows:
(i) Remove one patient from the training set;
(ii) Using the rest of the patients (n = 134 - 1 = 133), order the genes according to the two-sample t-tests (assuming unequal variance) comparing the good and bad prognosis groups;
(iii) Develop a class prediction using the top k genes in the list; and
(iv) Predict the prognosis status of the removed case based on the expression values of the k genes, and compare the prediction with the actual status.
This procedure was repeated by removing each case in the training set in turn, so in the end a set of n=134 out-of-sample predictions and their actual values were obtained. A cross-validated error rate was reported to summarize the prediction performance using k genes. The optimal value of k was chosen by minimizing the cross-validated error rate as a function of k.
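Steps (i) to (iv) can be sketched as follows. The essential point is that the genes are re-ranked inside every fold, so the left-out patient never influences gene selection. The nearest-class-mean classifier used here is only a stand-in chosen to keep the example short; it is not the discriminant method of the study, and all names and the toy data are illustrative.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Two-sample t statistic assuming unequal variances."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def top_genes(X, y, k):
    """Indices of the k genes with the largest |t| between bad (1) and good (0)."""
    def abs_t(g):
        bad = [x[g] for x, lab in zip(X, y) if lab == 1]
        good = [x[g] for x, lab in zip(X, y) if lab == 0]
        return abs(welch_t(bad, good))
    return sorted(range(len(X[0])), key=abs_t, reverse=True)[:k]


def fit_nearest_mean(X, y):
    """Stand-in classifier: per-class mean vector."""
    return {c: [mean(col) for col in zip(*[x for x, lab in zip(X, y) if lab == c])]
            for c in (0, 1)}


def predict_nearest_mean(model, x):
    dist = {c: sum((a - m) ** 2 for a, m in zip(x, mu)) for c, mu in model.items()}
    return min(dist, key=dist.get)


def loocv_error(X, y, k, fit, predict):
    """Cross-validated error rate using the top k genes."""
    errors = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]   # step (i)
        genes = top_genes(X_train, y_train, k)                     # step (ii)
        model = fit([[x[g] for g in genes] for x in X_train], y_train)  # (iii)
        errors += predict(model, [X[i][g] for g in genes]) != y[i]      # (iv)
    return errors / len(X)


# Toy data: 6 patients x 3 genes; only the first gene separates the groups.
X = [[1.0, 5.0, 3.0], [1.2, 4.0, 3.5], [0.9, 6.0, 2.5],
     [3.0, 5.5, 3.1], [3.3, 4.5, 2.9], [2.9, 5.0, 3.4]]
y = [0, 0, 0, 1, 1, 1]
print(loocv_error(X, y, 1, fit_nearest_mean, predict_nearest_mean))
```

Repeating the call over a range of k and taking the k with the smallest error mirrors the selection of the optimal gene count.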
Class prediction using k genes was done using a diagonal linear discriminant analysis method 19, which is a variant of the standard maximum likelihood discrimination rule. Suppose x is a vector of the (log-) gene expression values from a tumour to be classified, x_g is the expression value of gene g, m_{1g} and m_{0g} are the means of the bad and good prognosis groups from the training set, v_g is the variance, a_g = (m_{1g} - m_{0g})/v_g, and b_g = (m_{1g} + m_{0g})/2. The class predictor score was given by S = \sum_g a_g (x_g - b_g), where the summation was over the k top genes. A patient with S>0 was assigned to the bad prognosis group, and otherwise to the good prognosis group. Thus, S was referred to as the bad prognostic score. This class prediction method is in fact similar to the signal-to-noise method and weighted voting algorithm in Golub et al. 20.
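The score can be written out directly from these definitions. In the sketch below the variance of each gene is taken to be the pooled within-class variance, which is an assumption (the text only says "the variance"); function names and the toy data are illustrative.

```python
from statistics import mean


def fit_dlda(X, y):
    """Return one (a_g, b_g) pair per gene from training data.

    X : list of per-patient expression vectors (log scale)
    y : list of labels, 1 = bad prognosis, 0 = good prognosis
    """
    bad = [x for x, lab in zip(X, y) if lab == 1]
    good = [x for x, lab in zip(X, y) if lab == 0]
    coeffs = []
    for g in range(len(X[0])):
        m1 = mean(x[g] for x in bad)
        m0 = mean(x[g] for x in good)
        # Pooled within-class variance (assumed definition of v_g).
        ss = sum((x[g] - m1) ** 2 for x in bad) + sum((x[g] - m0) ** 2 for x in good)
        v = ss / (len(X) - 2)
        coeffs.append(((m1 - m0) / v, (m1 + m0) / 2))
    return coeffs


def bad_prognosis_score(coeffs, x):
    """S = sum over genes of a_g * (x_g - b_g); S > 0 means bad prognosis."""
    return sum(a * (xg - b) for (a, b), xg in zip(coeffs, x))


# Toy training set: gene 0 is higher in the bad group, gene 1 is uninformative.
X = [[1.0, 4.0], [2.0, 5.0], [4.0, 4.0], [5.0, 5.0]]
y = [0, 0, 1, 1]
coeffs = fit_dlda(X, y)
print(bad_prognosis_score(coeffs, [4.2, 4.4]))  # positive: bad prognosis
print(bad_prognosis_score(coeffs, [2.0, 4.8]))  # negative: good prognosis
```

Because each gene contributes independently through a_g and b_g, an uninformative gene (equal group means) has a_g = 0 and drops out of the score.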
The bad prognosis score S (high-low, with 'high' defined as S>0) was then included in the multivariate logistic regression analysis of the five-year status to see if it had an additional predictive value over the standard clinical variables. To avoid overfitting the data, the scores for patients in the training set were computed from the leave-one-out procedure, i.e., the score for a patient was computed by first removing the patient prior to computing the coefficients a_g and b_g from the optimal set of genes. The scores for patients in the testing set were computed using the full training set to compute the class predictor. Hence, these scores were unbiased prognostic scores. The clinical variables were age, tumour grade, tumour size and lymph node metastasis, estrogen receptor (ER) status (positive-negative), and progesterone receptor (PGR) status (positive-negative). Tumour size and lymph node metastases were entered into the model in terms of a stage variable. These clinical predictors were initially compared between the good and bad prognosis groups.
To get a better description of the prognosis experience of the patients, survival analysis of the follow-up data was also performed. The main advantage was that it used all the survival information, not just the 5-year status. The Cox proportional hazard model was used to assess the additional contribution of prognosis score after adjusting for the clinical variables.
Results:
Characteristics of the patients (n=159) in this study (Table 2) showed that those who died or had distant metastases (n=38) had tumours more often > 21 mm in size (P=0.06), had a higher mean diameter (P=0.05), were more often PGR negative (P=0.01), and less often received endocrine therapy (P=0.03). No significant difference was detected in the proportion of patients receiving chemo- or radiotherapy. A roughly similar pattern was observed when the analyses were limited to breast cancer specific deaths.
TABLE 2 - UNIVARIATE COMPARISON OF CLINICAL VARIABLES AMONG PATIENTS WITH GOOD AND BAD PROGNOSIS. CONTINUOUS VARIABLES WERE COMPARED USING T TESTS, AND THE PROPORTIONS WERE COMPARED USING χ2 TESTS
| VARIABLE | ALIVE (N=121) | DECEASED (N=38) | P-VALUE |
| --- | --- | --- | --- |
| PROPORTION OF PATIENTS WITH BAD PROGNOSIS SCORE | 0.34 | 0.74 | <0.0001 |
| MEAN AGE AT BREAST CANCER DIAGNOSIS | 57.5 (±12.4) | 58.8 (±16.8) | 0.59 |
| MEAN TUMOUR SIZE, MM | 21.3 (±11.5) | 25.6 (±12.6) | 0.05 |
| PROPORTION OF PATIENTS WITH TUMOUR SIZE <21 MM | 0.65 | 0.47 | 0.06 |
| PROPORTION OF PATIENTS WITH POSITIVE LYMPH NODES | 0.37 | 0.39 | 0.71 |
| PROPORTION OF PATIENTS WITH GRADE I | 0.23 | 0.08 | 0.06* |
| GRADE II | 0.41 | 0.36 | |
| GRADE III | 0.36 | 0.56 | |
| PROPORTION OF PATIENTS WITH ER POSITIVE TUMOURS** | 0.83 | 0.79 | 0.61 |
| PROPORTION OF PATIENTS WITH PGR POSITIVE TUMOURS** | 0.77 | 0.55 | 0.01 |
| PROPORTION OF PATIENTS RECEIVING CHEMOTHERAPY | 0.18 | 0.21 | 0.69 |
| PROPORTION OF PATIENTS RECEIVING ENDOCRINE THERAPY | 0.76 | 0.58 | 0.03 |
| PROPORTION OF PATIENTS RECEIVING RADIOTHERAPY | 0.51 | 0.39 | 0.21 |
* Combined testing using the chi-squared test.
** Receptor positive tumours >0.05 fmol/ng DNA.
Of the 134 patients in the training set, 30 reached the primary endpoint by 5 years, and were thus defined as the poor prognosis group. Of these, 26 patients had distant relapse by 5 years, and 4 died within 5 years without diagnosis of distant relapse. The remaining 104 patients were defined as the good prognosis group. Of these patients, after more than 5 years of follow-up, 4 died without recurrence of breast cancer and 4 had distant relapse (and 2 of these later died).
The leave-one-out procedure suggested k=36 genes as an optimal number of genes for prediction. Using these genes as the set of markers for computing the bad prognosis score, a cross-validated error rate of 31% was obtained on the training set; 71 (68%) out of the 104 good prognosis patients and 22 (73%) out of the 30 bad prognosis patients were correctly classified (Table 3). In the independent testing set of 25 patients, 10 (40%) were incorrectly classified. The prediction was somewhat better in the bad prognosis group, with 6 (75%) out of 8 correctly classified (Table 3). When distant metastases or deaths due to breast cancer ('breast cancer events') were considered, 21 (81%) out of 26 bad prognosis patients in the training set were correctly classified, and in the testing set, all 5 bad prognosis patients were correctly classified (Table 3).
TABLE 3 - CROSS-VALIDATED PREDICTION OF THE TRAINING SET AND PREDICTION OF THE TESTING SET
| TRUE STATUS | ALL EVENTS: PREDICTION GOOD | ALL EVENTS: PREDICTION BAD | BREAST CANCER EVENTS: PREDICTION GOOD | BREAST CANCER EVENTS: PREDICTION BAD |
| --- | --- | --- | --- | --- |
| (A) TRAINING SET | | | | |
| GOOD | 71 (68%) | 33 (32%) | 74 (69%) | 34 (31%) |
| BAD | 8 (27%) | 22 (73%) | 5 (19%) | 21 (81%) |
| (B) TESTING SET | | | | |
| GOOD | 9 (53%) | 8 (47%) | 11 (55%) | 9 (45%) |
| BAD | 2 (25%) | 6 (75%) | 0 (0%) | 5 (100%) |
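The error rates quoted in the text follow directly from the "all events" counts in Table 3, as the short calculation below shows. The function and key names are illustrative.

```python
def summarise(good_row, bad_row):
    """Summarise a 2x2 table given as (predicted good, predicted bad) per true class."""
    (good_as_good, good_as_bad), (bad_as_good, bad_as_bad) = good_row, bad_row
    total = good_as_good + good_as_bad + bad_as_good + bad_as_bad
    return {
        "error_rate": (good_as_bad + bad_as_good) / total,
        "good_correct": good_as_good / (good_as_good + good_as_bad),
        "bad_correct": bad_as_bad / (bad_as_good + bad_as_bad),
    }


# "All events" columns of Table 3.
training = summarise((71, 33), (8, 22))
testing = summarise((9, 8), (2, 6))

for name, result in (("training", training), ("testing", testing)):
    print(name, {key: round(value, 2) for key, value in result.items()})
```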
FIG. 2A and 2B show the expression pattern of the 36-gene set and the separation between the good and bad prognosis groups in both the training and testing set. The association between gene expression and prognosis is clearly visible. FIG. 2A provides a pseudocolour plot of the 36 predictive genes with their accession numbers on the 134 patients in the training set. Bright red indicates a high value of gene expression, and bright green indicates a low value. The list of genes is given in Table 4, in the same order they appear on the plot. The bar on the right-hand side shows the 5-year status, with a black line indicating a patient with bad prognosis. FIG. 2B provides a similar colour plot of the 36 predictive genes on the 25 patients in the testing set. The genes are in the same order as in FIG. 2A.
The list of the genes is given in Table 4. Among the genes that have higher expression in good prognosis were cyclin-dependent kinase inhibitor 1C, spinal-cord-derived growth factor B, myosin binding protein, homeobox A5, insulin-like growth factor 1 and several uncharacterized genes and ESTs from the U133B chip. The genes associated with poor prognosis were primarily involved in cell metabolism and cell cycle regulation.
TABLE 4 - BIOMARKERS SELECTED FOR PREDICTION
[Table 4 is reproduced as images imgf000019_0001 to imgf000022_0001 in the published application.]
Multivariate logistic regression analysis of the prognosis status (Table 5) showed a highly significant predictive value of the bad prognosis score (OR=4.05; 95% CI 1.53 to 10.71) after adjusting for age, stage, grade, ER, and PGR status. Of these clinical variables, only PGR positive status was weakly associated with better prognosis (OR=0.35; 95% CI 0.12 to 0.99). When breast cancer endpoints were considered, the result for the microarray-based prognostic score was more significant than for overall endpoints (OR=9.99; 95% CI 2.99 to 33.37).
TABLE 5 - MULTIVARIATE LOGISTIC REGRESSION OF 5-YEAR DISEASE-FREE STATUS IN RELATION TO BAD GENES PROGNOSIS SCORE AND OTHER CLINICAL VARIABLES
| ALL EVENTS (N=159, NUMBER OF EVENTS=38) | ODDS-RATIO (95% CI) | P-VALUE |
| --- | --- | --- |
| BAD PROGNOSIS SCORE | 4.05 (1.53-10.71) | 0.005 |
| AGE (PER 10 YEARS) | 1.05 (0.76-1.46) | 0.77 |
| STAGE | | |
| STAGE 2 VS 1 | 2.81 (0.54-14.63) | 0.22 |
| STAGE 3 VS 1 | 2.92 (0.53-16.24) | 0.22 |
| GRADE | | |
| GRADE 2 VS 1 | 1.07 (0.33-3.5) | 0.91 |
| GRADE 3 VS 1 | 1.08 (0.41-2.88) | 0.87 |
| ER POSITIVE | 3.13 (0.81-12.06) | 0.10 |
| PGR POSITIVE | 0.35 (0.12-0.99) | 0.05 |
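The odds ratios, confidence intervals and P-values in Table 5 are mutually consistent under the usual Wald approximation, which can be checked as sketched below. The assumption is that each 95% CI is symmetric on the log-odds scale with a critical value of 1.96; the function name is illustrative, and this is a consistency check rather than a reconstruction of the original model fit.

```python
from math import erfc, log, sqrt


def wald_p_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Two-sided Wald P-value recovered from an odds ratio and its 95% CI."""
    se = (log(ci_high) - log(ci_low)) / (2 * z_crit)  # SE of the log odds ratio
    z = log(odds_ratio) / se
    return erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))


# Selected rows of Table 5: (odds ratio, lower limit, upper limit).
table5 = {
    "bad prognosis score": (4.05, 1.53, 10.71),
    "ER positive": (3.13, 0.81, 12.06),
    "PGR positive": (0.35, 0.12, 0.99),
}
for name, (odds_ratio, low, high) in table5.items():
    print(f"{name}: p = {wald_p_from_ci(odds_ratio, low, high):.3f}")
```

The recovered values round to the tabulated P-values of 0.005, 0.10 and 0.05.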
In the survival analysis (FIG. 3A and 3B), patients predicted to be in the bad prognosis group using the 36-gene set were significantly worse off compared with the patients predicted to have good prognosis (P-value=0.0001). FIG. 3A provides a comparison of disease-free survival between the groups with good and bad prognosis scores for all patients (P-value=0.0001). This was a cross-validated comparison, in the sense that the bad prognosis score for each patient was an unbiased score, computed by first removing the patient from the training set. FIG. 3B provides a comparison limited to the testing set only; it showed a similar trend, but was not statistically significant (P-value=0.26) due to the small sample size. The multivariate Cox regression analysis of the breast cancer events produced results similar to those of the previous logistic regression analysis. The adjusted HR of the bad prognosis score, after adjusting for the clinical factors, was 6.59 (95% CI 2.54 to 17.12). No other prognostic variable was statistically significant.
References:
1. Parkin D, Muir C, Whelan S, Gao Y-T, Ferlay F, Powell J. Cancer incidence in five continents. Comparability and quality of data. In: Parkin D, Muir C, Whelan
S, Gao Y-T, Ferlay F, Powell J, eds. IARC Scientific Publication. No. 120. Lyon, 1992: 45-173.
2. Nystrom L, Andersson I, Bjurstam N, Frisell J, Nordenskjold B, Rutqvist LE.
Long-term effects of mammography screening: updated overview of the Swedish randomised trials. Lancet 2002;359(9310):909-l 9.
3. EBCTCG. Tamoxifen for early breast cancer: an overview of the randomised trials.
Lancet 1998;351:1451-1467.
4. EBCTCG. Polychemotherapy for early breast cancer: an overview of the randomised trials. Lancet 1998;352:930-942. 5. Bergh J. Where next with stem-cell-supported high-dose therapy for breast cancer? Lancet 2000;355(9208):944-5.
6. Goldhirsch A, Glick JH, Gelber RD, Coates AS, Senn HJ. Meeting highlights:
International Consensus Panel on the Treatment of Primary Breast Cancer. Seventh International Conference on Adjuvant Therapy of Primary Breast Cancer. JClin Oncol 2001;19(18):3817-27.
7. Eifel P, Axelson JA, Costa J, et al. National Institutes of Health Consensus
Development Conference Statement: adjuvant therapy for breast cancer, November 1-3, 2000. JNatl Cancer Inst 2001;93(13):979-89.
8. Winer E, Morrow M, Osborne C, Harris J. Malignant tumors of the breast. In: De Vita V, Hellman S, Rosenberg S, eds. Cancer. Principles & practice of oncology. Philadelphia: Lippincott Williams & Wilkins, 2001: 1651-1726.
9. Slamon DJ, Leyland- Jones B, Shak S, et al. Use of chemotherapy plus a monoclonal antibody against HER2 for metastatic breast cancer that overexpresses HER2. NErtg/J e< 2001;344(l l):783-92. 10. Perou CM, Jeffrey SS, van de Rijn M, et al. Distinctive gene expression patterns in human mammary epithelial cells and breast cancers. Proc Natl Acad Sci US
A 1999;96(16):9212-7. 1 1. Perou C, Sόrlie T, Eisen M, et al. Molecular portraits of human breast tumours.
Nature 2000;406:747-752. 12. Ahr A, Holtrich U, Solbach C, et al. Molecular classification of breast cancer patients by gene expression profiling. J athol 2001 ;195(3):312-20. 13. Hedenfalk I, Duggan D, Chen Y, et al. Gene-expression profiles in hereditary breast cancer. NEng/J ? 2001 ;344(8):539-48. 14. Sorlie T, Perou CM, Tibshirani R, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci USA 2001;98( 19): 10869-74.
15. van 't Veer LJ, Dai H, van de Vijver MJ, et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature 2002;415(6871):530-6.
16. van de Vijver MJ, He YD, van 't Veer LJ, et al. A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med 2002;347(25):1999-2009.
17. Elston C, Ellis I. Pathological prognostic factors in breast cancer. I. The value of histological grade in breast cancer: experience from a large study with long-term follow-up. Histopathology 1991;19:403-410.
18. Bergh J, Wiklund T, Erikstein B, et al. Tailored fluorouracil, epirubicin, and cyclophosphamide compared with marrow-supported high-dose chemotherapy as adjuvant treatment for high-risk breast cancer: a randomised trial. Scandinavian Breast Group 9401 study. Lancet 2000;356(9239):1384-91.
19. Dudoit S, Fridlyand J. A prediction-based resampling method for estimating the number of clusters in a dataset. Genome Biol 2002;3(7):RESEARCH0036.
20. Golub TR, Slonim DK, Tamayo P, et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 1999;286(5439):531-7.
21. Ramaswamy S, Ross KN, Lander ES, Golub TR. A molecular signature of metastasis in primary solid tumors. Nat Genet 2003;33(1):49-54.
22. Duan Z, Lamendola DE, Yusuf RZ, Penson RT, Preffer FI, Seiden MV. Overexpression of human phosphoglycerate kinase 1 (PGK1) induces a multidrug resistance phenotype. Anticancer Res 2002;22(4):1933-41.

Claims

CLAIMS: What is claimed is:
1. A method of identifying a mammal at increased risk for developing breast cancer, comprising the steps of:
(a) obtaining a biological sample from the mammal;
(b) measuring in said biological sample the level of at least one biomarker selected from the biomarkers of Table 4;
(c) correlating said level of at least one biomarker with a baseline level; and
(d) identifying a mammal at increased risk for developing breast cancer based on said correlation.
2. The method of claim 1 wherein said biological sample is breast tissue.
3. The method of claim 1 wherein said baseline level is measured from a normal, breast cancer-free biological sample.
4. The method of claim 3 wherein said normal, breast cancer-free biological sample is normal breast tissue.
5. The method of claim 3 wherein: said biological sample is breast tissue; said normal, breast cancer-free biological sample is normal breast tissue; and said biological sample and said normal, breast cancer-free biological sample are from the same mammal.
6. A method for prognosing breast cancer in a mammal having breast cancer, comprising the steps of:
(a) obtaining a biological sample from the mammal;
(b) measuring in said biological sample the level of at least one biomarker selected from the biomarkers of Table 4;
(c) correlating said level of at least one biomarker with a baseline level; and
(d) prognosing breast cancer in said mammal based on said correlation.
7. A method for identifying breast cancer in a mammal, comprising the steps of:
(a) obtaining a biological sample from the mammal;
(b) measuring in said biological sample the level of at least one biomarker selected from the biomarkers of Table 4;
(c) correlating said level of at least one biomarker with a baseline level; and
(d) identifying breast cancer in said mammal based on said correlation.
(d) identifying breast cancer in said mammal based on said correlation.
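
The measuring and correlating steps recited in claims 1, 6 and 7 can be illustrated with a minimal sketch. The claims do not specify how a measured level is compared with the baseline level, so the log2 fold-change comparison, the cutoff of 1.0, the function names, and the biomarker names below are all hypothetical assumptions made for illustration only. They are not taken from the application and do not represent a validated clinical procedure.

```python
from math import log2


def log2_fold_change(sample_level: float, baseline_level: float) -> float:
    """Log2 ratio of a measured level to its baseline level (both must be positive)."""
    if sample_level <= 0 or baseline_level <= 0:
        raise ValueError("expression levels must be positive")
    return log2(sample_level / baseline_level)


def flag_deviating_biomarkers(sample, baseline, min_abs_log2_fc=1.0):
    """Return {biomarker: log2 fold change} for levels that deviate from baseline.

    `sample` and `baseline` map biomarker names to measured levels.
    `min_abs_log2_fc` is an assumed, arbitrary cutoff (1.0 means a two-fold change).
    Biomarkers without a baseline value are skipped.
    """
    flagged = {}
    for name, level in sample.items():
        if name not in baseline:
            continue
        lfc = log2_fold_change(level, baseline[name])
        if abs(lfc) >= min_abs_log2_fc:
            flagged[name] = lfc
    return flagged


# Hypothetical placeholder names; real entries would come from Table 4.
sample_levels = {"GENE_A": 8.0, "GENE_B": 5.0}
baseline_levels = {"GENE_A": 2.0, "GENE_B": 4.0}
flagged = flag_deviating_biomarkers(sample_levels, baseline_levels)
```

In this sketch only `GENE_A` is flagged, because its level is four times the baseline while `GENE_B` differs by less than two-fold. Any real use would need a baseline defined as in claims 3 to 5 and a cutoff justified by data.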
PCT/US2004/013076 2003-04-28 2004-04-28 Prognostic breast cancer biomarkers Ceased WO2004097030A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US46608403P 2003-04-28 2003-04-28
US60/466,084 2003-04-28

Publications (2)

Publication Number Publication Date
WO2004097030A2 true WO2004097030A2 (en) 2004-11-11
WO2004097030A3 WO2004097030A3 (en) 2005-03-24

Family

ID=33418338

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2004/013076 Ceased WO2004097030A2 (en) 2003-04-28 2004-04-28 Prognostic breast cancer biomarkers

Country Status (1)

Country Link
WO (1) WO2004097030A2 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009138130A1 (en) * 2008-05-16 2009-11-19 Atlas Antibodies Ab Breast cancer prognostics
US8030014B2 (en) 2005-12-14 2011-10-04 Jcl Bioassay Corporation Detecting agent and therapeutic agent for highly malignant breast cancer
WO2015004248A3 (en) * 2013-07-12 2015-03-19 B.R.A.H.M.S Gmbh Augurin immunoassay
US9089556B2 (en) 2000-08-03 2015-07-28 The Regents Of The University Of Michigan Method for treating cancer using an antibody that inhibits notch4 signaling

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CHARPENTIER ET AL.: 'Effects of estrogen on global gene expression: identification of novel targets of estrogen action' CANCER RESEARCH vol. 60, 01 November 2000, pages 5977 - 5983 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9089556B2 (en) 2000-08-03 2015-07-28 The Regents Of The University Of Michigan Method for treating cancer using an antibody that inhibits notch4 signaling
US8030014B2 (en) 2005-12-14 2011-10-04 Jcl Bioassay Corporation Detecting agent and therapeutic agent for highly malignant breast cancer
WO2009138130A1 (en) * 2008-05-16 2009-11-19 Atlas Antibodies Ab Breast cancer prognostics
US8945832B2 (en) 2008-05-16 2015-02-03 Atlas Antibodies Ab Treatment prediction involving HMGCR
WO2015004248A3 (en) * 2013-07-12 2015-03-19 B.R.A.H.M.S Gmbh Augurin immunoassay
CN105474016A (en) * 2013-07-12 2016-04-06 勃拉姆斯有限公司 Augurin immunoassay
US10024872B2 (en) 2013-07-12 2018-07-17 B.R.A.H.M.S Gmbh Augurin immunoassay

Also Published As

Publication number Publication date
WO2004097030A3 (en) 2005-03-24

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: The EPO has been informed by WIPO that EP was designated in this application
122 EP: PCT application non-entry in European phase