
WO2010058393A2 - Compositions and methods for the prognosis of colon cancer - Google Patents

Compositions and methods for the prognosis of colon cancer

Info

Publication number
WO2010058393A2
Authority
WO
WIPO (PCT)
Prior art keywords
seq
expression
mir
patients
nucleic acid
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/IL2009/001075
Other languages
French (fr)
Other versions
WO2010058393A3 (en)
Inventor
Gila Lithwick Yanai
Baruch Brenner
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mor Research Applications Ltd
Rosetta Genomics Ltd
Original Assignee
Mor Research Applications Ltd
Rosetta Genomics Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mor Research Applications Ltd and Rosetta Genomics Ltd
Publication of WO2010058393A2
Publication of WO2010058393A3
Legal status: Ceased

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/112 Disease subtyping, staging or classification
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/118 Prognosis of disease development
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/178 Oligonucleotides characterized by their use miRNA, siRNA or ncRNA

Definitions

  • the present invention relates to compositions and methods for determining the prognosis of colon cancer. Specifically, the invention relates to microRNA molecules associated with the prognosis of colon cancer, as well as various nucleic acid molecules relating thereto or derived therefrom.
  • miRNAs have emerged as an important novel class of regulatory RNAs that has a profound impact on a wide array of biological processes. These small (typically 18-24 nucleotides long) non-coding RNA molecules can modulate protein expression patterns by promoting RNA degradation, inhibiting mRNA translation, and also affecting gene transcription. miRs play pivotal roles in diverse processes such as development and differentiation, control of cell proliferation, stress response and metabolism. The expression of many miRs was found to be altered in numerous types of human cancer, and in some cases strong evidence has been put forward in support of the conjecture that such alterations may play a causative role in tumor progression. There are currently about 900 known human miRs.
  • Colon cancer is the second most frequently diagnosed malignancy in the United States, as well as the second most common cause of cancer death. Approximately
  • the expression profile of nucleic acid sequences and variants thereof (SEQ ID NOS: 1-66 and 71) in biological samples obtained from colon cancer patients is indicative of cancer prognosis: the life expectancy of the patient, the risk of recurrence, the expected disease-free survival and response to treatment.
  • a method for determining a prognosis for colon cancer in a subject comprising:
  • said altered expression level is a change in a score based on a combination of expression levels of said nucleic acid sequences.
  • said nucleic acid sequence is selected from the group consisting of SEQ ID NOS: 2, 5-10, 13, 15-17, 19, 22, 26-32, 35, 36, 38-41, 43,
  • said nucleic acid sequence is selected from the group consisting of SEQ ID NOS: 1, 3, 4, 11, 12, 14, 18, 20, 21, 23-25, 33, 34, 37,
  • the subject is a human.
  • the colon cancer is colon adenocarcinoma.
  • said prognosis is prediction of colon cancer risk of recurrence.
  • the biological sample obtained from the subject is selected from the group consisting of bodily fluid, a cell line, a tissue sample, a biopsy sample, a needle biopsy sample, a fine needle biopsy (FNA) sample, a surgically removed sample, and a sample obtained by tissue-sampling procedures such as colonoscopy or laparoscopic methods.
  • the tissue is a fresh, frozen, fixed, wax-embedded or formalin-fixed paraffin-embedded (FFPE) tissue.
  • said tissue is a colon tissue.
  • said colon tissue is colon adenocarcinoma.
  • the expression levels are determined by a method selected from the group consisting of nucleic acid hybridization, nucleic acid amplification, and a combination thereof.
  • the nucleic acid hybridization is performed using a solid-phase nucleic acid biochip array or in situ hybridization.
  • the nucleic acid amplification method is quantitative real-time PCR.
  • the PCR method comprises forward and reverse primers.
  • said forward primer comprises a sequence selected from the group consisting of SEQ ID NOS: 88-99 and sequences at least about 80% identical thereto.
  • said reverse primer comprises SEQ ID NO: 100, a fragment thereof and a sequence at least about 80% identical thereto.
  • the real-time PCR method further comprises a probe.
  • the probe comprising a nucleic acid sequence that is complementary to a sequence selected from SEQ ID NO: 1-66 and 71; to a fragment thereof or to a sequence at least about 80% identical thereto.
  • the probe comprises a sequence selected from the group consisting of SEQ ID NOS: 76-87, a fragment thereof and a sequence at least about 80% identical thereto.
  • kits for determining the prognosis of a subject with colon cancer may comprise a probe comprising a nucleic acid sequence that is complementary to a sequence selected from SEQ ID NO: 1-66 and 71; to a fragment thereof or to a sequence at least about 80% identical thereto.
  • the probe comprises a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 76-87, a fragment thereof and a sequence at least about 80% identical thereto.
  • the kit further comprises forward and reverse primers.
  • said forward primer comprises a sequence selected from the group consisting of SEQ ID NOS: 88-99 and sequences at least about 80% identical thereto.
  • said reverse primer comprises SEQ ID NO: 100, a fragment thereof and a sequence at least about 80% identical thereto.
  • the kit comprises reagents for performing in situ hybridization analysis.
  • prognostic for colon cancer comprises providing the forecast or prediction of (prognostic for) any one or more of the following: duration of survival of a patient susceptible to or diagnosed with colon cancer, duration of recurrence-free survival, duration of progression-free survival of a patient susceptible to or diagnosed with colon cancer, response rate in a group of patients susceptible to or diagnosed with colon cancer, and duration of response in a patient or a group of patients susceptible to or diagnosed with colon cancer. In some embodiments, duration of survival is forecast or predicted to be increased. In some embodiments, duration of survival is forecast or predicted to be decreased. In some embodiments, duration of recurrence-free survival is forecast or predicted to be increased.
  • duration of recurrence-free survival is forecast or predicted to be decreased.
  • response rate is forecast or predicted to be increased.
  • response rate is forecast or predicted to be decreased.
  • duration of response is predicted or forecast to be increased.
  • duration of response is predicted or forecast to be decreased.
  • Figure 1A is a scatter plot showing differential expression of miRs (in normalized fluorescence units, as measured by a microarray) in samples obtained from colon cancer patients with good prognosis (no recurrence within 3 years following surgery) and from colon cancer patients with bad prognosis (recurrence within 3 years following surgery), comparing the median values of each miR in all patients in one group with the corresponding median for members of the other group.
  • the y-axis represents patients with bad prognosis (24 patients), and the x-axis represents patients with good prognosis (92 patients).
  • the parallel lines describe a fold change between groups of 1.5 in either direction.
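The median comparison and 1.5-fold boundary described for this scatter plot can be illustrated with a minimal Python sketch. It assumes linear (not log-transformed) normalized signals; the function name, the miR labels and the signal values are hypothetical and are not taken from the application.

```python
from statistics import median


def median_fold_changes(good, bad, threshold=1.5):
    """Flag miRs whose group medians differ by at least `threshold`-fold.

    good / bad: dict mapping a miR name to a list of normalized signals,
    one value per patient in that prognosis group (hypothetical layout).
    Returns {miR name: median(bad) / median(good)} for flagged miRs only.
    """
    flagged = {}
    for mir in good.keys() & bad.keys():
        m_good = median(good[mir])
        m_bad = median(bad[mir])
        if m_good <= 0 or m_bad <= 0:
            continue  # a ratio is not meaningful for non-positive signals
        fold = m_bad / m_good
        # A point outside the two parallel lines differs by >= threshold
        # in either direction (up or down in the bad-prognosis group).
        if fold >= threshold or fold <= 1 / threshold:
            flagged[mir] = fold
    return flagged


# Made-up signals for two placeholder miRs.
example_good = {"miR-A": [1000, 1200, 1100], "miR-B": [500, 520, 480]}
example_bad = {"miR-A": [2000, 2400, 2200], "miR-B": [510, 490, 500]}
flagged = median_fold_changes(example_good, example_bad)
```

In this made-up example only miR-A falls outside the 1.5-fold boundary. The sketch covers the fold-change criterion alone and does not perform the significance testing mentioned for the other figures.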
  • FIG. 1 in tumor samples obtained from colon cancer patients with bad or good prognosis (as defined in Figure 1A). Two boxes are shown. The left box represents the group of patients with bad prognosis, while the right box represents the group of patients with good prognosis. The line in the box indicates the median value. The box contains 50% of the data and the horizontal lines and crosses (outliers) show the full range of signals in this group.
  • Figure 2A is a scatter plot showing differential expression of miRs (in normalized fluorescence units, as measured by a microarray) in samples obtained from stage I colon cancer patients with good prognosis (no recurrence within 3 years following surgery) and from stage I colon cancer patients with bad prognosis (recurrence within 3 years following surgery), comparing the median values of each miR in all patients in one group with the corresponding median for members of the other group.
  • the y-axis represents patients with bad prognosis (8 patients), and the x-axis represents patients with good prognosis (46 patients).
  • the parallel lines describe a fold change between groups of 1.5 in either direction.
  • miRs hsa-miR-196b (SEQ ID NO: 3), hsa-miR-125b (SEQ ID NO: 4), hsa-miR-23a (SEQ ID NO: 5), hsa-miR-24 (SEQ ID NO: 6), MID-00291 (SEQ ID NO: 7), MID-00466 (SEQ ID NO: 8), MID-00595 (SEQ ID NO: 9) and hsa-miR-21 (SEQ ID NO: 10), are marked with circles. Gray crosses represent untested control probes or median signal < 300 in both groups.
  • Figure 3A is a scatter plot showing differential expression of miRs (in normalized fluorescence units, as measured by a microarray) in samples obtained from stage II colon cancer patients with good prognosis (no recurrence within 3 years following surgery) and from stage II colon cancer patients with bad prognosis (recurrence within 3 years following surgery), comparing the median values of each miR in all patients in one group with the corresponding median for members of the other group.
  • the y-axis represents patients with bad prognosis (16 patients), and the x-axis represents patients with good prognosis (46 patients).
  • the parallel lines describe a fold change between groups of 1.5 in either direction.
  • the statistically significant miR (p-value < 0.05 and adjusting for FDR) is marked with a circle: hsa-miR-29a (SEQ ID NO: 1). Gray crosses represent untested control probes or median signal < 300 in both groups.
  • Figure 4B is a Kaplan-Meier plot for persistence of disease-free status (survival) of colon cancer patients (stage I) grouped by expression levels of hsa-miR-29c* (SEQ ID NO: 12) (logrank p-value = 0.048).
  • Figure 4D is a Kaplan-Meier plot for persistence of disease-free status (survival) of colon cancer patients (stage I) grouped by expression levels of hsa-miR-196b (SEQ ID NO: 1
  • Figure 5A is a Kaplan-Meier plot for persistence of disease-free status (survival) of colon cancer patients (stage II) grouped by expression levels of hsa-miR-29a (SEQ ID NO: 1
  • Figure 5D is a Kaplan-Meier plot for persistence of disease-free status (survival) of colon cancer patients (stage II) grouped by expression levels of hsa-miR-527 (SEQ ID NO: 1
  • Figure 10 is a scatter plot showing differential expression of miRs (in normalized fluorescence units, as measured by a microarray) in samples obtained from stage I colon cancer patients with good prognosis (no recurrence within 3 years following surgery) and from stage I colon cancer patients with bad prognosis (recurrence within 3 years following surgery), comparing the median values of each miR in all patients in one group with the corresponding median for members of the other group.
  • the y-axis represents patients with bad prognosis (8 patients), and the x-axis represents patients with good prognosis (46 patients).
  • the parallel lines describe a fold change between groups of 1.5 in either direction.
  • Statistically significant miRs are marked with circles: hsa-miR-196b (SEQ ID NO: 3), hsa-miR-125b (SEQ ID NO: 4), hsa-let-7b (SEQ ID NO: 66) and hsa-miR-21 (SEQ ID NO: 10). Gray crosses represent untested control probes or median signal < 300 in both groups.
  • Figure 11A is a scatter plot showing differential expression of miRs (in normalized fluorescence units, as measured by a microarray) in samples obtained from stage I colon cancer patients with good prognosis (no recurrence within 3 years following surgery) and from stage I colon cancer patients with bad prognosis (recurrence within 3 years following surgery). Median values of each miR were compared in all patients in one group with the corresponding median for members of the other group.
  • the y-axis represents patients with bad prognosis (8 patients), and the x-axis represents patients with good prognosis (7 patients).
  • the parallel lines describe a fold change between groups of 1.5 in either direction.
  • the statistically significant miR hsa-miR-196b (SEQ ID NO: 3) (p-value < 0.05), is marked with a circle.
  • Differentially expressed miRs are marked with squares: hsa-miR-126 (SEQ ID NO: 2), hsa-miR-125b (SEQ ID NO: 4), hsa-let-7b (SEQ ID NO: 66) and hsa-miR-21 (SEQ ID NO: 10).
  • Gray crosses represent untested control probes or median signal < 300 in both groups.
  • Figure 11B is a scatter plot showing differential expression of the same set of miRs as described in Figure 11A (in normalized fluorescence units, as measured by
  • hsa-miR-126 SEQ ID NO: 2
  • hsa-miR-125b SEQ ID NO: 4
  • hsa-let-7b SEQ ID NO: 66
  • hsa-miR-21 SEQ ID NO: 10
  • Gray crosses represent untested control probes or median signal < 300 in both groups.
  • Figures 12A-E show boxplot representations comparing distributions of the expression of the statistically significant miRs, as measured by microarray, in tumor samples obtained from stage I colon cancer patients with bad or good prognosis (as defined in Figure 3A). For each miR two boxes are shown.
  • the left box represents the group of patients with good prognosis, while the right box represents the group of patients with bad prognosis.
  • the line in the box indicates the median value.
  • the box contains 50% of the data and the horizontal lines and crosses (outliers) show the full range of signals in this group.
  • Figures 13A-E show boxplot representations comparing distributions of the expression of the same statistically significant miRs as shown in Figures 12A-E, as measured by PCR, in tumor samples obtained from stage I colon cancer patients with bad or good prognosis (as defined in Figure 3A). For each miR two boxes are shown. The left box represents the group of patients with good prognosis, while the right box represents the group of patients with bad prognosis. The line in the box indicates the median value. The box contains 50% of the data and the horizontal lines and crosses (outliers) show the full range of signals in this group.
  • Figure 14A is a Kaplan-Meier plot for persistence of disease-free status (survival) of colon cancer patients (stage I) grouped by expression levels of hsa-miR-196b (SEQ ID).
  • Figure 15A is a scatter plot showing differential expression of miRs (in normalized fluorescence units, as measured by a microarray) in samples obtained from stage II colon cancer patients with good prognosis (no recurrence within 3 years following surgery) and from stage II colon cancer patients with bad prognosis (recurrence within 3 years following surgery). Median values of each miR were compared in all patients in one group with the corresponding median for members of the other group.
  • the y-axis represents patients with bad prognosis (7 patients), and the x-axis represents patients with good prognosis (8 patients).
  • the parallel lines describe a fold change between groups of 1.5 in either direction.
  • the statistically significant miR, hsa-miR-29a (SEQ ID NO: 1) (p-value < 0.05), is marked with a circle. Differentially expressed miRs are marked with squares: hsa-miR-155 (SEQ ID NO: 14), hsa-miR-29b (SEQ ID NO: 54), and hsa-let-7b (SEQ ID NO: 66). Gray crosses represent untested control probes or median signal < 300 in both groups.
  • Figure 15B is a scatter plot showing differential expression of the same set of miRs as described in Figure 15A (in normalized fluorescence units, as measured by PCR) in samples obtained from stage II colon cancer patients with good prognosis (no recurrence within 3 years following surgery) versus samples obtained from stage II colon cancer patients with bad prognosis (recurrence within 3 years following surgery). Median values of each miR were compared in all patients in one group with the corresponding median for members of the other group.
  • the y-axis represents stage II patients with bad prognosis (7 patients), and the x-axis represents stage II patients with good prognosis (8 patients).
  • the parallel lines describe a fold change between groups of 1.5 in either direction.
  • hsa-miR-29a SEQ ID NO: 1
  • hsa-let-7b SEQ ID NO: 66
  • Gray crosses represent untested control probes or median signal < 300 in both groups.
  • Figures 16A-D show boxplot representations comparing distributions of the expression of the statistically significant miRs, as measured by microarray, in tumor samples obtained from stage II colon cancer patients with bad or good prognosis (as defined in Figure 3A). For each miR two boxes are shown. The left box represents the group of patients with good prognosis, while the right box represents the group of patients with bad prognosis. The line in the box indicates the median value. The box contains 50% of the data and the horizontal lines and crosses (outliers) show the full range of signals in this group.
  • Figures 17A-D show boxplot representations comparing distributions of the expression of the same statistically significant miRs as shown in Figures 16A-D, as measured by PCR, in tumor samples obtained from stage II colon cancer patients with bad or good prognosis (as defined in Figure 3A). For each miR two boxes are shown. The left box includes the group of patients with good prognosis, while the right box includes the group of patients with bad prognosis. The line in the box indicates the median value. The box contains 50% of the data and the horizontal lines and crosses (outliers) show the full range of signals in this group.
  • miRNA expression can serve as a novel tool for determining the prognosis of colon cancer. More particularly, it may serve for determining the risk of recurrence of said cancer.
  • each intervening number there between with the same degree of precision is explicitly contemplated.
  • the numbers 7 and 8 are contemplated in addition to 6 and 9, and for the range 6.0-7.0, the numbers 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9, and 7.0 are explicitly contemplated.
  • antisense refers to nucleotide sequences which are complementary to a specific DNA or RNA sequence.
  • antisense strand is used in reference to a nucleic acid strand that is complementary to the "sense" strand.
  • Antisense molecules may be produced by any method, including synthesis by ligating the gene(s) of interest in a reverse orientation to a viral promoter which permits the synthesis of a complementary strand. Once introduced into a cell, this transcribed strand combines with natural sequences produced by the cell to form duplexes. These duplexes then block either the further transcription or translation. In this manner, mutant phenotypes may be generated.
  • “Attached” or “immobilized”, as used herein to refer to a probe and a solid support, may mean that the binding between the probe and the solid support is sufficient to be stable under conditions of binding, washing, analysis, and removal.
  • the binding may be covalent or non-covalent. Covalent bonds may be formed directly between the probe and the solid support or may be formed by a cross linker or by inclusion of a specific reactive group on either the solid support or the probe or both molecules.
  • Non-covalent binding may be one or more of electrostatic, hydrophilic, and hydrophobic interactions. Included in non-covalent binding is the covalent attachment of a molecule, such as streptavidin, to the support and the non-covalent binding of a biotinylated probe to the streptavidin. Immobilization may also involve a combination of covalent and non-covalent interactions.
  • Biological sample may mean a sample of biological tissue or fluid that comprises nucleic acids. Such samples include, but are not limited to, tissue isolated from animals. Biological samples may also include sections of tissues such as biopsy and autopsy samples, frozen sections taken for histological purposes, blood, plasma, serum, sputum, stool, tears, mucus, urine, effusions, amniotic fluid, ascitic fluid, hair, and skin. A biological sample may also be provided by fine-needle aspiration (FNA). Biological samples also include explants and primary and/or transformed cell cultures derived from patient tissues.
  • a biological sample may be provided by removing a sample of cells from an animal, but can also be accomplished by using previously isolated cells (e.g., isolated by another person, at another time, and/or for another purpose), or by performing the methods described herein in vivo.
  • Archival tissues, such as those having treatment or outcome history, may also be used.
  • cancer prognosis includes the forecast or prediction of any one or more of the following: duration of survival of a patient susceptible to or diagnosed with a cancer, duration of disease-free survival, duration of progression-free survival of a patient susceptible to or diagnosed with a cancer, response rate in a group of patients susceptible to or diagnosed with a cancer, duration of response in a patient or a group of patients susceptible to or diagnosed with a cancer.
  • prognostic for cancer means providing a forecast or prediction of the probable course or outcome of the cancer.
  • prognostic for cancer comprises providing the forecast or prediction of (prognostic for) any one or more of the following: duration of survival of a patient susceptible to or diagnosed with a cancer, duration of disease-free survival, duration of progression-free survival of a patient susceptible to or diagnosed with a cancer, response rate in a group of patients susceptible to or diagnosed with a cancer, and duration of response in a patient or a group of patients susceptible to or diagnosed with a cancer, wherein any of the above may be increased or decreased.
  • “Complement” or “complementary”, as used herein to refer to a nucleic acid may mean Watson-Crick (e.g., A-T/U and C-G) or Hoogsteen base pairing between nucleotides or nucleotide analogs of nucleic acid molecules.
  • a full complement or fully complementary may mean 100% complementary base pairing between nucleotides or nucleotide analogs of nucleic acid molecules. In some embodiments, the complementary sequence has a reverse orientation (5'-3').
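The notion of a full complement in reverse orientation can be illustrated with a short Python sketch. It covers only Watson-Crick pairing for RNA (A-U, C-G), not Hoogsteen pairing or nucleotide analogs; the function names and the four-base sequence are hypothetical and do not correspond to any SEQ ID NO.

```python
# Watson-Crick partners for RNA bases only (an assumption of this sketch).
RNA_COMPLEMENT = {"A": "U", "U": "A", "C": "G", "G": "C"}


def reverse_complement(rna):
    """Return the Watson-Crick complement of `rna`, read 5'-3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna.upper()))


def is_fully_complementary(strand_a, strand_b):
    """True when every position of the two 5'-3' strands pairs (100%)."""
    return (
        len(strand_a) == len(strand_b)
        and reverse_complement(strand_a) == strand_b.upper()
    )


target = "UAGC"                     # made-up sequence
probe = reverse_complement(target)  # "GCUA"
```

A probe built this way is fully complementary to its target; changing any single base breaks the 100% condition.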
  • Ct signals represent the first cycle of PCR where amplification crosses a threshold (cycle threshold) of fluorescence. Accordingly, low values of Ct represent high abundance or expression levels of the microRNA.
  • the PCR Ct signal is normalized such that the normalized Ct remains inversely related to the expression level. In other embodiments the PCR Ct signal may be normalized and then inverted such that low normalized-inverted Ct represents low abundance or expression levels of the microRNA.
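The two Ct conventions above can be sketched in Python as follows. The text here does not specify the normalization, so this sketch assumes subtraction of the mean Ct of some reference RNAs, and inversion by subtracting from a fixed offset of 50; both choices, and all names and values, are illustrative assumptions.

```python
def normalize_ct(ct, reference_cts):
    """Normalized Ct: raw Ct minus the mean Ct of reference RNAs.

    Assumed normalization scheme. The result is still inversely
    related to abundance: lower values mean more microRNA.
    """
    reference_mean = sum(reference_cts) / len(reference_cts)
    return ct - reference_mean


def invert_ct(normalized_ct, offset=50.0):
    """Normalized-inverted Ct: higher values mean more microRNA.

    The offset of 50 is an arbitrary assumption of this sketch.
    """
    return offset - normalized_ct


references = [20.0, 22.0]                                 # made-up reference Cts
abundant = invert_ct(normalize_ct(25.0, references))     # 46.0
scarce = invert_ct(normalize_ct(30.0, references))       # 41.0
```

After inversion, the sample that crossed the threshold earlier (Ct 25) has the higher value, so the signal runs in the same direction as expression level.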
  • Detection means detecting the presence of a component in a sample. Detection also means detecting the absence of a component. Detection also means measuring the level of a component, either quantitatively or qualitatively. Detection also means identifying or diagnosing cancer in a subject. “Early detection” means identifying or diagnosing cancer in a subject at an early stage of the disease, especially before it causes symptoms.
  • differential expression may mean qualitative or quantitative differences in the temporal and/or cellular gene expression patterns within and among cells and tissue.
  • a differentially expressed gene can qualitatively have its expression altered, including an activation or inactivation, in, e.g., normal versus disease tissue.
  • Genes may be turned on or turned off in a particular state, relative to another state thus permitting comparison of two or more states.
  • a qualitatively regulated gene will exhibit an expression pattern within a state or cell type that may be detectable by standard techniques. Some genes will be expressed in one state or cell type, but not in both.
  • the difference in expression may be quantitative, e.g., in that expression is modulated, up-regulated, resulting in an increased amount of transcript, or down-regulated, resulting in a decreased amount of transcript.
  • the degree to which expression differs need only be large enough to quantify via standard characterization techniques such as expression arrays, quantitative reverse transcriptase PCR, northern analysis, and RNase protection.
  • Disease-free survival refers to the length of time after treatment for a specific disease during which a patient survives with no sign of the disease. Disease-free survival may be used in a clinical study or trial to help measure how well a new treatment works. Disease-free survival may also refer to a survival assessment where the end points are either tumor recurrence (e.g., the cancer comes back as a consequence of distant metastasis to other sites in the body) or death.
  • the term “disease-free survival”, as used herein, is defined as the time between diagnosis or surgery to treat a cancer patient and recurrence.
  • Expression profile may mean a genomic expression profile, e.g., an expression profile of microRNAs. Profiles may be generated by any convenient means for determining a level of a nucleic acid sequence e.g. quantitative hybridization of microRNA, labeled microRNA, amplified microRNA, cRNA, etc., quantitative PCR, ELISA for quantification, and the like, and allow the analysis of differential gene expression between two samples.
  • a subject or patient tumor sample e.g., cells or collections thereof, e.g., tissues, is assayed. Samples are collected by any convenient method, as known in the art.
  • Nucleic acid sequences of interest are nucleic acid sequences that are found to be predictive, including the nucleic acid sequences provided above, where the expression profile may include expression data for 5, 10, 20, 25, 50, 100 or more of, including all of the listed nucleic acid sequences.
  • the term “expression profile” may also mean measuring the abundance of the nucleic acid sequences in the measured samples.
  • “Expression ratio”, as used herein, refers to relative expression levels of two or more nucleic acids as determined by detecting the relative expression levels of the corresponding nucleic acids in a biological sample.
  • Fragment is used herein to indicate a non-full length part of a nucleic acid or polypeptide.
  • a fragment is itself also a nucleic acid or polypeptide, respectively.
  • Gene may be a natural (e.g., genomic) or synthetic gene comprising transcriptional and/or translational regulatory sequences and/or a coding region and/or non-translated sequences (e.g., introns, 5'- and 3'-untranslated sequences).
  • the coding region of a gene may be a nucleotide sequence coding for an amino acid sequence or a functional RNA, such as tRNA, rRNA, catalytic RNA, siRNA, miRNA or antisense RNA.
  • a gene may also be a mRNA or cDNA corresponding to the coding regions (e.g., exons and miRNA) optionally comprising 5'- or 3'-untranslated sequences linked thereto.
  • a gene may also be an amplified nucleic acid molecule produced in vitro comprising all or a part of the coding region and/or 5'- or 3'-untranslated sequences linked thereto.
  • “Groove binder” and “minor groove binder” may be used interchangeably and refer to small molecules that fit into the minor groove of double-stranded DNA, typically in a sequence-specific manner.
  • Minor groove binders may be long, flat molecules that can adopt a crescent-like shape and thus, fit snugly into the minor groove of a double helix, often displacing water.
  • Minor groove binding molecules may typically comprise several aromatic rings connected by bonds with torsional freedom such as furan, benzene, or pyrrole rings.
  • Minor groove binders may be antibiotics such as netropsin, distamycin, berenil, pentamidine and other aromatic diamidines, Hoechst 33258, SN 6999, aureolic anti-tumor drugs such as chromomycin and mithramycin, CC-1065, dihydrocyclopyrroloindole tripeptide (DPI3), 1,2-dihydro-(3H)-pyrrolo[3,2-e]indole-7-carboxylate (CDPI3), and related compounds and analogues, including those described in Nucleic Acids in Chemistry and Biology, 2d ed., Blackburn and Gait, eds., Oxford University Press, 1996, and PCT Published Application No.
  • a minor groove binder may be a component of a primer, a probe, a hybridization tag complement, or combinations thereof. Minor groove binders may increase the Tm of the primer or probe to which they are attached, allowing such primers or probes to effectively hybridize at higher temperatures.
  • identity as used herein in the context of two or more nucleic acids or polypeptide sequences, may mean that the sequences have a specified percentage of residues that are the same over a specified region.
  • the percentage may be calculated by optimally aligning the two sequences, comparing the two sequences over the specified region, determining the number of positions at which the identical residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the specified region, and multiplying the result by 100 to yield the percentage of sequence identity.
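  • where the specified region of comparison includes only a single sequence (for example, at staggered ends of the alignment), the residues of that single sequence are included in the denominator but not the numerator of the calculation.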
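The percentage calculation described above can be sketched in a few lines. This is a minimal illustration, assuming the two sequences have already been optimally aligned and padded to equal length; the function name `percent_identity` and the `-` gap character are assumptions made for the example, not part of the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over an already-aligned region.

    Every position of the region counts in the denominator; a position
    counts in the numerator only when both sequences carry the same
    residue. Gap positions (including staggered ends) therefore lower
    the percentage.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("the specified region is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)
```

For example, `percent_identity("ACGU-A", "ACGAUA")` counts four matched positions out of six.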
  • Label may mean a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical means.
  • useful labels include 32 P, fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and other entities which can be made detectable.
  • a label may be incorporated into nucleic acids and proteins at any position.
  • Logistic regression is part of a category of statistical models called generalized linear models. Logistic regression allows one to predict a discrete outcome, such as group membership, from a set of variables that may be continuous, discrete, dichotomous, or a mix of any of these. The dependent or response variable is dichotomous, for example, one of two possible types of cancer. Logistic regression models the natural log of the odds, i.e., the ratio of the probability of belonging to the first group (P) over the probability of belonging to the second group (1-P), as a linear combination of the different expression levels (in log-space) and of other explanatory variables.
  • the logistic regression output can be used as a classifier by prescribing that a case or sample will be classified into the first type if P is greater than 0.5 or 50%.
  • the calculated probability P can be used as a variable in other contexts such as a 1D or 2D threshold classifier.
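A minimal sketch of how the logistic regression output may be turned into a class call is given below. The weights and intercept are placeholders that would come from a fitted model; the function names and the group labels are assumptions for illustration only.

```python
import math


def logistic_probability(log_expression, weights, intercept):
    """Probability P of belonging to the first group.

    The log-odds ln(P / (1 - P)) are modelled as a linear combination of
    the log-space expression levels; P is recovered with the logistic
    function, evaluated in a numerically stable way.
    """
    log_odds = intercept + sum(w * x for w, x in zip(weights, log_expression))
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


def classify(p, cutoff=0.5):
    """Assign the first group only when P is greater than the cutoff."""
    return "first_group" if p > cutoff else "second_group"
```

With log-odds of zero the probability is exactly 0.5, which by the rule above is not greater than the cutoff and so falls into the second group.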
  • 1D/2D threshold classifier may mean an algorithm for classifying a case or sample such as a cancer sample into one of two possible types such as two types of cancer or two types of prognosis (e.g., good and bad).
  • the decision is based on one variable and one predetermined threshold value; the sample is assigned to one class if the variable exceeds the threshold and to the other class if the variable is less than the threshold.
  • a 2D threshold classifier is an algorithm for classifying into one of two types based on the values of two variables.
  • a score may be calculated as a function (usually a continuous function) of the two variables; the decision is then reached by comparing the score to the predetermined threshold, similar to the 1D threshold classifier.
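The two threshold classifiers can be sketched as follows. The class labels, the default score function (a simple difference of the two variables) and the handling of a value exactly equal to the threshold are assumptions for illustration; the text specifies only the "exceeds" and "less than" cases.

```python
def classify_1d(value, threshold, above="class_1", below="class_2"):
    """1D threshold classifier: one variable, one predetermined threshold.

    Assumption: a value exactly equal to the threshold goes to `below`.
    """
    return above if value > threshold else below


def classify_2d(x, y, threshold, score=lambda a, b: a - b,
                above="class_1", below="class_2"):
    """2D threshold classifier: reduce two variables to a single score,
    then apply the same comparison as the 1D classifier."""
    return classify_1d(score(x, y), threshold, above, below)
```

Any continuous function of the two variables can be passed as `score`, for example a weighted sum of two log-expression levels.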
  • NPV Negative predictive value
  • Nucleic acid or “oligonucleotide” or “polynucleotide”, as used herein, may mean at least two nucleotides covalently linked together.
  • the depiction of a single strand also defines the sequence of the complementary strand.
  • a nucleic acid also encompasses the complementary strand of a depicted single strand.
  • Many variants of a nucleic acid may be used for the same purpose as a given nucleic acid.
  • a nucleic acid also encompasses substantially identical nucleic acids and complements thereof.
  • a single strand provides a probe that may hybridize to a target sequence under stringent hybridization conditions.
  • a nucleic acid also encompasses a probe that hybridizes under stringent hybridization conditions.
  • Nucleic acids may be single-stranded or double-stranded, or may contain portions of both double-stranded and single-stranded sequence.
  • the nucleic acid may be DNA, both genomic and cDNA, RNA, or a hybrid, where the nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and combinations of bases including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine and isoguanine.
  • Nucleic acids may be obtained by chemical synthesis methods or by recombinant methods.
  • a nucleic acid will generally contain phosphodiester bonds, although nucleic acid analogs may be included that may have at least one different linkage, e.g., phosphoramidate, phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite linkages and peptide nucleic acid backbones and linkages.
  • Other analog nucleic acids include those with positive backbones; non-ionic backbones, and non-ribose backbones, including those described in U.S. Pat. Nos. 5,235,033 and 5,034,506, which are incorporated herein by reference.
  • Nucleic acids containing one or more non-naturally occurring or modified nucleotides are also included within one definition of nucleic acids.
  • the modified nucleotide analog may be located for example at the 5'-end and/or the 3'-end of the nucleic acid molecule.
  • Representative examples of nucleotide analogs may be selected from sugar- or backbone-modified ribonucleotides.
  • nucleobase-modified ribonucleotides, i.e., ribonucleotides containing a non-naturally occurring nucleobase instead of a naturally occurring nucleobase, are also suitable, such as uridine or cytidine modified at the 5-position, e.g., 5-(2-amino)propyl uridine, 5-bromo uridine; adenosine and guanosine modified at the 8-position, e.g., 8-bromo guanosine; deaza nucleotides, e.g., 7-deaza-adenosine; and O- and N-alkylated nucleotides, e.g., N6-methyl adenosine.
  • the 2'-OH-group may be replaced by a group selected from H, OR, R, halo, SH, SR, NH2, NHR, NR2 or CN, wherein R is C1-C6 alkyl, alkenyl or alkynyl and halo is F, Cl, Br or I.
  • Modified nucleotides also include nucleotides conjugated with cholesterol through, e.g., a hydroxyprolinol linkage as described in Krutzfeldt et al, Nature 438:685-689 (2005), Soutschek et al, Nature 432:173-178 (2004), and U.S. Patent Publication No. 20050107325, which are incorporated herein by reference.
  • Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to increase the stability and half-life of such molecules in physiological environments, to enhance diffusion across cell membranes, or as probes on a biochip.
  • the backbone modification may also enhance resistance to degradation, such as in the harsh endocytic environment of cells.
  • the backbone modification may also reduce nucleic acid clearance by hepatocytes, such as in the liver and kidney. Mixtures of naturally occurring nucleic acids and analogs may be made; alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring nucleic acids and analogs may be made.
  • PPV Positive predictive value
  • PPV may mean a statistical measure of how well a binary classification test correctly identifies a condition, for example the probability of a patient to have specific condition, given a positive diagnosis.
  • the PPV for class A is the proportion of cases that are correctly diagnosed as belonging to class "A” by the test out of the cases that are diagnosed as belonging to class "A”, as determined by some absolute or gold standard.
  • Probe may mean an oligonucleotide capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation. Probes may bind target sequences lacking complete complementarity with the probe sequence depending upon the stringency of the hybridization conditions. There may be any number of base pair mismatches which will interfere with hybridization between the target sequence and the single stranded nucleic acids described herein. However, if the number of mutations is so great that no hybridization can occur under even the least stringent of hybridization conditions, the sequence is not a complementary target sequence.
  • a probe may be single stranded or partially single and partially double stranded. The strandedness of the probe is dictated by the structure, composition, and properties of the target sequence. Probes may be directly labeled or indirectly labeled such as with biotin to which a streptavidin complex may later bind.
  • the term "reference expression profile” means a value that statistically correlates to a particular outcome when compared to an assay result. In some embodiments, the reference value is determined from statistical analysis of studies that compare microRNA expression with known clinical outcomes. In some embodiments, the reference value may be based on the abundance of the nucleic acids. In some embodiments, the reference value may be based on a threshold score value or a cutoff score value. Typically, a reference value will be a threshold above which one outcome is more probable and below which an alternative outcome is more probable. As used herein, the phrase “reference expression profile” refers to a criterion expression profile to which measured values are compared in order to determine the prognosis of a subject with colon cancer.
  • the term "reference value” means a value that statistically correlates to a particular outcome when compared to an assay result.
  • the reference value is determined from statistical analysis of studies that compare microRNA expression with known clinical outcomes.
  • the reference value may be a threshold score value or a cutoff score value. Typically a reference value will be a threshold above which one outcome is more probable and below which an alternative outcome is more probable.
  • “Sensitivity”, as used herein, may mean a statistical measure of how well a binary classification test correctly identifies a condition, for example how frequently it correctly classifies a cancer into the correct type out of two possible types.
  • the sensitivity for class A is the proportion of cases that are determined to belong to class "A” by the test out of the cases that are in class "A”, as determined by some absolute or gold standard.
  • Specificity may mean a statistical measure of how well a binary classification test correctly identifies a condition, for example how frequently it correctly classifies a cancer into the correct type out of two possible types.
  • the specificity for class A is the proportion of cases that are determined to belong to class "not A” by the test out of the cases that are in class "not A”, as determined by some absolute or gold standard.
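The four measures defined in this section (sensitivity, specificity, PPV and NPV) can all be derived from the same counts of correct and incorrect calls. The sketch below assumes that the gold-standard labels and the test calls are supplied as two equal-length lists; the function name and the dictionary keys are illustrative.

```python
def binary_metrics(truth, predicted, positive="A"):
    """Sensitivity, specificity, PPV and NPV for class `positive`.

    `truth` holds the gold-standard labels and `predicted` the test calls.
    A measure is returned as None when its denominator is zero.
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have equal length")
    tp = fp = tn = fn = 0
    for t, p in zip(truth, predicted):
        if p == positive:
            if t == positive:
                tp += 1  # called A, truly A
            else:
                fp += 1  # called A, truly not A
        elif t == positive:
            fn += 1      # called not A, truly A
        else:
            tn += 1      # called not A, truly not A

    def ratio(num, den):
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }
```

Sensitivity and specificity are conditioned on the true class, whereas PPV and NPV are conditioned on the test call, which is why the latter two depend on how common class A is in the tested population.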
  • stage of cancer refers to a numerical measurement of the level of advancement of a cancer. Criteria used to determine the stage of a cancer include, but are not limited to, the depth of tumor invasion, whether the tumor has spread to other parts of the body and where the cancer has spread (e.g., within the same organ or region of the body or to another organ).
  • stringent hybridization conditions may mean conditions under which a first nucleic acid sequence (e.g., probe) will hybridize to a second nucleic acid sequence (e.g., target), such as in a complex mixture of nucleic acids. Stringent conditions are sequence-dependent and will be different in different circumstances.
  • Stringent conditions may be selected to be about 5-10°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH.
  • the Tm may be the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the probes complementary to the target hybridize to the target sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are occupied at equilibrium).
  • Stringent conditions may be those in which the salt concentration is less than about 1.0 M sodium ion, such as about 0.01-1.0 M sodium ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes (e.g., about 10-50 nucleotides) and at least about 60°C for long probes (e.g., greater than about 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. For selective or specific hybridization, a positive signal may be at least 2 to 10 times background hybridization.
  • Exemplary stringent hybridization conditions include the following: 50% formamide, 5x SSC, and 1% SDS, incubating at 42°C, or, 5x SSC, 1% SDS, incubating at 65°C, with wash in 0.2x SSC, and 0.1% SDS at 65°C.
  • subject refers to a mammal, including both human and other mammals.
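The numeric ranges given above for salt, pH and temperature can be collected into a simple check. This is only a sketch: the text qualifies every figure with "about", so the hard cutoffs below are a simplification, and the function name and the treatment of every probe of 50 nucleotides or fewer as "short" are assumptions.

```python
def meets_stringent_conditions(sodium_molar, ph, temperature_c, probe_length_nt):
    """Rough check of a hybridization set-up against the stated ranges.

    - sodium ion concentration below about 1.0 M
    - pH 7.0 to 8.3
    - at least about 30 C for short probes (up to about 50 nt),
      at least about 60 C for longer probes
    """
    if not sodium_molar < 1.0:
        return False
    if not 7.0 <= ph <= 8.3:
        return False
    minimum_temperature = 30.0 if probe_length_nt <= 50 else 60.0
    return temperature_c >= minimum_temperature
```

The check does not account for destabilizing agents such as formamide, which lower the temperature needed for a given stringency.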
  • the methods of the present invention are preferably applied to human subjects.
  • substantially complementary may mean that a first sequence is at least 60%-99% identical to the complement of a second sequence over a region of 8-50 or more nucleotides, or that the two sequences hybridize under stringent hybridization conditions.
  • substantially identical may mean that a first and second sequence are at least 60%-99% identical over a region of 8-50 or more nucleotides or amino acids, or with respect to nucleic acids, if the first sequence is substantially complementary to the complement of the second sequence.
  • survival rate refers to the percentage of people in a study or treatment group who are alive for a certain period of time after they were diagnosed with or treated for a disease, such as cancer.
  • the overall survival rate is often stated as a five-year survival rate, which is the percentage of people in a study or treatment group who are alive five years after diagnosis or treatment.
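As a small worked example, the five-year survival rate of a group can be computed directly from survival times. The sketch assumes that every person in the group was either followed for at least the full period or died earlier, so no censoring has to be handled; the function name is illustrative.

```python
def survival_rate(years_survived, horizon_years=5.0):
    """Percentage of a group alive `horizon_years` after diagnosis or treatment.

    `years_survived` lists, per person, the time from diagnosis (or
    treatment) to death or to the end of follow-up.
    """
    if not years_survived:
        raise ValueError("the group is empty")
    alive = sum(1 for t in years_survived if t >= horizon_years)
    return 100.0 * alive / len(years_survived)
```

For a group with survival times of 6.0, 2.5, 5.0 and 7.1 years, three of the four people are alive at five years.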
  • therapeutically effective amount refers to a dosage that provides the specific pharmacological response for which the drug is administered in a significant number of subjects in need of such treatment.
  • the “therapeutically effective amount” may vary according to, for example, the physical condition of the patient, the age of the patient and the severity of the disease.
  • tissue sample is tissue obtained from a tissue biopsy using methods well known to those of ordinary skill in the related medical arts.
  • the phrase "suspected of being cancerous" as used herein means a cancer tissue sample believed by one of ordinary skill in the medical arts to contain cancerous cells. Methods for obtaining the sample from the biopsy include gross apportioning of a mass, microdissection, laser-based microdissection, or other art-known cell-separation methods.
  • “Treat” or “treating”, used herein when referring to protection of a subject from a condition, may mean preventing, suppressing, repressing, or eliminating the condition.
  • Preventing the condition involves administering a composition described herein to a subject prior to onset of the condition.
  • Suppressing the condition involves administering the composition to a subject after induction of the condition but before its clinical appearance.
  • Repressing the condition involves administering the composition to a subject after clinical appearance of the condition such that the condition is reduced or prevented from worsening.
  • Elimination of the condition involves administering the composition to a subject after clinical appearance of the condition such that the subject no longer suffers from the condition.
  • Tumor refers to all neoplastic cell growth and proliferation, whether malignant or benign, and all pre-cancerous and cancerous cells and tissues.
  • “Variant”, used herein to refer to a nucleic acid, may mean (i) a portion of a referenced nucleotide sequence; (ii) the complement of a referenced nucleotide sequence or portion thereof; (iii) a nucleic acid that is substantially identical to a referenced nucleic acid or the complement thereof; or (iv) a nucleic acid that hybridizes under stringent conditions to the referenced nucleic acid, complement thereof, or a sequence substantially identical thereto.
  • MicroRNA and its processing
  • a gene coding for a miRNA may be transcribed leading to production of a miRNA precursor known as the pri-miRNA.
  • the pri-miRNA may be part of a polycistronic RNA comprising multiple pri-miRNAs.
  • the pri-miRNA may form a hairpin with a stem and loop.
  • the stem may comprise mismatched bases.
  • the hairpin structure of the pri-miRNA may be recognized by Drosha, which is an RNase III endonuclease. Drosha may recognize terminal loops in the pri-miRNA and cleave approximately two helical turns into the stem to produce a 30-200 nt precursor known as the pre-miRNA. Drosha may cleave the pri-miRNA with a staggered cut typical of RNase III endonucleases yielding a pre-miRNA stem loop with a 5' phosphate and ~2 nucleotide 3' overhang. Approximately one helical turn of stem (~10 nucleotides) extending beyond the Drosha cleavage site may be essential for efficient processing. The pre-miRNA may then be actively transported from the nucleus to the cytoplasm by Ran-GTP and the export receptor Exportin-5.
  • the pre-miRNA may be recognized by Dicer, which is also an RNase III endonuclease. Dicer may recognize the double-stranded stem of the pre-miRNA. Dicer may also recognize the 5' phosphate and 3' overhang at the base of the stem loop. Dicer may cleave off the terminal loop two helical turns away from the base of the stem loop leaving an additional 5' phosphate and ~2 nucleotide 3' overhang. The resulting siRNA-like duplex, which may comprise mismatches, comprises the mature miRNA and a similar-sized fragment known as the miRNA*. The miRNA and miRNA* may be derived from opposing arms of the pri-miRNA and pre-miRNA. MiRNA* sequences may be found in libraries of cloned miRNAs but typically at lower frequency than the miRNAs.
  • RISC RNA-induced silencing complex
  • when the miRNA strand of the miRNA:miRNA* duplex is loaded into the RISC, the miRNA* may be removed and degraded.
  • the strand of the miRNA:miRNA* duplex that is loaded into the RISC may be the strand whose 5' end is less tightly paired. In cases where both ends of the miRNA:miRNA* have roughly equivalent 5' pairing, both miRNA and miRNA* may have gene silencing activity.
  • the RISC may identify target nucleic acids based on high levels of complementarity between the miRNA and the mRNA, especially by nucleotides 2-8 of the miRNA. Only one case has been reported in animals where the interaction between the miRNA and its target was along the entire length of the miRNA. This was shown for miR-196 and Hox B8 and it was further shown that miR-196 mediates the cleavage of the Hox B8 mRNA (Yekta et al, Science 2004; 304:594-596). Otherwise, such interactions are known only in plants (Bartel & Bartel, Plant Physiol 2003; 132:709-717).
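Target recognition through nucleotides 2-8 of the miRNA (often called the seed) can be illustrated with a simple search for perfectly complementary sites in an mRNA. This is a sketch under stated assumptions: it requires exact Watson-Crick pairing across all seven seed positions, which is stricter than the "high levels of complementarity" described above, and the function names and example sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq):
    """Reverse complement of an RNA sequence (T is read as U)."""
    return seq.upper().replace("T", "U").translate(_COMPLEMENT)[::-1]


def seed_match_positions(mirna, mrna):
    """0-based start positions in `mrna` (read 5' to 3') of sites that are
    perfectly complementary to nucleotides 2-8 of `mirna`."""
    seed = mirna.upper().replace("T", "U")[1:8]  # 1-based positions 2..8
    site = reverse_complement_rna(seed)
    mrna = mrna.upper().replace("T", "U")
    return [
        i
        for i in range(len(mrna) - len(site) + 1)
        if mrna[i:i + len(site)] == site
    ]
```

For the toy miRNA `UACGGAUCCAGUUAGCAUGGCA` the seed is `ACGGAUC`, so the site searched for in the mRNA is `GAUCCGU`.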
  • the target sites in the mRNA may be in the 5' UTR, the 3' UTR or in the coding region.
  • multiple miRNAs may regulate the same mRNA target by recognizing the same or multiple sites.
  • the presence of multiple miRNA binding sites in most genetically identified targets may indicate that the cooperative action of multiple RISCs provides the most efficient translational inhibition.
  • miRNAs may direct the RISC to down-regulate gene expression by either of two mechanisms: mRNA cleavage or translational repression.
  • the miRNA may specify cleavage of the mRNA if the mRNA has a certain degree of complementarity to the miRNA. When a miRNA guides cleavage, the cut may be between the nucleotides pairing to residues 10 and 11 of the miRNA. Alternatively, the miRNA may repress translation if the mRNA does not have the requisite degree of complementarity to the miRNA. Translational repression may be more prevalent in animals since animals may have a lower degree of complementarity between the
  • there may be variability in the 5' and 3' ends of any pair of miRNA and miRNA*. This variability may be due to variability in the enzymatic processing of Drosha and Dicer with respect to the site of cleavage. Variability at the 5' and 3' ends of miRNA and miRNA* may also be due to mismatches in the stem structures of the pri-miRNA and pre-miRNA. The mismatches of the stem strands may lead to a population of different hairpin structures. Variability in the stem structures may also lead to variability in the products of cleavage by Drosha and Dicer. c. Nucleic Acids
  • nucleic acids are provided herein.
  • the nucleic acid may comprise the sequence of SEQ ID NOS: 1-100 or variants thereof.
  • the variant may be a complement of the referenced nucleotide sequence.
  • the variant may also be a nucleotide sequence that is substantially identical to the referenced nucleotide sequence or the complement thereof.
  • the variant may also be a nucleotide sequence which hybridizes under stringent conditions to the referenced nucleotide sequence, complements thereof, or nucleotide sequences substantially identical thereto.
  • the nucleic acid may have a length of from 10 to 250 nucleotides.
  • the nucleic acid may have a length of at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 125, 150, 175, 200 or 250 nucleotides.
  • the nucleic acid may be synthesized or expressed in a cell (in vitro or in vivo) using a synthetic gene described herein.
  • the nucleic acid may be synthesized as a single strand molecule and hybridized to a substantially complementary nucleic acid to form a duplex.
  • the nucleic acid may be introduced to a cell, tissue or organ in a single- or double-stranded form or capable of being expressed by a synthetic gene using methods well known to those skilled in the art, including as described in U.S. Patent No. 6,506,559 which is incorporated herein by reference.
  • the microRNA name is the miRBase registry name (release 10).
  • the nucleic acid may further comprise one or more of the following: a peptide, a protein, a RNA-DNA hybrid, an antibody, an antibody fragment, a Fab fragment, and an aptamer.
  • the nucleic acid may also comprise a protamine-antibody fusion protein as described in Song et al. (Nature Biotechnology 2005;23:709-17) and Rossi (Nature Biotechnology 2005;23:682-4), the contents of which are incorporated herein by reference.
  • the protamine may readily interact with the nucleic acid.
  • the protamine may comprise the entire 51-amino acid protamine peptide or a fragment thereof.
  • the protamine may be covalently attached to another protein, which may be a Fab.
  • the Fab may bind to a receptor expressed on a cell surface.
  • Pri-miRNA The nucleic acid may comprise a sequence of a pri-miRNA or a variant thereof.
  • the pri-miRNA sequence may comprise from 45-30,000, 50-25,000, 100-20,000, 1,000-1,500 or 80-100 nucleotides.
  • the sequence of the pri-miRNA may comprise a pre-miRNA, miRNA and miRNA*, as set forth herein, and variants thereof.
  • the sequence of the pri-miRNA may comprise the sequence of SEQ ID NOS: 1-75 or variants thereof.
  • the pri-miRNA may form a hairpin structure.
  • the hairpin may comprise first and second nucleic acid sequences that are substantially complementary.
  • the first and second nucleic acid sequence may be from 37-50 nucleotides.
  • the first and second nucleic acid sequence may be separated by a third sequence of from 8-12 nucleotides.
  • the hairpin structure may have a free energy less than -25 Kcal/mole, as calculated by the Vienna algorithm, with default parameters, as described in Hofacker et al. (Monatshefte f. Chemie 1994;125:167-188), the contents of which are incorporated herein.
  • the hairpin may comprise a terminal loop of 4-20, 8-12 or 10 nucleotides.
  • the pri-miRNA may comprise at least 19% adenosine nucleotides, at least 16% cytosine nucleotides, at least 23% thymine nucleotides and at least 19% guanine nucleotides. iii. Pre-miRNA
  • the nucleic acid may also comprise a sequence of a pre-miRNA or a variant thereof.
  • the pre-miRNA sequence may comprise from 45-200, 60-80 or 60-70 nucleotides.
  • the sequence of the pre-miRNA may comprise a miRNA and a miRNA* as set forth herein.
  • the sequence of the pre-miRNA may also be that of a pri-miRNA excluding from 0-160 nucleotides from the 5' and 3' ends of the pri-miRNA.
  • the sequence of the pre-miRNA may comprise the sequence of SEQ ID NOS: 1-75 or variants thereof. iv. miRNA
  • the nucleic acid may also comprise a sequence of a miRNA (including miRNA*) or a variant thereof.
  • the miRNA sequence may comprise from 13-33, 18-24 or 21-23 nucleotides.
  • the miRNA may also comprise a total of at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40 nucleotides.
  • the sequence of the miRNA may be the first 13-33 nucleotides of the pre-miRNA.
  • the sequence of the miRNA may also be the last 13-33 nucleotides of the pre-miRNA.
  • the sequence of the miRNA may comprise the sequence of SEQ ID NOS: 1-20, 45-54, 66-70 or variants thereof.
  • the nucleic acid may also comprise a sequence of an anti-miRNA that is capable of blocking the activity of a miRNA or miRNA*, such as by binding to the pri-miRNA, pre-miRNA, miRNA or miRNA* (e.g., antisense or RNA silencing), or by binding to the target binding site.
  • the anti-miRNA may comprise a total of 5-100 or 10-60 nucleotides.
  • the anti-miRNA may also comprise a total of at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40 nucleotides.
  • the sequence of the anti-miRNA may comprise (a) at least 5 nucleotides that are substantially identical or complementary to the 5' of a miRNA and at least 5-12 nucleotides that are substantially complementary to the flanking regions of the target site from the 5' end of the miRNA, or (b) at least 5-12 nucleotides that are substantially identical or complementary to the 3' of a miRNA and at least 5 nucleotides that are substantially complementary to the flanking region of the target site from the 3' end of the miRNA.
  • the sequence of the anti-miRNA may comprise the complement of SEQ ID NOS: 1-75, or variants thereof. vi.
  • the nucleic acid may also comprise a sequence of a target miRNA binding site, or a variant thereof.
  • the target site sequence may comprise a total of 5-100 or 10-60 nucleotides.
  • the target site sequence may also comprise a total of at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62 or 63 nucleotides.
  • the target site sequence may comprise at least 5 nucleotides of the complementary sequence of SEQ ID NOS: 1-75.
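The relationship between a miRNA and a candidate target site can be checked by comparing the candidate against the complementary sequence of the miRNA. The sketch below makes two assumptions that go beyond the text: the "at least 5 nucleotides" are read as a contiguous run, and the complementary sequence is taken as the reverse complement written 5' to 3'. The function names and sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq):
    """Reverse complement of an RNA sequence (T is read as U)."""
    return seq.upper().replace("T", "U").translate(_COMPLEMENT)[::-1]


def shares_complement_run(candidate, mirna, min_run=5):
    """True if `candidate` contains at least `min_run` contiguous
    nucleotides of the reverse complement of `mirna`."""
    comp = reverse_complement_rna(mirna)
    cand = candidate.upper().replace("T", "U")
    return any(
        comp[i:i + min_run] in cand
        for i in range(len(comp) - min_run + 1)
    )
```

Raising `min_run` makes the test stricter; setting it to the full miRNA length would demand a perfectly complementary site.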
  • a synthetic gene is also provided comprising a nucleic acid described herein operably linked to a transcriptional and/or translational regulatory sequence.
  • the synthetic gene may be capable of modifying the expression of a target gene with a binding site for a nucleic acid described herein. Expression of the target gene may be modified in a cell, tissue or organ.
  • the synthetic gene may be synthesized or derived from naturally-occurring genes by standard recombinant techniques.
  • the synthetic gene may also comprise terminators at the 3'-end of the transcriptional unit of the synthetic gene sequence.
  • the synthetic gene may also comprise a selectable marker.
  • Probes A probe is also provided comprising a nucleic acid described herein. Probes may be used for screening and diagnostic methods, as outlined below. The probe may be attached or immobilized to a solid substrate, such as a biochip.
  • the probe may have a length of from 8 to 500, 10 to 100 or 20 to 60 nucleotides.
  • the probe may also have a length of at least 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140,
  • the probe may further comprise a linker sequence of from 10-60 nucleotides.
  • the biochip may comprise a solid substrate comprising an attached probe or plurality of probes described herein.
  • the probes may be capable of hybridizing to a target sequence under stringent hybridization conditions.
  • the probes may be attached at spatially defined address on the substrate. More than one probe per target sequence may be used, with either overlapping probes or probes to different sections of a particular target sequence.
  • the probes may be capable of hybridizing to target sequences associated with a single disorder appreciated by those in the art.
  • the probes may either be synthesized first, with subsequent attachment to the biochip, or may be directly synthesized on the biochip.
  • the solid substrate may be a material that may be modified to contain discrete individual sites appropriate for the attachment or association of the probes and is amenable to at least one detection method.
  • substrates include glass and modified or functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon, etc.), polysaccharides, nylon or nitrocellulose, resins, silica or silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses and plastics.
  • the substrates may allow optical detection without appreciably fluorescing.
  • the substrate may be planar, although other configurations of substrates may be used as well. For example, probes may be placed on the inside surface of a tube, for flow-through sample analysis to minimize sample volume.
  • the substrate may be flexible, such as a flexible foam, including closed cell foams made of particular plastics.
  • the biochip and the probe may be derivatized with chemical functional groups for subsequent attachment of the two.
  • the biochip may be derivatized with a chemical functional group including, but not limited to, amino groups, carboxyl groups, oxo groups or thiol groups.
  • the probes may be attached using functional groups on the probes either directly or indirectly using a linker.
  • the probes may be attached to the solid support by either the 5' terminus, 3' terminus, or via an internal nucleotide.
  • the probe may also be attached to the solid support non-covalently.
  • biotinylated oligonucleotides can be made, which may bind to surfaces covalently coated with streptavidin, resulting in attachment.
  • probes may be synthesized on the surface using techniques such as photopolymerization and photolithography.
  • a method of diagnosis comprises detecting a differential expression level of colon cancer-associated nucleic acid in a biological sample.
  • the sample may be derived from a patient. Diagnosis of a disease state in a patient may allow for prognosis and selection of therapeutic and follow-up strategy. Furthermore, the developmental stage of cells may be classified by determining temporally expressed colon cancer-associated nucleic acids.
  • In situ hybridization of labeled probes to tissue sections may be performed.
  • the skilled artisan can make a diagnosis, a prognosis, or a prediction based on the findings. It is further understood that the nucleic acids which indicate the diagnosis may differ from those which indicate the prognosis and molecular profiling of the condition of the cells may lead to distinctions between responsive or refractory conditions or may be predictive of outcomes.
  • Biomarkers are also provided.
  • One type of cancer screening test involves the detection of a biomarker, such as a tumor marker, in a fluid or tissue obtained from a patient.
  • Tumor markers are substances produced by cancer cells that are not typically produced by normal cells. These substances generally can be detected in the body fluids or tissues of patients with cancer.
  • Another important use for tumor markers is for monitoring patients being treated for advanced cancer. Measuring tumor markers for this purpose can be less invasive, less time-consuming, as well as less expensive, than other complicated tests, to determine if a therapy is reducing the cancer.
  • a further important use for tumor markers is for determining a prognosis of survival of a cancer patient.
  • Such prognostic methods can be used to identify surgically treated patients likely to experience cancer recurrence so that they can be offered additional therapeutic options.
  • kits may comprise a nucleic acid described herein together with any or all of the following: assay reagents, buffers, probes and/or primers, and sterile saline or another pharmaceutically acceptable emulsion and suspension base.
  • the kits may include instructional materials containing directions (e.g., protocols) for the practice of the methods described herein.
  • the kit may be a kit for the amplification, detection, identification or quantification of a target nucleic acid sequence.
  • the kit may comprise a poly(T) primer, a forward primer, a reverse primer, and a probe.
  • Colon adenocarcinoma tumor specimens (formalin-fixed, paraffin-embedded, FFPE) obtained from colon cancer patients stages I-III, without adjuvant therapy and with at least 3 years of follow-up, were used for this research.
  • Total RNA enriched in microRNA was isolated from the FFPE tumor specimens, and all RNAs extracted were hybridized onto microarrays according to the RNA extraction and array platform protocols described below.
  • Custom microarrays were produced by printing DNA oligonucleotide probes representing 688 human microRNAs. Each probe, printed in triplicate, carried a linker of up to 22 nt at the 3' end of the microRNA's complement sequence, in addition to an amine group used to couple the probes to coated glass slides. Each probe (20 µM) was dissolved in 2X SSC + 0.0035% SDS and spotted in triplicate on Schott Nexterion®
  • RNAs were spiked to the RNA before labeling to verify labeling efficiency; and (ii) probes for abundant small RNA (e.g., small nuclear RNAs (U43, U49, U24, Z30, U6, U48, U44), 5.8s and 5s ribosomal RNA) were spotted on the array to verify RNA quality.
  • the slides were blocked in a solution containing 50 mM ethanolamine, 1 M Tris (pH 9.0) and 0.1% SDS for 20 min at 50 °C, then thoroughly rinsed with water and spun dry.
  • RNA-linker p-rCrU-Cy/dye (Dharmacon)
  • Dharmacon an RNA-linker
  • the labeling reaction contained total RNA, spikes (0.1-20 fmoles), 300 ng RNA-linker-dye, 15% DMSO, 1x ligase buffer and 20 units of T4 RNA ligase (NEB) and proceeded at 4 °C for 1 h, followed by 1 h at 37 °C.
  • hsa-miR-214 (SEQ ID NO: 67), hsa-miR-185 (SEQ ID NO: 68), hsa-miR-141 (SEQ ID NO: 69) and hsa-miR-221 (SEQ ID NO: 70) served as miR normalizers for the PCR validation.
  • the cycle threshold (C_T, the PCR cycle at which probe signal reaches the threshold) was determined for each microRNA. To allow comparison with results from the microarray, each value received was subtracted from 50. This 50-C_T expression for each microRNA for each patient was compared with the signal obtained by the microarray method. Table 2: Sequences used for RT-PCR validation
  • Measurements of the expression of miRs on the microarray were log-transformed before all further analysis. Normalization of samples was performed by calculating a median reference vector. For each sample, the best fit to this reference vector was calculated using a 2nd-degree polynomial.
  • microRNAs are able to predict the prognosis of colon cancer patients
  • the median expression values of hsa-miR-23a (SEQ ID NO: 5), hsa-miR-24 (SEQ ID NO: 6), MID-00291 (SEQ ID NO: 7), MID-00466 (SEQ ID NO: 8), MID-00595 (SEQ ID NO: 9) and hsa-miR-21 (SEQ ID NO: 10) were found to be above the median expression values of the patients with good prognosis, whereas the median expression values of hsa-miR-196b (SEQ ID NO: 3) and hsa-miR-125b (SEQ ID NO: 4) were found to be below the median expression of the patients with good prognosis.
  • Figure 3 demonstrates differential expression of miRs in samples obtained from stage II colon cancer patients with good prognosis and from stage II colon cancer patients with bad prognosis.
  • the median expression value of hsa-miR-29a (SEQ ID NO: 1) was found to be below the median expression of the patients with good prognosis. Accordingly, relatively high expression values of hsa-miR-29a (SEQ ID NO: 1) were demonstrated to be indicative of good prognosis.
  • Example 3 miR expression patterns in patients with stage I colon cancer correlate with prognosis
  • DFS survival and disease-free survival
  • Table 3 Up- and down-regulation of miRs in patients defined as having colon cancer with bad prognosis
  • FIGS 4A-4D show Kaplan-Meier plots of survival time and disease-free survival curves plotted for each of the three expression-level groups.
  • hsa-miR-361-3p SEQ ID NO: 11
  • hsa-miR-29c* SEQ ID NO: 12
  • hsa-miR-196b SEQ ID NO: 3
  • hsa-miR-194 (SEQ ID NO: 13) was up-regulated in samples obtained from stage I patients with bad prognosis. Accordingly, relatively high expression values of this miR were demonstrated to be indicative of bad prognosis.
  • DFS survival and disease-free survival
  • FIGS 5A-5D show Kaplan-Meier plots of disease-free survival curves plotted for each of the three expression-level groups.
  • hsa-miR-29a SEQ ID NO: 1
  • hsa-miR-155 SEQ ID NO: 14
  • hsa-miR-498 SEQ ID NO: 15
  • hsa-miR-527 SEQ ID NO: 16
  • hsa-miR-103 SEQ ID NO: 17
  • miR-27a SEQ ID NO: 19
  • hsa-miR-196b (SEQ ID NO: 3) was down-regulated in samples obtained from stage I patients with bad prognosis. Accordingly, expression thresholds for these miRs were demonstrated to be indicative of good prognosis.
  • miR expression performance to predict a subgroup of good prognosis patients with stage II colon cancer and a subgroup of bad prognosis patients with stage II colon cancer miR performance is defined by the existence of an expression cutoff that classifies the good and bad prognosis groups such that 80% of one group had only good prognosis patients and this group contains at least 30% of the good prognosis patients (i.e. ppv >0.8, sensitivity >0.3).
  • hsa-miR-29a SEQ ID NO: 1 was down-regulated in samples obtained from stage II patients with bad prognosis. Accordingly, relatively high expression values of this miR are indicative of good prognosis.
  • hsa-miR-29a could also be used to identify a subgroup of bad prognosis patients with stage II colon cancer (npv >0.6, specificity >0.3).
  • Example 7 Using miR expression performance to predict a subgroup of good prognosis patients with stage I and II colon cancer and a subgroup of bad prognosis patients with stage I and II colon cancer miR performance is defined by the existence of an expression cutoff that classifies the good and bad prognosis groups such that 90% of one group had only good prognosis patients and this group contains at least 30% of the good prognosis patients (i.e. ppv >0.9, sensitivity >0.3).
  • hsa-miR-27a SEQ ID NO: 19
  • hsa-miR-451 SEQ ID NO: 45
  • hsa-miR-768-3p SEQ ID NO: 46
  • hsa-miR-199a-5p SEQ ID NO: 47
  • hsa-miR-10a SEQ ID NO: 51
  • hsa-miR-423-3p SEQ ID NO: 18
  • hsa-miR-378 SEQ ID NO: 48
  • hsa-miR-612 SEQ ID NO: 49
  • hsa-miR-200b SEQ ID NO: 20
  • hsa-miR-429 SEQ ID NO: 50
  • MID-00689 SEQ ID NO: 52
  • hsa-miR-30d SEQ ID NO: 53
  • hsa-miR-29a (SEQ ID NO: 1) could be used to identify a subgroup of bad prognosis patients, with npv >0.8 and specificity >0.2. Low expression of this miR is indicative of bad prognosis in the combined stage I and stage II patients.
  • the miR expression patterns of the statistically significant miRs in patients with stage I colon cancer were measured by microarray and subsequently validated by PCR.
  • the distribution of expression, as measured by microarray, in tumor samples obtained from stage I colon cancer patients with good prognosis (no recurrence within 3 years following surgery) or bad prognosis (recurrence within 3 years following surgery) is shown in Figure 12.
  • Logrank p-values (< 0.05) and fold-change values for the microarray results are presented in Table 4.
  • Figure 13 presents the distribution of expression, as measured by PCR, in the same patient group.
  • Logrank p-values (< 0.05) and fold-change values for the PCR results are presented in Table 5.
  • Table 4 Distribution of miR expression in stage I colon cancer patients as tested by microarray
  • Table 5 Distribution of miR expression in stage I colon cancer patients as tested by PCR
  • A Kaplan-Meier plot for persistence (survival) of disease-free status of colon cancer patients (stage I) grouped by expression levels of hsa-miR-196b, as measured by microarray, is shown in Figure 14A.
  • Figure 14B presents the Kaplan-Meier plot for the same group of colon cancer patients (stage I), as measured by PCR.
  • Table 6 Distribution of miR expression in stage II colon cancer patients as tested by microarray
  • Table 7 Distribution of miR expression in stage II colon cancer patients as tested by PCR
  • A Kaplan-Meier plot for persistence (survival) of disease-free status of colon cancer patients (stage II) grouped by expression levels of hsa-miR-29a, as measured by microarray, is shown in Figure 18A.
  • Figure 18B presents the Kaplan-Meier plot for the same group of colon cancer patients (stage II), as measured by PCR.
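The microarray normalization summarized in the list above (log transformation, a median reference vector, and a second-degree polynomial fit per sample) can be sketched as follows. This is only an illustrative reading of that description: the function names are hypothetical, the use of log base 2 follows the log2(expression) units in the figure descriptions, and mapping each sample through its fitted polynomial onto the reference scale is an assumption about how the fit was applied.

```python
import math
from statistics import median

def fit_quadratic(x, y):
    """Least-squares coefficients (a, b, c) of y ~ a + b*x + c*x**2."""
    # Build the 3x3 normal equations from power sums of x.
    s = [sum(v ** k for v in x) for k in range(5)]
    t = [sum((xv ** k) * yv for xv, yv in zip(x, y)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))

def normalize(samples):
    """samples: dict of sample name -> raw signals, same probe order in each."""
    logged = {k: [math.log2(v) for v in vals] for k, vals in samples.items()}
    n_probes = len(next(iter(logged.values())))
    # Median reference vector: per-probe median across all samples.
    reference = [median(vals[i] for vals in logged.values())
                 for i in range(n_probes)]
    normalized = {}
    for name, vals in logged.items():
        a, b, c = fit_quadratic(vals, reference)
        # Assumption: each sample is mapped onto the reference scale
        # through its own fitted polynomial.
        normalized[name] = [a + b * v + c * v * v for v in vals]
    return normalized, reference
```

With at least three distinct signal values per sample the fit is well defined; a real array with hundreds of probes would normally use a numerical library rather than hand-written elimination.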

Abstract

Described herein are compositions and methods for prognosis of colon cancer. The compositions are microRNA molecules associated with the prognosis of colon cancer, as well as various nucleic acid molecules relating thereto or derived thereof.

Description

COMPOSITIONS AND METHODS FOR THE PROGNOSIS OF COLON CANCER
CROSS REFERENCE TO RELATED APPLICATIONS
The present application claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Application No. 61/116,301, filed November 20, 2008, which is herein incorporated by reference in its entirety.
FIELD OF THE INVENTION
The present invention relates to compositions and methods for determining the prognosis of colon cancer. Specifically, the invention relates to microRNA molecules associated with the prognosis of colon cancer, as well as various nucleic acid molecules relating thereto or derived thereof.
BACKGROUND OF THE INVENTION
In recent years, microRNAs (miRs, miRNAs) have emerged as an important novel class of regulatory RNAs that has a profound impact on a wide array of biological processes. These small (typically 18-24 nucleotides long) non-coding RNA molecules can modulate protein expression patterns by promoting RNA degradation, inhibiting mRNA translation, and also affecting gene transcription. miRs play pivotal roles in diverse processes such as development and differentiation, control of cell proliferation, stress response and metabolism. The expression of many miRs was found to be altered in numerous types of human cancer, and in some cases strong evidence has been put forward in support of the conjecture that such alterations may play a causative role in tumor progression. There are currently about 900 known human miRs.
Colon cancer is the second most frequently diagnosed malignancy in the United States, as well as the second most common cause of cancer death. Approximately 150,000 and 1,000,000 new cases of colon cancer are diagnosed each year in the USA and globally, respectively. The five-year survival rate for patients with colorectal cancer detected at an early localized stage is 92%; unfortunately, only 37% of colorectal cancer is diagnosed at this stage. The survival rate drops to 64% if the cancer is allowed to spread to adjacent organs or lymph nodes, and to 7% in patients with distant metastases. The prognosis of colon cancer is directly related to the degree of penetration of the tumor through the bowel wall and the presence or absence of nodal involvement. However, none of the biological factors that are recognized today, including the extent of the locoregional spread of the tumor, can accurately predict a patient's outcome. For example, even without adjuvant treatment, 50% of patients with advanced stage III disease will not experience recurrence of disease, while 10% of those diagnosed with apparently early stage I disease will experience such an event. Even the attempt to combine several clinicopathological factors with known prognostic impact has not significantly improved our current ability to predict the prognosis of an individual patient. Improved prediction of the individual risk of recurrence and death has very important practical implications in that it may help to avoid or direct post-operative adjuvant treatment.
Thus, there exists a need for identification of biomarkers that can be used as prognostic indicators for colon cancer.
SUMMARY OF THE INVENTION
According to some embodiments of the present invention, the expression profile of nucleic acid sequences and variants thereof (SEQ ID NOS: 1-66 and 71) in biological samples obtained from colon cancer patients is indicative of cancer prognosis: the life expectancy of the patient, the risk of recurrence, the expected disease-free survival and response to treatment.
According to one aspect of the invention, a method for determining a prognosis for colon cancer in a subject is provided, the method comprising:
(a) obtaining a biological sample from the subject;
(b) determining an expression profile in said sample of nucleic acid sequences selected from the group consisting of SEQ ID NOS: 1-66 and 71 or a sequence having at least about 80% identity thereto; and
(c) comparing said expression profile to a reference value, whereby altered expression levels of the nucleic acid sequences are indicative of the prognosis of said subject.
According to some embodiments, said altered expression level is a change in a score based on a combination of expression levels of said nucleic acid sequences. According to one embodiment, said nucleic acid sequence is selected from the group consisting of SEQ ID NOS: 2, 5-10, 13, 15-17, 19, 22, 26-32, 35, 36, 38-41, 43, 45-47, 51, 55-57, 61, 66 and 71, and sequences at least about 80% identical thereto, and said expression levels above said reference value are indicative of poor prognosis in said subject.
According to another embodiment, said nucleic acid sequence is selected from the group consisting of SEQ ID NOS: 1, 3, 4, 11, 12, 14, 18, 20, 21, 23-25, 33, 34, 37, 42, 44, 48-50, 52-54, 58-60 and 62-65, and sequences at least about 80% identical thereto, and said expression levels below said reference value are indicative of poor prognosis in said subject.
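The two embodiments above assign each sequence a direction: for one group of SEQ ID NOS expression above the reference value indicates poor prognosis, and for the other group expression below it does. A minimal sketch of that comparison is given here; the function names, the dictionary layout and the example values in the usage note are hypothetical, and the patent does not specify how a combined score would be computed.

```python
def expand(spec):
    """Expand a string such as '2, 5-10, 13' into a set of integers."""
    ids = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            lo, hi = part.split("-")
            ids.update(range(int(lo), int(hi) + 1))
        else:
            ids.add(int(part))
    return ids

# SEQ ID NOS for which expression ABOVE the reference indicates poor prognosis.
HIGH_IS_POOR = expand(
    "2, 5-10, 13, 15-17, 19, 22, 26-32, 35, 36, 38-41, 43, "
    "45-47, 51, 55-57, 61, 66, 71"
)
# SEQ ID NOS for which expression BELOW the reference indicates poor prognosis.
LOW_IS_POOR = expand(
    "1, 3, 4, 11, 12, 14, 18, 20, 21, 23-25, 33, 34, 37, "
    "42, 44, 48-50, 52-54, 58-60, 62-65"
)

def poor_prognosis_flags(expression, reference):
    """Map SEQ ID NO -> True when the level deviates in the 'poor' direction.

    expression, reference: dicts keyed by SEQ ID NO (hypothetical layout).
    """
    flags = {}
    for seq_id, level in expression.items():
        ref = reference[seq_id]
        if seq_id in HIGH_IS_POOR:
            flags[seq_id] = level > ref
        elif seq_id in LOW_IS_POOR:
            flags[seq_id] = level < ref
    return flags
```

For example, `poor_prognosis_flags({1: 13.0, 10: 14.0}, {1: 13.5, 10: 13.0})` flags both sequences, since SEQ ID NO: 1 is below its reference and SEQ ID NO: 10 is above it.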
In certain embodiments, the subject is a human.
In certain embodiments, the colon cancer is colon adenocarcinoma.
In certain embodiments, said prognosis is prediction of colon cancer risk of recurrence. In certain embodiments, the biological sample obtained from the subject is selected from the group consisting of bodily fluid, a cell line, a tissue sample, a biopsy sample, a needle biopsy sample, a fine needle biopsy (FNA) sample, a surgically removed sample, and a sample obtained by tissue-sampling procedures such as colonoscopy or laparoscopic methods. In certain embodiments the tissue is a fresh, frozen, fixed, wax-embedded or formalin-fixed paraffin-embedded (FFPE) tissue.
In certain embodiments said tissue is a colon tissue. In certain embodiments said colon tissue is colon adenocarcinoma.
According to some embodiments, the expression levels are determined by a method selected from the group consisting of nucleic acid hybridization, nucleic acid amplification, and a combination thereof. According to some embodiments, the nucleic acid hybridization is performed using a solid-phase nucleic acid biochip array or in situ hybridization.
According to other embodiments, the nucleic acid amplification method is quantitative real-time PCR. According to some embodiments, the PCR method comprises forward and reverse primers. According to some embodiments, said forward primer comprises a sequence selected from the group consisting of SEQ ID NOS: 88-99 and sequences at least about 80% identical thereto. According to some embodiments, said reverse primer comprises SEQ ID NO: 100, a fragment thereof and a sequence at least about 80% identical thereto. According to some embodiments, the real-time PCR method further comprises a probe.
According to some embodiments, the probe comprises a nucleic acid sequence that is complementary to a sequence selected from SEQ ID NO: 1-66 and 71; to a fragment thereof or to a sequence at least about 80% identical thereto. According to some embodiments, the probe comprises a sequence selected from the group consisting of SEQ ID NOS: 76-87, a fragment thereof and a sequence at least about 80% identical thereto.
A kit for determining the prognosis of a subject with colon cancer is also provided. The kit may comprise a probe comprising a nucleic acid sequence that is complementary to a sequence selected from SEQ ID NO: 1-66 and 71; to a fragment thereof or to a sequence at least about 80% identical thereto. According to some embodiments, the probe comprises a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 76-87, a fragment thereof and a sequence at least about 80% identical thereto. According to some embodiments, the kit further comprises forward and reverse primers. According to some embodiments, said forward primer comprises a sequence selected from the group consisting of SEQ ID NOS: 88-99 and sequences at least about 80% identical thereto. According to some embodiments, said reverse primer comprises SEQ ID NO: 100, a fragment thereof and a sequence at least about 80% identical thereto. According to other embodiments, the kit comprises reagents for performing in situ hybridization analysis.
In some embodiments, prognostic for colon cancer comprises providing the forecast or prediction of (prognostic for) any one or more of the following: duration of survival of a patient susceptible to or diagnosed with colon cancer, duration of recurrence-free survival, duration of progression-free survival of a patient susceptible to or diagnosed with colon cancer, response rate in a group of patients susceptible to or diagnosed with colon cancer, and duration of response in a patient or a group of patients susceptible to or diagnosed with colon cancer. In some embodiments, duration of survival is forecast or predicted to be increased. In some embodiments, duration of survival is forecast or predicted to be decreased. In some embodiments, duration of recurrence-free survival is forecast or predicted to be increased. In some embodiments, duration of recurrence-free survival is forecast or predicted to be decreased. In some embodiments, response rate is forecast or predicted to be increased. In some embodiments, response rate is forecast or predicted to be decreased. In some embodiments, duration of response is predicted or forecast to be increased. In some embodiments, duration of response is predicted or forecast to be decreased.
These and other embodiments of the present invention will become apparent in conjunction with the figures, description and claims that follow.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1A is a scatter plot showing differential expression of miRs (in normalized fluorescence units, as measured by a microarray) in samples obtained from colon cancer patients with good prognosis (no recurrence within 3 years following surgery) and from colon cancer patients with bad prognosis (recurrence within 3 years following surgery), comparing the median values of each miR in all patients in one group with the corresponding median for members of the other group. The y-axis represents patients with bad prognosis (24 patients), and the x-axis represents patients with good prognosis (92 patients). The parallel lines describe a fold change between groups of 1.5 in either direction. Statistically significant (p-value<0.05) miRs, hsa-miR-126 (SEQ ID NO: 2) and hsa-miR-29a (SEQ ID NO: 1), are marked with circles. P-values are calculated by Student's t-test. Gray crosses represent untested control probes or median signal <300 in both groups.
Figure 1B is a boxplot representation comparing distribution of the expression of the statistically significant (p-value=0.018031; fold change 1.2) miR, hsa-miR-29a (SEQ ID NO: 1), in tumor samples obtained from colon cancer patients with bad or good prognosis (as defined in Figure 1A). Two boxes are shown. The left box represents the group of patients with bad prognosis, while the right box represents the group of patients with good prognosis. The line in the box indicates the median value. The box contains 50% of the data and the horizontal lines and crosses (outliers) show the full range of signals in this group.
Figure 2A is a scatter plot showing differential expression of miRs (in normalized fluorescence units, as measured by a microarray) in samples obtained from stage I colon cancer patients with good prognosis (no recurrence within 3 years following surgery) and from stage I colon cancer patients with bad prognosis (recurrence within 3 years following surgery), comparing the median values of each miR in all patients in one group with the corresponding median for members of the other group. The y-axis represents patients with bad prognosis (8 patients), and the x-axis represents patients with good prognosis (46 patients). The parallel lines describe a fold change between groups of 1.5 in either direction. Statistically significant (p-value<0.05) miRs, hsa-miR-196b (SEQ ID NO: 3), hsa-miR-125b (SEQ ID NO: 4), hsa-miR-23a (SEQ ID NO: 5), hsa-miR-24 (SEQ ID NO: 6), MID-00291 (SEQ ID NO: 7), MID-00466 (SEQ ID NO: 8), MID- 00595 (SEQ ID NO: 9) and hsa-miR-21 (SEQ ID NO: 10), are marked with circles. Gray crosses represent untested control probes or median signal <300 in both groups.
Figure 2B is a boxplot representation comparing distribution of the expression of the statistically significant (p-value=0.007924; fold change 6.8) miR, hsa-miR-196b (SEQ ID NO: 3), in tumor samples obtained from colon cancer patients with bad or good prognosis (as defined in Figure 2A). Two boxes are shown. The left box represents the group of patients with bad prognosis, while the right box represents the group of patients with good prognosis. The line in the box indicates the median value. The box contains 50% of the data and the horizontal lines and crosses (outliers) show the full range of signals in this group.
Figure 3A is a scatter plot showing differential expression of miRs (in normalized fluorescence units, as measured by a microarray) in samples obtained from stage II colon cancer patients with good prognosis (no recurrence within 3 years following surgery) and from stage II colon cancer patients with bad prognosis (recurrence within 3 years following surgery), comparing the median values of each miR in all patients in one group with the corresponding median for members of the other group. The y-axis represents patients with bad prognosis (16 patients), and the x-axis represents patients with good prognosis (46 patients). The parallel lines describe a fold change between groups of 1.5 in either direction. The statistically significant miR (p-value<0.05 and adjusting for FDR) is marked with a circle: hsa-miR-29a (SEQ ID NO: 1). Gray crosses represent untested control probes or median signal <300 in both groups.
Figure 3B is a boxplot presentation comparing distributions of the expression of the statistically significant miR: hsa-miR-29a (SEQ ID NO: 1) (p-value=0.000225; fold change 1.4), in tumor samples obtained from stage II colon cancer patients with bad or good prognosis (as defined in Figure 3A). For each miR two boxes are shown. The left box represents the group of patients with bad prognosis, while the right box represents the group of patients with good prognosis. The line in the box indicates the median value. The box contains 50% of the data and the horizontal lines and crosses (outliers) show the full range of signals in this group.
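The scatter plots in Figures 1A, 2A and 3A compare per-group medians of each miR and mark a 1.5-fold band with parallel lines. The comparison can be sketched roughly as below, assuming linear (not log-transformed) normalized fluorescence units; the function names and the sample values in the usage note are hypothetical, and the significance testing (Student's t-test) is not reproduced here.

```python
from statistics import median

def median_fold_change(good, bad):
    """Fold change between group medians and its direction in the bad group.

    good, bad: signals of one miR in the good- and bad-prognosis groups.
    """
    m_good, m_bad = median(good), median(bad)
    fold = max(m_good, m_bad) / min(m_good, m_bad)
    if m_bad > m_good:
        direction = "up in bad"
    elif m_bad < m_good:
        direction = "down in bad"
    else:
        direction = "unchanged"
    return fold, direction

def outside_band(fold, band=1.5):
    """True when the fold change lies on or beyond the parallel lines."""
    return fold >= band
```

For instance, medians of 1200 (good) and 1000 (bad) give a fold change of 1.2, down in the bad-prognosis group, which falls inside the 1.5-fold band; a fold change of 6.8, as reported for hsa-miR-196b in Figure 2B, falls well outside it.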
Figure 4A is a Kaplan-Meier plot for persistence of disease-free status (survival) of colon cancer patients (stage I) grouped by expression levels of hsa-miR-361-3p (SEQ ID NO: 11) (logrank p-value=0.0095). The y-axis depicts fraction of survival and the x-axis depicts months of survival, with the dashed dotted line representing the patients with the highest expression for this miR (n=48, log2(expression) >8.47), the dashed line depicting the intermediate scoring (n=18) and the solid line depicting the lowest scoring (n=18, log2(expression) <7.14).
Figure 4B is a Kaplan-Meier plot for persistence of disease-free status (survival) of colon cancer patients (stage I) grouped by expression levels of hsa-miR-29c* (SEQ ID NO: 12) (logrank p-value=0.048). The y-axis depicts fraction of survival and the x-axis depicts months of survival, with the dashed dotted line representing patients with the highest expression for this miR (n=18, log2(expression) >7.80), the dashed line depicting the intermediate scoring (n=18) and the solid line depicting the lowest scoring (n=18, log2(expression) <5.64).
Figure 4C is a Kaplan-Meier plot for persistence of disease-free status (survival) of colon cancer patients (stage I) grouped by expression levels of hsa-miR-194 (SEQ ID NO: 13) (logrank p-value=0.049). The y-axis depicts fraction of survival and the x-axis depicts months of survival, with the dashed dotted line representing patients with the highest expression for this miR (n=18, log2(expression) >13.25), the dashed line depicting the intermediate scoring (n=18) and the solid line depicting the lowest scoring (n=18, log2(expression) <12.79).
Figure 4D is a Kaplan-Meier plot for persistence of disease-free status (survival) of colon cancer patients (stage I) grouped by expression levels of hsa-miR-196b (SEQ ID NO: 3) (logrank p-value=0.10). The y-axis depicts fraction of survival and the x-axis depicts months of survival, with the dashed dotted line representing patients with the highest expression for this miR (n=18, log2(expression) >10.47), the dashed line depicting the intermediate scoring (n=18) and the solid line depicting the lowest scoring (n=18, log2(expression) <8.30).
Figure 5A is a Kaplan-Meier plot for persistence of disease-free status (survival) of colon cancer patients (stage II) grouped by expression levels of hsa-miR-29a (SEQ ID NO: 1) (logrank p-value=0.011). The y-axis depicts fraction of survival and the x-axis depicts months of survival, with the dashed dotted line representing patients with the highest expression for this miR (n=21, log2(expression) >13.72), the dashed line depicting the intermediate scoring (n=20) and the solid line depicting the lowest scoring (n=21, log2(expression) <13.38).
Figure 5B is a Kaplan-Meier plot for persistence of disease-free status (survival) of colon cancer patients (stage II) grouped by expression levels of hsa-miR-155 (SEQ ID NO: 14) (logrank p-value=0.024). The y-axis depicts fraction of survival and the x-axis depicts months of survival, with the dashed dotted line representing patients with the highest expression for this miR (n=21, log2(expression) >9.67), the dashed line depicting the intermediate scoring (n=20) and the solid line depicting the lowest scoring (n=21, log2(expression) <8.68).
Figure 5C is a Kaplan-Meier plot for persistence of disease-free status (survival) of colon cancer patients (stage II) grouped by expression levels of hsa-miR-498 (SEQ ID NO: 15) (logrank p-value=0.035). The y-axis depicts fraction of survival and the x-axis depicts months of survival, with the dashed dotted line representing patients with the highest expression for this miR (n=21, log2(expression) >7.95), the dashed line depicting the intermediate scoring (n=20) and the solid line depicting the lowest scoring (n=21, log2(expression) <6.63).
Figure 5D is a Kaplan-Meier plot for persistence of disease-free status (survival) of colon cancer patients (stage II) grouped by expression levels of hsa-miR-527 (SEQ ID NO: 16) (logrank p-value=0.043). The y-axis depicts fraction of survival and the x-axis depicts months of survival, with the dashed dotted line representing patients with the highest expression for this miR (n=21, log2(expression) >8.30), the dashed line depicting the intermediate scoring (n=20) and the solid line depicting the lowest scoring (n=21, log2(expression) <7.00).
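The Kaplan-Meier plots of Figures 4 and 5 split patients into three expression-level groups and follow the fraction remaining disease-free over time. A minimal sketch of both steps follows, using the standard product-limit estimator; the function names and input layout are hypothetical, patients without recurrence are treated as censored at their last follow-up (an assumption), and the logrank test itself is not shown.

```python
def expression_group(log2_expr, low_cut, high_cut):
    """Assign 'low', 'intermediate' or 'high' using two log2 thresholds."""
    if log2_expr > high_cut:
        return "high"
    if log2_expr < low_cut:
        return "low"
    return "intermediate"

def kaplan_meier(times, events):
    """Product-limit estimate of the disease-free fraction.

    times:  follow-up in months for each patient.
    events: True if recurrence was observed at that time, False if censored.
    Returns a list of (time, surviving fraction) at each recurrence time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surviving = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_at_t = sum(1 for tt, _ in data if tt == t)
        recurrences = sum(1 for tt, e in data if tt == t and e)
        if recurrences:
            surviving *= 1 - recurrences / at_risk
            curve.append((t, surviving))
        # Everyone observed at time t (event or censored) leaves the risk set.
        at_risk -= n_at_t
        i += n_at_t
    return curve
```

With the thresholds given for Figure 5A, `expression_group(13.9, 13.38, 13.72)` returns `"high"`; one curve per group is then obtained by calling `kaplan_meier` on that group's follow-up times and recurrence flags.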
Figure 6A is a Kaplan-Meier plot showing the disease-free status (survival) of colon cancer patients (stage I). Samples were divided into two groups by the expression of hsa-miR-103 (SEQ ID NO: 17) (logrank p-value=0.049). The y-axis depicts fraction of survival and the x-axis depicts months of survival, with the dashed line (labeled as bad prognosis) representing the highest expression for this miR (n=32, Iog2(expression) >12.15) and the solid line (labeled good prognosis) depicting the lowest scoring (n=22, Iog2(expression) <12.15). Figure 6B is a Kaplan-Meier plot showing the disease-free status (survival) of colon cancer patients (stage I). Samples were divided into two groups by the expression of hsa-miR-423-3p (SEQ ID NO: 18) (logrank p-value=0.093). The y-axis depicts fraction of survival and the x-axis depicts months of survival, with the dashed line (labeled good prognosis) representing the highest expression for this miR (n=20, Iog2(expression) >10.03) and the solid line (labeled bad prognosis) depicting the lowest scoring (n=34, Iog2(expression) <10.03).
Figure 6C is a Kaplan-Meier plot showing the disease-free status (survival) of colon cancer patients (stage I). Samples were divided into two groups by the expression of hsa-miR-27a (SEQ ID NO: 19) (logrank p-value=0.086). The y-axis depicts fraction of survival and the x-axis depicts months of survival, with the dashed line (labeled bad prognosis) representing the highest expression for this miR (n=34, Iog2(expression) >12.52) and the solid line (labeled good prognosis) depicting the lowest scoring (n=20, Iog2(expression) <12.52). Figure 6D is a Kaplan-Meier plot showing the disease- free status (survival) of colon cancer patients (stage I). Samples were divided into two groups by the expression of hsa-miR-125b (SEQ ID NO: 4) (logrank p-value=0.0013). The y-axis depicts fraction of survival and the x-axis depicts months of survival, with the dashed line (labeled good prognosis) representing the highest expression for this miR (n=46, Iog2(expression) >12.75) and the solid line (labeled bad prognosis) depicting the lowest scoring (n=8, Iog2(expression) <12.75).
Figure 7 is a Kaplan-Meier plot showing the disease-free status (survival) of colon cancer patients (stage I). Samples were divided into two groups by the expression of hsa-miR-196b (SEQ ID NO: 3) (logrank p-value=0.24). The y-axis depicts fraction of survival and the x-axis depicts months of survival, with the dashed line (labeled good prognosis) representing the highest expression for this miR (n=16, log2(expression) >10.53) and the solid line (labeled bad prognosis) depicting the lowest scoring (n=38, log2(expression) <10.53).
Figure 8 is a Kaplan-Meier plot showing the disease-free status (survival) of colon cancer patients (stage II). Samples were divided into three groups by the expression of hsa-miR-29a (SEQ ID NO: 1) (logrank p-value=5.09e-005). The y-axis depicts fraction of survival and the x-axis depicts months of survival, with the dashed dotted line (labeled good prognosis) representing the highest expression for this miR (n=36, log2(expression) >13.45), the dashed line (labeled intermediate prognosis) representing intermediate expression (n=18) and the solid line (labeled bad prognosis) depicting the lowest scoring (n=8, log2(expression) <12.98).
Figure 9A is a Kaplan-Meier plot showing the disease-free status (survival) of colon cancer patients (stage I and stage II). Samples were divided into two groups by the expression of hsa-miR-27a (SEQ ID NO: 19) (logrank p-value=0.25). The y-axis depicts fraction of survival and the x-axis depicts months of survival, with the dashed line (labeled bad prognosis) representing the highest expression for this miR (n=85, log2(expression) >12.36) and the solid line (labeled good prognosis) depicting the lowest scoring (n=31, log2(expression) <12.36).
Figure 9B is a Kaplan-Meier plot showing the disease-free status (survival) of colon cancer patients (stage I and stage II). Samples were divided into two groups by the expression of hsa-miR-423-3p (SEQ ID NO: 18) (logrank p-value=0.26). The y-axis depicts fraction of survival and the x-axis depicts months of survival, with the dashed line (labeled good prognosis) representing the highest expression of this miR (n=40, log2(expression) >10.03) and the solid line (labeled bad prognosis) depicting the lowest scoring (n=76, log2(expression) <10.03).
Figure 9C is a Kaplan-Meier plot showing the disease-free status (survival) of colon cancer patients (stage I and stage II). Samples were divided into two groups by the expression of hsa-miR-200b (SEQ ID NO: 20) (logrank p-value=0.0055). The y-axis depicts fraction of survival and the x-axis depicts months of survival, with the dashed line (labeled good prognosis) representing the highest expression of this miR (n=64, log2(expression) >14.43) and the solid line (labeled bad prognosis) depicting the lowest scoring (n=52, log2(expression) <14.43).
Figure 9D is a Kaplan-Meier plot showing the disease-free status (survival) of colon cancer patients (stage I and stage II). Samples were divided into two groups by the expression of hsa-miR-29a (SEQ ID NO: 1) (logrank p-value=1.11e-005). The y-axis depicts fraction of survival and the x-axis depicts months of survival, with the dashed line (labeled good prognosis) representing the highest expression of this miR (n=110, log2(expression) >12.77) and the solid line (labeled bad prognosis) depicting the lowest scoring (n=6, log2(expression) <12.77).
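By way of illustration only, the two-group comparisons described for the Kaplan-Meier figures above may be sketched as follows. This is a minimal sketch, not the procedure actually used to generate the figures: the function names, the treatment of values equal to the threshold, and the use of a standard two-group logrank statistic with one degree of freedom are assumptions made for the example.

```python
import math

def split_by_threshold(log2_expr, threshold):
    """Return (high, low) index lists; values equal to the threshold
    are placed in the low group (an assumption of this sketch)."""
    high = [i for i, x in enumerate(log2_expr) if x > threshold]
    low = [i for i, x in enumerate(log2_expr) if x <= threshold]
    return high, low

def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Two-group logrank test.

    times_*  : follow-up time (e.g., months) for each patient
    events_* : 1 if recurrence was observed, 0 if censored
    Returns (chi-square statistic, p-value with 1 degree of freedom).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        at_risk_a = sum(1 for u, _, g in data if u >= t and g == 0)
        at_risk_b = sum(1 for u, _, g in data if u >= t and g == 1)
        n = at_risk_a + at_risk_b
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d = sum(1 for u, e, _ in data if u == t and e)
        observed_a += d_a
        expected_a += d * at_risk_a / n
        if n > 1:
            # Hypergeometric variance of the number of events in group A
            variance += d * (at_risk_a / n) * (at_risk_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    stat = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

In such a sketch, the patients would first be split by the log2(expression) threshold of the miR of interest, and the follow-up times and recurrence indicators of the two resulting groups would then be passed to the logrank function.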
Figure 10 is a scatter plot showing differential expression of miRs (in normalized fluorescence units, as measured by a microarray) in samples obtained from stage I colon cancer patients with good prognosis (no recurrence within 3 years following surgery) and from stage I colon cancer patients with bad prognosis (recurrence within 3 years following surgery), comparing the median values of each miR in all patients in one group with the corresponding median for members of the other group. The y-axis represents patients with bad prognosis (8 patients), and the x-axis represents patients with good prognosis (46 patients). The parallel lines describe a fold change between groups of 1.5 in either direction. Statistically significant miRs (p-value<0.05) are marked with circles: hsa-miR-196b (SEQ ID NO: 3), hsa-miR-125b (SEQ ID NO: 4), hsa-let-7b (SEQ ID NO: 66) and hsa-miR-21 (SEQ ID NO: 10). Gray crosses represent untested control probes or median signal <300 in both groups.
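By way of illustration only, the comparison of group medians underlying such a scatter plot may be sketched as follows. This is a minimal sketch under stated assumptions: the function name, the dictionary layout and the miR labels in the usage example are hypothetical, and only the 1.5-fold lines and the exclusion of probes whose median signal is below 300 in both groups are taken from the figure description.

```python
from statistics import median

def median_fold_changes(good, bad, min_signal=300.0, fold=1.5):
    """Compare per-miR medians between two prognosis groups.

    good, bad : dict mapping a miR name to a list of normalized signals
    Returns a dict mapping miR name -> (bad/good median ratio,
    True if the ratio lies outside the +/- `fold` lines).
    Probes whose median is below `min_signal` in both groups are skipped.
    """
    results = {}
    for name in good:
        m_good = median(good[name])
        m_bad = median(bad[name])
        if m_good < min_signal and m_bad < min_signal:
            continue  # low signal in both groups: not tested
        if m_good <= 0:
            continue  # ratio undefined; skipped in this sketch
        ratio = m_bad / m_good
        results[name] = (ratio, ratio >= fold or ratio <= 1.0 / fold)
    return results

# Hypothetical signals, for illustration only
good = {"miR-A": [1000, 1200, 800], "miR-B": [500, 500, 500], "miR-C": [100, 200, 150]}
bad = {"miR-A": [2000, 2400, 2200], "miR-B": [550, 600, 500], "miR-C": [120, 100, 250]}
changes = median_fold_changes(good, bad)
```

The statistical significance marked by circles in the figure is a separate test and is not reproduced in this sketch.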
Figure 11A is a scatter plot showing differential expression of miRs (in normalized fluorescence units, as measured by a microarray) in samples obtained from stage I colon cancer patients with good prognosis (no recurrence within 3 years following surgery) and from stage I colon cancer patients with bad prognosis (recurrence within 3 years following surgery). Median values of each miR were compared in all patients in one group with the corresponding median for members of the other group. The y-axis represents patients with bad prognosis (8 patients), and the x-axis represents patients with good prognosis (7 patients). The parallel lines describe a fold change between groups of 1.5 in either direction. The statistically significant miR, hsa-miR-196b (SEQ ID NO: 3) (p-value<0.05), is marked with a circle. Differentially expressed miRs are marked with squares: hsa-miR-126 (SEQ ID NO: 2), hsa-miR-125b (SEQ ID NO: 4), hsa-let-7b (SEQ ID NO: 66) and hsa-miR-21 (SEQ ID NO: 10). Gray crosses represent untested control probes or median signal <300 in both groups.
Figure 11B is a scatter plot showing differential expression of the same set of miRs as described in Figure 11A (in normalized fluorescence units, as measured by PCR) in samples obtained from stage I colon cancer patients with good prognosis (no recurrence within 3 years following surgery) versus samples obtained from stage I colon cancer patients with bad prognosis (recurrence within 3 years following surgery). Median values of each miR were compared in all patients in one group with the corresponding median for members of the other group. The y-axis represents stage I patients with bad prognosis (8 patients), and the x-axis represents stage I patients with good prognosis (7 patients). The parallel lines describe a fold change between groups of 1.5 in either direction. The statistically significant miR, hsa-miR-196b (SEQ ID NO: 3) (p-value<0.05), is marked with a circle. Differentially expressed miRs are marked with squares: hsa-miR-126 (SEQ ID NO: 2), hsa-miR-125b (SEQ ID NO: 4), hsa-let-7b (SEQ ID NO: 66) and hsa-miR-21 (SEQ ID NO: 10). Gray crosses represent untested control probes or median signal <300 in both groups.
Figures 12A-E show boxplot representations comparing distributions of the expression of the statistically significant miRs, as measured by microarray, in tumor samples obtained from stage I colon cancer patients with bad or good prognosis (as defined in Figure 3A). For each miR two boxes are shown. The left box represents the group of patients with good prognosis, while the right box represents the group of patients with bad prognosis. The line in the box indicates the median value. The box contains 50% of the data and the horizontal lines and crosses (outliers) show the full range of signals in this group. Figure 12A: hsa-miR-196b (SEQ ID NO: 3) (p-value=0.000035; fold change 17.2). Figure 12B: hsa-miR-125b (SEQ ID NO: 4) (p-value=0.33; fold change 1.4). Figure 12C: hsa-let-7b (SEQ ID NO: 66) (p-value=0.21; fold change 1.1). Figure 12D: hsa-miR-126 (SEQ ID NO: 2) (p-value=0.12; fold change 2.7). Figure 12E: hsa-miR-21 (SEQ ID NO: 10) (p-value=0.11; fold change 1.2).
Figures 13A-E show boxplot representations comparing distributions of the expression of the same statistically significant miRs as shown in Figures 12A-E, as measured by PCR, in tumor samples obtained from stage I colon cancer patients with bad or good prognosis (as defined in Figure 3A). For each miR two boxes are shown. The left box represents the group of patients with good prognosis, while the right box represents the group of patients with bad prognosis. The line in the box indicates the median value. The box contains 50% of the data and the horizontal lines and crosses (outliers) show the full range of signals in this group. Figure 13A: hsa-miR-196b (SEQ ID NO: 3) (p-value=0.00056; fold change 2.7). Figure 13B: hsa-miR-125b (SEQ ID NO: 4) (p-value=0.42; fold change 1.3). Figure 13C: hsa-let-7b (SEQ ID NO: 66) (p-value=0.65; fold change 1.3). Figure 13D: hsa-miR-126 (SEQ ID NO: 2) (p-value=0.23; fold change 1.3). Figure 13E: hsa-miR-21 (SEQ ID NO: 10) (p-value=0.38; fold change 1.1).
Figure 14A is a Kaplan-Meier plot for persistence of disease-free status (survival) of colon cancer patients (stage I) grouped by expression levels of hsa-miR-196b (SEQ ID NO: 3) (logrank p-value=0.0039), as measured by microarray. The y-axis depicts fraction of survival and the x-axis depicts months of survival, with the dashed dotted line representing patients with the highest expression for this miR (n=5, log2(expression) >11.03), the dashed line depicting the intermediate scoring (n=6) and the solid line depicting the lowest scoring (n=5, log2(expression) <7.32).
Figure 14B is a Kaplan-Meier plot for persistence of disease-free status (survival) of colon cancer patients (stage I) grouped by expression levels of hsa-miR-196b (SEQ ID NO: 3) (logrank p-value=0.005), as measured by PCR. The y-axis depicts fraction of survival and the x-axis depicts months of survival, with the dashed dotted line representing patients with the highest expression for this miR (n=5, log2(expression) >19.63), the dashed line depicting the intermediate scoring (n=6) and the solid line depicting the lowest scoring (n=5, log2(expression) <18.19).
Figure 15A is a scatter plot showing differential expression of miRs (in normalized fluorescence units, as measured by a microarray) in samples obtained from stage II colon cancer patients with good prognosis (no recurrence within 3 years following surgery) and from stage II colon cancer patients with bad prognosis (recurrence within 3 years following surgery). Median values of each miR were compared in all patients in one group with the corresponding median for members of the other group. The y-axis represents patients with bad prognosis (7 patients), and the x-axis represents patients with good prognosis (8 patients). The parallel lines describe a fold change between groups of 1.5 in either direction. The statistically significant miR, hsa-miR-29a (SEQ ID NO: 1) (p-value<0.05), is marked with a circle. Differentially expressed miRs are marked with squares: hsa-miR-155 (SEQ ID NO: 14), hsa-miR-29b (SEQ ID NO: 54), and hsa-let-7b (SEQ ID NO: 66). Gray crosses represent untested control probes or median signal <300 in both groups.
Figure 15B is a scatter plot showing differential expression of the same set of miRs as described in Figure 15A (in normalized fluorescence units, as measured by PCR) in samples obtained from stage II colon cancer patients with good prognosis (no recurrence within 3 years following surgery) versus samples obtained from stage II colon cancer patients with bad prognosis (recurrence within 3 years following surgery). Median values of each miR were compared in all patients in one group with the corresponding median for members of the other group. The y-axis represents stage II patients with bad prognosis (7 patients), and the x-axis represents stage II patients with good prognosis (8 patients). The parallel lines describe a fold change between groups of 1.5 in either direction. Statistically significant miRs, hsa-miR-29a (SEQ ID NO: 1) and hsa-let-7b (SEQ ID NO: 66) (p-value<0.05), are marked with circles. Differentially expressed miRs are marked with squares: hsa-miR-155 (SEQ ID NO: 14) and hsa-miR-29b (SEQ ID NO: 54). Gray crosses represent untested control probes or median signal <300 in both groups.
Figures 16A-D show boxplot representations comparing distributions of the expression of the statistically significant miRs, as measured by microarray, in tumor samples obtained from stage II colon cancer patients with bad or good prognosis (as defined in Figure 3A). For each miR two boxes are shown. The left box represents the group of patients with good prognosis, while the right box represents the group of patients with bad prognosis. The line in the box indicates the median value. The box contains 50% of the data and the horizontal lines and crosses (outliers) show the full range of signals in this group. Figure 16A: hsa-miR-29a (SEQ ID NO: 1) (p-value=0.000002; fold change 2.7). Figure 16B: hsa-miR-29b (SEQ ID NO: 54) (p-value=0.74; fold change 1.8). Figure 16C: hsa-miR-126 (SEQ ID NO: 2) (p-value=0.52; fold change 1.3). Figure 16D: hsa-let-7b (SEQ ID NO: 66) (p-value=0.39; fold change 1.1).
Figures 17A-D show boxplot representations comparing distributions of the expression of the same statistically significant miRs as shown in Figures 16A-D, as measured by PCR, in tumor samples obtained from stage II colon cancer patients with bad or good prognosis (as defined in Figure 3A). For each miR two boxes are shown. The left box represents the group of patients with good prognosis, while the right box represents the group of patients with bad prognosis. The line in the box indicates the median value. The box contains 50% of the data and the horizontal lines and crosses (outliers) show the full range of signals in this group. Figure 17A: hsa-miR-29a (SEQ ID NO: 1) (p-value=0.004; fold change 1.8). Figure 17B: hsa-miR-29b (SEQ ID NO: 54) (p-value=0.44; fold change 1.7). Figure 17C: hsa-miR-126 (SEQ ID NO: 2) (p-value=0.11; fold change 1.3). Figure 17D: hsa-let-7b (SEQ ID NO: 66) (p-value=0.0067; fold change 1.7).
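By way of illustration only, the quantities summarized by each box in the boxplot figures above (the median line, the box containing 50% of the data, and the full range of signals) may be sketched as follows. The function name is hypothetical and the quartile convention is that of the Python standard library, which is an assumption of this sketch and not necessarily the convention used to draw the figures.

```python
from statistics import median, quantiles

def box_summary(signals):
    """Summarize one group of expression signals for a boxplot.

    Returns the first and third quartiles (the box edges, enclosing
    roughly 50% of the data), the median (the line in the box) and
    the minimum and maximum (the full range of signals).
    """
    q1, _, q3 = quantiles(signals, n=4)
    return {
        "q1": q1,
        "median": median(signals),
        "q3": q3,
        "min": min(signals),
        "max": max(signals),
    }

# Hypothetical signals, for illustration only
summary = box_summary([1, 2, 3, 4, 5, 6, 7])
```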
Figure 18A is a Kaplan-Meier plot for persistence of disease-free status (survival) of colon cancer patients (stage II) grouped by expression levels of hsa-miR-29a (SEQ ID NO: 1) (logrank p-value=0.004), as measured by microarray. The y-axis depicts fraction of survival and the x-axis depicts months of survival, with the dashed dotted line representing patients with the highest expression for this miR (n=5, log2(expression) >14.25), the dashed line depicting the intermediate scoring (n=6) and the solid line depicting the lowest scoring (n=5, log2(expression) <12.87).
Figure 18B is a Kaplan-Meier plot for persistence of disease-free status (survival) of colon cancer patients (stage II) grouped by expression levels of hsa-miR-29a (SEQ ID NO: 1) (logrank p-value=0.004), as measured by PCR. The y-axis depicts fraction of survival and the x-axis depicts months of survival, with the dashed dotted line representing patients with the highest expression for this miR (n=5, log2(expression) >22.28), the dashed line depicting the intermediate scoring (n=6) and the solid line depicting the lowest scoring (n=5, log2(expression) <21.58).
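By way of illustration only, the grouping of patients into highest, intermediate and lowest expression and the estimation of the fraction remaining disease-free in each group may be sketched as follows. This is a minimal sketch of the standard Kaplan-Meier product-limit estimator; the function names are hypothetical, and the cutoffs in the usage example are the microarray log2(expression) values quoted for Figure 18A, applied here to made-up patients.

```python
def three_groups(log2_expr, low_cut, high_cut):
    """Assign patient indices to low, intermediate or high expression."""
    groups = {"low": [], "intermediate": [], "high": []}
    for i, x in enumerate(log2_expr):
        if x > high_cut:
            groups["high"].append(i)
        elif x < low_cut:
            groups["low"].append(i)
        else:
            groups["intermediate"].append(i)
    return groups

def kaplan_meier(times, events):
    """Product-limit estimate of the disease-free fraction.

    times  : follow-up time (e.g., months) for each patient
    events : 1 if recurrence was observed, 0 if censored
    Returns a list of (time, surviving fraction) at each event time.
    """
    curve = []
    surviving = 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        recurred = sum(1 for u, e in zip(times, events) if u == t and e)
        surviving *= 1.0 - recurred / at_risk
        curve.append((t, surviving))
    return curve

# Hypothetical patients, for illustration only
groups = three_groups([14.5, 13.0, 12.5], low_cut=12.87, high_cut=14.25)
curve = kaplan_meier([5, 10, 10, 20], [1, 0, 1, 0])
```

A separate curve would be computed for the patients in each group and plotted against months of follow-up.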
DETAILED DESCRIPTION
According to the present invention, miRNA expression can serve as a novel tool for determining the prognosis of colon cancer. More particularly, it may serve for determining the risk of recurrence of said cancer.
Methods and compositions are provided for determining the prognosis of colon cancer. Other aspects of the invention will become apparent to the skilled artisan by the following description of the invention.
Before the present compositions and methods are disclosed and described, it is to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting. It must be noted that, as used in the specification and the appended claims, the singular forms "a," "an" and "the" include plural referents unless the context clearly dictates otherwise.
For the recitation of numeric ranges herein, each intervening number there between with the same degree of precision is explicitly contemplated. For example, for the range of 6-9, the numbers 7 and 8 are contemplated in addition to 6 and 9, and for the range 6.0-7.0, the numbers 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9, and 7.0 are explicitly contemplated.
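By way of illustration only, the enumeration of intervening numbers at the same degree of precision may be sketched as follows. The function name is hypothetical, and the endpoints are passed as strings so that the written precision (for example "6.0" as opposed to "6") is preserved.

```python
from decimal import Decimal

def contemplated_values(low, high):
    """List every value from low to high at the precision of the endpoints.

    low, high : endpoints written as strings, e.g. "6.0" and "7.0"
    """
    lo, hi = Decimal(low), Decimal(high)
    # The finer of the two endpoint precisions sets the step size
    exponent = min(lo.as_tuple().exponent, hi.as_tuple().exponent)
    step = Decimal(1).scaleb(exponent)
    values = []
    value = lo
    while value <= hi:
        values.append(str(value.quantize(step)))
        value += step
    return values

whole = contemplated_values("6", "9")
tenths = contemplated_values("6.0", "7.0")
```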
a. Definitions
about
As used herein, the term "about" refers to +/-10%.
antisense
The term "antisense," as used herein, refers to nucleotide sequences which are complementary to a specific DNA or RNA sequence. The term "antisense strand" is used in reference to a nucleic acid strand that is complementary to the "sense" strand. Antisense molecules may be produced by any method, including synthesis by ligating the gene(s) of interest in a reverse orientation to a viral promoter which permits the synthesis of a complementary strand. Once introduced into a cell, this transcribed strand combines with natural sequences produced by the cell to form duplexes. These duplexes then block either the further transcription or translation. In this manner, mutant phenotypes may be generated.
attached
"Attached" or "immobilized", as used herein to refer to a probe and a solid support, may mean that the binding between the probe and the solid support is sufficient to be stable under conditions of binding, washing, analysis, and removal. The binding may be covalent or non-covalent. Covalent bonds may be formed directly between the probe and the solid support or may be formed by a cross linker or by inclusion of a specific reactive group on either the solid support or the probe or both molecules. Non-covalent binding may be one or more of electrostatic, hydrophilic, and hydrophobic interactions. Included in non-covalent binding is the covalent attachment of a molecule, such as streptavidin, to the support and the non-covalent binding of a biotinylated probe to the streptavidin. Immobilization may also involve a combination of covalent and non-covalent interactions.
biological sample
"Biological sample", as used herein, may mean a sample of biological tissue or fluid that comprises nucleic acids. Such samples include, but are not limited to, tissue isolated from animals. Biological samples may also include sections of tissues such as biopsy and autopsy samples, frozen sections taken for histological purposes, blood, plasma, serum, sputum, stool, tears, mucus, urine, effusions, amniotic fluid, ascitic fluid, hair, and skin. A biological sample may also be provided by fine-needle aspiration (FNA). Biological samples also include explants and primary and/or transformed cell cultures derived from patient tissues. A biological sample may be provided by removing a sample of cells from an animal, but can also be accomplished by using previously isolated cells (e.g., isolated by another person, at another time, and/or for another purpose), or by performing the methods described herein in vivo. Archival tissues, such as those having treatment or outcome history, may also be used.
cancer prognosis
A forecast or prediction of the probable course or outcome of the cancer. As used herein, cancer prognosis includes the forecast or prediction of any one or more of the following: duration of survival of a patient susceptible to or diagnosed with a cancer, duration of disease-free survival, duration of progression-free survival of a patient susceptible to or diagnosed with a cancer, response rate in a group of patients susceptible to or diagnosed with a cancer, duration of response in a patient or a group of patients susceptible to or diagnosed with a cancer. As used herein, "prognostic for cancer" means providing a forecast or prediction of the probable course or outcome of the cancer. In some embodiments, "prognostic for cancer" comprises providing the forecast or prediction of (prognostic for) any one or more of the following: duration of survival of a patient susceptible to or diagnosed with a cancer, duration of disease-free survival, duration of progression-free survival of a patient susceptible to or diagnosed with a cancer, response rate in a group of patients susceptible to or diagnosed with a cancer, and duration of response in a patient or a group of patients susceptible to or diagnosed with a cancer, wherein any of the above may be increased or decreased.
complement
"Complement" or "complementary", as used herein to refer to a nucleic acid, may mean Watson-Crick (e.g., A-T/U and C-G) or Hoogsteen base pairing between nucleotides or nucleotide analogs of nucleic acid molecules. A full complement or fully complementary may mean 100% complementary base pairing between nucleotides or nucleotide analogs of nucleic acid molecules. In some embodiments, the complementary sequence has a reverse orientation (5'-3').
Ct
Ct signals represent the first cycle of PCR where amplification crosses a threshold (cycle threshold) of fluorescence. Accordingly, low values of Ct represent high abundance or expression levels of the microRNA.
In some embodiments the PCR Ct signal is normalized such that the normalized Ct remains inversely related to the expression level. In other embodiments the PCR Ct signal may be normalized and then inverted such that low normalized-inverted Ct represents low abundance or expression levels of the microRNA.
detection
"Detection" means detecting the presence of a component in a sample. Detection also means detecting the absence of a component. Detection also means measuring the level of a component, either quantitatively or qualitatively. Detection also means identifying or diagnosing cancer in a subject. "Early detection" means identifying or diagnosing cancer in a subject at an early stage of the disease, especially before it causes symptoms.
differential expression
"Differential expression" may mean qualitative or quantitative differences in the temporal and/or cellular gene expression patterns within and among cells and tissue. Thus, a differentially expressed gene can qualitatively have its expression altered, including an activation or inactivation, in, e.g., normal versus disease tissue. Genes may be turned on or turned off in a particular state, relative to another state thus permitting comparison of two or more states. A qualitatively regulated gene will exhibit an expression pattern within a state or cell type that may be detectable by standard techniques. Some genes will be expressed in one state or cell type, but not in both. Alternatively, the difference in expression may be quantitative, e.g., in that expression is modulated, up-regulated, resulting in an increased amount of transcript, or down-regulated, resulting in a decreased amount of transcript. The degree to which expression differs need only be large enough to quantify via standard characterization techniques such as expression arrays, quantitative reverse transcriptase PCR, northern analysis, and RNase protection.
disease-free survival
"Disease-free survival" (DFS) or disease-free survival time, as used herein, refers to the length of time after treatment for a specific disease during which a patient survives with no sign of the disease. Disease-free survival may be used in a clinical study or trial to help measure how well a new treatment works. Disease-free survival may also refer to a survival assessment where the end points are either tumor recurrence (e.g., the cancer comes back as a consequence of distant metastasis to other sites in the body) or death. The term "disease-free survival", as used herein, is defined as the time between diagnosis or surgery to treat a cancer patient and recurrence.
expression profile
"Expression profile", as used herein, may mean a genomic expression profile, e.g., an expression profile of microRNAs. Profiles may be generated by any convenient means for determining a level of a nucleic acid sequence, e.g., quantitative hybridization of microRNA, labeled microRNA, amplified microRNA, cRNA, etc., quantitative PCR, ELISA for quantification, and the like, and allow the analysis of differential gene expression between two samples. A subject or patient tumor sample, e.g., cells or collections thereof, e.g., tissues, is assayed. Samples are collected by any convenient method, as known in the art. Nucleic acid sequences of interest are nucleic acid sequences that are found to be predictive, including the nucleic acid sequences provided above, where the expression profile may include expression data for 5, 10, 20, 25, 50, 100 or more of, including all of, the listed nucleic acid sequences. The term "expression profile" may also mean measuring the abundance of the nucleic acid sequences in the measured samples.
expression ratio
"Expression ratio", as used herein, refers to relative expression levels of two or more nucleic acids as determined by detecting the relative expression levels of the corresponding nucleic acids in a biological sample.
FDR
When performing multiple statistical tests, for example in comparing the signal between two groups in multiple data features, there is an increasingly high probability of obtaining false positive results, by random differences between the groups that can reach levels that would otherwise be considered as statistically significant. In order to limit the proportion of such false discoveries, statistical significance is defined only for data features in which the differences reached a p-value (such as by a two-sided t-test) below a threshold, which is dependent on the number of tests performed and the distribution of p-values obtained in these tests. FDR or false discovery rate is the probability that one of the "significant" results was actually false.
fragment
"Fragment" is used herein to indicate a non-full length part of a nucleic acid or polypeptide. Thus, a fragment is itself also a nucleic acid or polypeptide, respectively.
gene
"Gene", as used herein, may be a natural (e.g., genomic) or synthetic gene comprising transcriptional and/or translational regulatory sequences and/or a coding region and/or non-translated sequences (e.g., introns, 5'- and 3'-untranslated sequences). The coding region of a gene may be a nucleotide sequence coding for an amino acid sequence or a functional RNA, such as tRNA, rRNA, catalytic RNA, siRNA, miRNA or antisense RNA. A gene may also be a mRNA or cDNA corresponding to the coding regions (e.g., exons and miRNA) optionally comprising 5'- or 3'-untranslated sequences linked thereto. A gene may also be an amplified nucleic acid molecule produced in vitro comprising all or a part of the coding region and/or 5'- or 3'-untranslated sequences linked thereto.
groove binder/minor groove binder (MGB)
"Groove binder" and/or "minor groove binder" may be used interchangeably and refer to small molecules that fit into the minor groove of double-stranded DNA, typically in a sequence-specific manner. Minor groove binders may be long, flat molecules that can adopt a crescent-like shape and thus fit snugly into the minor groove of a double helix, often displacing water. Minor groove binding molecules may typically comprise several aromatic rings connected by bonds with torsional freedom such as furan, benzene, or pyrrole rings. Minor groove binders may be antibiotics such as netropsin, distamycin, berenil, pentamidine and other aromatic diamidines, Hoechst 33258, SN 6999, aureolic anti-tumor drugs such as chromomycin and mithramycin, CC-1065, dihydrocyclopyrroloindole tripeptide (DPI3), 1,2-dihydro-(3H)-pyrrolo[3,2-e]indole-7-carboxylate (CDPI3), and related compounds and analogues, including those described in Nucleic Acids in Chemistry and Biology, 2d ed., Blackburn and Gait, eds., Oxford University Press, 1996, and PCT Published Application No. WO 03/078450, the contents of which are incorporated herein by reference. A minor groove binder may be a component of a primer, a probe, a hybridization tag complement, or combinations thereof. Minor groove binders may increase the Tm of the primer or probe to which they are attached, allowing such primers or probes to effectively hybridize at higher temperatures.
identity
"Identical" or "identity", as used herein in the context of two or more nucleic acids or polypeptide sequences, may mean that the sequences have a specified percentage of residues that are the same over a specified region.
The percentage may be calculated by optimally aligning the two sequences, comparing the two sequences over the specified region, determining the number of positions at which the identical residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the specified region, and multiplying the result by 100 to yield the percentage of sequence identity. In cases where the two sequences are of different lengths or the alignment produces one or more staggered ends and the specified region of comparison includes only a single sequence, the residues of the single sequence are included in the denominator but not the numerator of the calculation. When comparing DNA and RNA, thymine (T) and uracil (U) may be considered equivalent. The identity calculation may be performed manually or by using a computer sequence algorithm such as BLAST or BLAST 2.0.
label
"Label", as used herein, may mean a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical means. For example, useful labels include 32P, fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and other entities which can be made detectable. A label may be incorporated into nucleic acids and proteins at any position.
logistic regression
Logistic regression is part of a category of statistical models called generalized linear models. Logistic regression allows one to predict a discrete outcome, such as group membership, from a set of variables that may be continuous, discrete, dichotomous, or a mix of any of these. The dependent or response variable is dichotomous, for example, one of two possible types of cancer. Logistic regression models the natural log of the odds, i.e., the ratio of the probability of belonging to the first group (P) over the probability of belonging to the second group (1-P), as a linear combination of the different expression levels (in log-space) and of other explaining variables. The logistic regression output can be used as a classifier by prescribing that a case or sample will be classified into the first type if P is greater than 0.5 or 50%. Alternatively, the calculated probability P can be used as a variable in other contexts such as a 1D or 2D threshold classifier.
1D/2D threshold classifier
"1D/2D threshold classifier", as used herein, may mean an algorithm for classifying a case or sample such as a cancer sample into one of two possible types such as two types of cancer or two types of prognosis (e.g., good and bad). For a 1D threshold classifier, the decision is based on one variable and one predetermined threshold value; the sample is assigned to one class if the variable exceeds the threshold and to the other class if the variable is less than the threshold. A 2D threshold classifier is an algorithm for classifying into one of two types based on the values of two variables. A score may be calculated as a function (usually a continuous function) of the two variables; the decision is then reached by comparing the score to the predetermined threshold, similar to the 1D threshold classifier.
mismatch
"Mismatch" means a nucleobase of a first nucleic acid that is not capable of pairing with a nucleobase at a corresponding position of a second nucleic acid.
negative predictive value
"Negative predictive value" (NPV), as used herein, may mean a statistical measure of how well a binary classification test correctly identifies the absence of a condition, for example the probability that a patient does not have a specific condition, given a negative diagnosis. The NPV for class A is the proportion of cases that are correctly diagnosed as belonging to class "not A" by the test out of the cases that are diagnosed as belonging to class "not A", as determined by some absolute or gold standard.
nucleic acid
"Nucleic acid" or "oligonucleotide" or "polynucleotide", as used herein, may mean at least two nucleotides covalently linked together. The depiction of a single strand also defines the sequence of the complementary strand. Thus, a nucleic acid also encompasses the complementary strand of a depicted single strand. Many variants of a nucleic acid may be used for the same purpose as a given nucleic acid. Thus, a nucleic acid also encompasses substantially identical nucleic acids and complements thereof. A single strand provides a probe that may hybridize to a target sequence under stringent hybridization conditions. Thus, a nucleic acid also encompasses a probe that hybridizes under stringent hybridization conditions.
Nucleic acids may be single-stranded or double-stranded, or may contain portions of both double-stranded and single-stranded sequence. The nucleic acid may be DNA, both genomic and cDNA, RNA, or a hybrid, where the nucleic acid may contain combinations of deoxyribo- and ribonucleotides, and combinations of bases including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine and isoguanine. Nucleic acids may be obtained by chemical synthesis methods or by recombinant methods.
A nucleic acid will generally contain phosphodiester bonds, although nucleic acid analogs may be included that may have at least one different linkage, e.g., phosphoramidate, phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite linkages and peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with positive backbones, non-ionic backbones, and non-ribose backbones, including those described in U.S. Pat. Nos. 5,235,033 and 5,034,506, which are incorporated herein by reference. Nucleic acids containing one or more non-naturally occurring or modified nucleotides are also included within one definition of nucleic acids. The modified nucleotide analog may be located, for example, at the 5'-end and/or the 3'-end of the nucleic acid molecule. Representative examples of nucleotide analogs may be selected from sugar- or backbone-modified ribonucleotides. It should be noted, however, that nucleobase-modified ribonucleotides, i.e., ribonucleotides containing a non-naturally occurring nucleobase instead of a naturally occurring nucleobase, are also suitable, such as uridine or cytidine modified at the 5-position, e.g., 5-(2-amino)propyl uridine, 5-bromo uridine; adenosine and guanosine modified at the 8-position, e.g., 8-bromo guanosine; deaza nucleotides, e.g., 7-deaza-adenosine; and O- and N-alkylated nucleotides, e.g., N6-methyl adenosine. The 2'-OH-group may be replaced by a group selected from H, OR, R, halo, SH, SR, NH2, NHR, NR2 or CN, wherein R is C1-C6 alkyl, alkenyl or alkynyl and halo is F, Cl, Br or I. Modified nucleotides also include nucleotides conjugated with cholesterol through, e.g., a hydroxyprolinol linkage as described in Krutzfeldt et al., Nature 438:685-689 (2005), Soutschek et al., Nature 432:173-178 (2004), and U.S. Patent Publication No. 20050107325, which are incorporated herein by reference. Additional modified nucleotides and nucleic acids are described in U.S. Patent Publication No. 20050182005, which is incorporated herein by reference.
Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to increase the stability and half-life of such molecules in physiological environments, to enhance diffusion across cell membranes, or as probes on a biochip. The backbone modification may also enhance resistance to degradation, such as in the harsh endocytic environment of cells. The backbone modification may also reduce nucleic acid clearance by hepatocytes, such as in the liver and kidney. Mixtures of naturally occurring nucleic acids and analogs may be made; alternatively, mixtures of different nucleic acid analogs may be made.
positive predictive value
"Positive predictive value" (PPV), as used herein, may mean a statistical measure of how well a binary classification test correctly identifies a condition, for example the probability that a patient has a specific condition, given a positive diagnosis. The PPV for class A is the proportion of cases that are correctly diagnosed as belonging to class "A" by the test out of the cases that are diagnosed as belonging to class "A", as determined by some absolute or gold standard.
probe
"Probe", as used herein, may mean an oligonucleotide capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation. Probes may bind target sequences lacking complete complementarity with the probe sequence depending upon the stringency of the hybridization conditions. There may be any number of base pair mismatches which will interfere with hybridization between the target sequence and the single stranded nucleic acids described herein. However, if the number of mutations is so great that no hybridization can occur under even the least stringent of hybridization conditions, the sequence is not a complementary target sequence. A probe may be single stranded or partially single and partially double stranded. The strandedness of the probe is dictated by the structure, composition, and properties of the target sequence. Probes may be directly labeled or indirectly labeled such as with biotin to which a streptavidin complex may later bind.
reference expression profile
As used herein, the term "reference expression profile" means a value that statistically correlates to a particular outcome when compared to an assay result. In some embodiments, the reference value is determined from statistical analysis of studies that compare microRNA expression with known clinical outcomes. In some embodiments, the reference value may be based on the abundance of the nucleic acids. In some embodiments, the reference value may be based on a threshold score value or a cutoff score value. Typically, a reference value will be a threshold above which one outcome is more probable and below which an alternative outcome is more probable. As used herein, the phrase "reference expression profile" refers to a criterion expression profile to which measured values are compared in order to determine the prognosis of a subject with colon cancer.
reference value
As used herein, the term "reference value" means a value that statistically correlates to a particular outcome when compared to an assay result. In preferred embodiments, the reference value is determined from statistical analysis of studies that compare microRNA expression with known clinical outcomes. The reference value may be a threshold score value or a cutoff score value. Typically, a reference value will be a threshold above which one outcome is more probable and below which an alternative outcome is more probable.
sensitivity
"Sensitivity", as used herein, may mean a statistical measure of how well a binary classification test correctly identifies a condition, for example how frequently it correctly classifies a cancer into the correct type out of two possible types. The sensitivity for class A is the proportion of cases that are determined to belong to class "A" by the test out of the cases that are in class "A", as determined by some absolute or gold standard.
specificity
"Specificity", as used herein, may mean a statistical measure of how well a binary classification test correctly identifies a condition, for example how frequently it correctly classifies a cancer into the correct type out of two possible types. The specificity for class A is the proportion of cases that are determined to belong to class "not A" by the test out of the cases that are in class "not A", as determined by some absolute or gold standard.
stage of cancer
As used herein, the term "stage of cancer" refers to a numerical measurement of the level of advancement of a cancer. Criteria used to determine the stage of a cancer include, but are not limited to, the depth of tumor invasion, whether the tumor has spread to other parts of the body and where the cancer has spread (e.g., within the same organ or region of the body or to another organ).
stringent hybridization conditions
"Stringent hybridization conditions", as used herein, may mean conditions under which a first nucleic acid sequence (e.g., probe) will hybridize to a second nucleic acid sequence (e.g., target), such as in a complex mixture of nucleic acids. Stringent conditions are sequence-dependent and will be different in different circumstances. Stringent conditions may be selected to be about 5-10°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm may be the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the probes complementary to the target hybridize to the target sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are occupied at equilibrium). Stringent conditions may be those in which the salt concentration is less than about 1.0 M sodium ion, such as about 0.01-1.0 M sodium ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes (e.g., about 10-50 nucleotides) and at least about 60°C for long probes (e.g., greater than about 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. For selective or specific hybridization, a positive signal may be at least 2 to 10 times background hybridization.
Exemplary stringent hybridization conditions include the following: 50% formamide, 5x SSC, and 1% SDS, incubating at 42°C, or, 5x SSC, 1% SDS, incubating at 65°C, with wash in 0.2x SSC, and 0.1% SDS at 65°C.
subject
As used herein, the term "subject" refers to a mammal, including both human and other mammals. The methods of the present invention are preferably applied to human subjects.
substantially complementary
"Substantially complementary", as used herein, may mean that a first sequence is at least 60%-99% identical to the complement of a second sequence over a region of 8-50 or more nucleotides, or that the two sequences hybridize under stringent hybridization conditions.
substantially identical
"Substantially identical", as used herein, may mean that a first and second sequence are at least 60%-99% identical over a region of 8-50 or more nucleotides or amino acids, or with respect to nucleic acids, if the first sequence is substantially complementary to the complement of the second sequence.
survival rate
As used herein, the term "survival rate" refers to the percentage of people in a study or treatment group who are alive for a certain period of time after they were diagnosed with or treated for a disease, such as cancer. The overall survival rate is often stated as a five-year survival rate, which is the percentage of people in a study or treatment group who are alive five years after diagnosis or treatment.
therapeutically effective amount
As used herein, the term "therapeutically effective amount" or "therapeutically efficient", as to a drug dosage, refers to a dosage that provides the specific pharmacological response for which the drug is administered in a significant number of subjects in need of such treatment. The "therapeutically effective amount" may vary according to, for example, the physical condition of the patient, the age of the patient and the severity of the disease.
tissue sample
As used herein, a tissue sample is tissue obtained from a tissue biopsy using methods well known to those of ordinary skill in the related medical arts. The phrase "suspected of being cancerous" as used herein means a cancer tissue sample believed by one of ordinary skill in the medical arts to contain cancerous cells. Methods for obtaining the sample from the biopsy include gross apportioning of a mass, microdissection, laser-based microdissection, or other art-known cell-separation methods.
treat
"Treat" or "treating", used herein when referring to protection of a subject from a condition, may mean preventing, suppressing, repressing, or eliminating the condition.
Preventing the condition involves administering a composition described herein to a subject prior to onset of the condition. Suppressing the condition involves administering the composition to a subject after induction of the condition but before its clinical appearance. Repressing the condition involves administering the composition to a subject after clinical appearance of the condition such that the condition is reduced or prevented from worsening. Elimination of the condition involves administering the composition to a subject after clinical appearance of the condition such that the subject no longer suffers from the condition.
tumor
"Tumor", as used herein, refers to all neoplastic cell growth and proliferation, whether malignant or benign, and all pre-cancerous and cancerous cells and tissues.
variant
"Variant", used herein to refer to a nucleic acid, may mean (i) a portion of a referenced nucleotide sequence; (ii) the complement of a referenced nucleotide sequence or portion thereof; (iii) a nucleic acid that is substantially identical to a referenced nucleic acid or the complement thereof; or (iv) a nucleic acid that hybridizes under stringent conditions to the referenced nucleic acid, complement thereof, or a sequence substantially identical thereto.
b. MicroRNA and its processing
A gene coding for a miRNA may be transcribed leading to production of a miRNA precursor known as the pri-miRNA. The pri-miRNA may be part of a polycistronic RNA comprising multiple pri-miRNAs. The pri-miRNA may form a hairpin with a stem and loop. The stem may comprise mismatched bases.
The hairpin structure of the pri-miRNA may be recognized by Drosha, which is an RNase III endonuclease. Drosha may recognize terminal loops in the pri-miRNA and cleave approximately two helical turns into the stem to produce a 30-200 nt precursor known as the pre-miRNA. Drosha may cleave the pri-miRNA with a staggered cut typical of RNase III endonucleases yielding a pre-miRNA stem loop with a 5' phosphate and ~2 nucleotide 3' overhang. Approximately one helical turn of stem (~10 nucleotides) extending beyond the Drosha cleavage site may be essential for efficient processing. The pre-miRNA may then be actively transported from the nucleus to the cytoplasm by Ran-GTP and the export receptor Exportin-5.
The pre-miRNA may be recognized by Dicer, which is also an RNase III endonuclease. Dicer may recognize the double-stranded stem of the pre-miRNA. Dicer may also recognize the 5' phosphate and 3' overhang at the base of the stem loop. Dicer may cleave off the terminal loop two helical turns away from the base of the stem loop leaving an additional 5' phosphate and ~2 nucleotide 3' overhang. The resulting siRNA-like duplex, which may comprise mismatches, comprises the mature miRNA and a similar-sized fragment known as the miRNA*. The miRNA and miRNA* may be derived from opposing arms of the pri-miRNA and pre-miRNA. MiRNA* sequences may be found in libraries of cloned miRNAs but typically at lower frequency than the miRNAs.
Although initially present as a double-stranded species with miRNA*, the miRNA may eventually become incorporated as a single-stranded RNA into a ribonucleoprotein complex known as the RNA-induced silencing complex (RISC). Various proteins can form the RISC, which can lead to variability in specificity for miRNA/miRNA* duplexes, binding site of the target gene, activity of miRNA (repress or activate), and which strand of the miRNA/miRNA* duplex is loaded into the RISC.
When the miRNA strand of the miRNA:miRNA* duplex is loaded into the RISC, the miRNA* may be removed and degraded. The strand of the miRNA:miRNA* duplex that is loaded into the RISC may be the strand whose 5' end is less tightly paired. In cases where both ends of the miRNA:miRNA* have roughly equivalent 5' pairing, both miRNA and miRNA* may have gene silencing activity.
The RISC may identify target nucleic acids based on high levels of complementarity between the miRNA and the mRNA, especially by nucleotides 2-8 of the miRNA. Only one case has been reported in animals where the interaction between the miRNA and its target was along the entire length of the miRNA. This was shown for miR-196 and Hox B8, and it was further shown that miR-196 mediates the cleavage of the Hox B8 mRNA (Yekta et al., Science 2004; 304:594-596). Otherwise, such interactions are known only in plants (Bartel & Bartel, Plant Physiol 2003; 132:709-717). A number of studies have looked at the base-pairing requirement between miRNA and its mRNA target for achieving efficient inhibition of translation (reviewed by Bartel, Cell 2004; 116:281-297). In mammalian cells, the first 8 nucleotides of the miRNA may be important (Doench & Sharp, Genes Dev 2004; 18:504-511). However, other parts of the microRNA may also participate in mRNA binding. Moreover, sufficient base pairing at the 3' can compensate for insufficient pairing at the 5' (Brennecke et al., PLoS Biol 2005; 3:e85). Computational studies, in which miRNA binding on whole genomes is analyzed, have suggested a specific role for bases 2-7 at the 5' of the miRNA in target binding, but the role of the first nucleotide, found usually to be "A", was also recognized (Lewis et al., Cell 2005; 120:15-20). Similarly, nucleotides 1-7 or 2-8 were used by Krek et al. (Nat Genet 2005; 37:495-500) to identify and validate targets.
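By way of illustration only, the role of nucleotides 2-8 of the miRNA in target recognition may be sketched as a search for perfect Watson-Crick matches to that region within an mRNA. The following Python sketch is not a target prediction method described herein; the function names and the toy sequences are hypothetical, and G:U wobble pairing, 3' compensatory pairing and site context are deliberately ignored.

```python
# Illustrative sketch only: locate perfect seed-match sites in an mRNA.
# Sequences are written 5'->3' in the RNA alphabet (A, C, G, U).

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_sites(mirna: str, mrna: str, start: int = 2, end: int = 8) -> list:
    """Return 0-based mRNA positions that pair perfectly with the miRNA seed.

    `start` and `end` are 1-based, inclusive miRNA positions; the default
    2-8 window follows the text above. Overlapping sites are reported.
    """
    seed = mirna[start - 1:end]
    site = reverse_complement(seed)  # sequence expected in the mRNA
    positions = []
    index = mrna.find(site)
    while index != -1:
        positions.append(index)
        index = mrna.find(site, index + 1)
    return positions


# Toy (hypothetical) sequences, not taken from Table 1 or any database.
toy_mirna = "UACGGAUCCAGUUCGAAGCUGA"
toy_mrna = "AAAGAUCCGUAAACCCGAUCCGUGG"
print(seed_sites(toy_mirna, toy_mrna))
```

Restricting the window to positions 2-7 or 1-7, as in the computational studies cited above, only requires changing the `start` and `end` arguments.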
The target sites in the mRNA may be in the 5' UTR, the 3' UTR or in the coding region. Interestingly, multiple miRNAs may regulate the same mRNA target by recognizing the same or multiple sites. The presence of multiple miRNA binding sites in most genetically identified targets may indicate that the cooperative action of multiple RISCs provides the most efficient translational inhibition. miRNAs may direct the RISC to down-regulate gene expression by either of two mechanisms: mRNA cleavage or translational repression. The miRNA may specify cleavage of the mRNA if the mRNA has a certain degree of complementarity to the miRNA. When a miRNA guides cleavage, the cut may be between the nucleotides pairing to residues 10 and 11 of the miRNA. Alternatively, the miRNA may repress translation if the mRNA does not have the requisite degree of complementarity to the miRNA. Translational repression may be more prevalent in animals since animals may have a lower degree of complementarity between the miRNA and binding site.
It should be noted that there may be variability in the 5' and 3' ends of any pair of miRNA and miRNA*. This variability may be due to variability in the enzymatic processing of Drosha and Dicer with respect to the site of cleavage. Variability at the 5' and 3' ends of miRNA and miRNA* may also be due to mismatches in the stem structures of the pri-miRNA and pre-miRNA. The mismatches of the stem strands may lead to a population of different hairpin structures. Variability in the stem structures may also lead to variability in the products of cleavage by Drosha and Dicer.
c. Nucleic Acids
Nucleic acids are provided herein. The nucleic acid may comprise the sequence of SEQ ID NOS: 1-100 or variants thereof. The variant may be a complement of the referenced nucleotide sequence. The variant may also be a nucleotide sequence that is substantially identical to the referenced nucleotide sequence or the complement thereof. The variant may also be a nucleotide sequence which hybridizes under stringent conditions to the referenced nucleotide sequence, complements thereof, or nucleotide sequences substantially identical thereto.
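By way of illustration only, the "substantially identical" criterion for a variant (at least 60%-99% identity over a region of 8-50 or more nucleotides, as defined above) may be sketched as an ungapped window comparison. The comparison method is an assumption, since no alignment algorithm is specified herein; the function names, the default region length and the toy sequences are hypothetical.

```python
# Illustrative sketch only: ungapped percent identity between two sequences.
# No alignment algorithm is specified in the text; gaps are not modeled here.


def percent_identity(first: str, second: str) -> float:
    """Percent of matching positions between two equal-length sequences."""
    if len(first) != len(second) or not first:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(first, second))
    return 100.0 * matches / len(first)


def max_window_identity(first: str, second: str, region: int = 8) -> float:
    """Highest ungapped identity over any pair of `region`-long windows."""
    if len(first) < region or len(second) < region:
        raise ValueError("both sequences must be at least `region` long")
    best = 0.0
    for i in range(len(first) - region + 1):
        for j in range(len(second) - region + 1):
            window_identity = percent_identity(
                first[i:i + region], second[j:j + region]
            )
            best = max(best, window_identity)
    return best


def is_substantially_identical(first: str, second: str,
                               min_identity: float = 60.0,
                               region: int = 8) -> bool:
    """Apply a chosen identity cutoff (60% is the low end of the stated range)."""
    return max_window_identity(first, second, region) >= min_identity


# Toy (hypothetical) sequences.
print(percent_identity("ACGUACGU", "ACGUACGA"))
print(is_substantially_identical("ACGUACGUAA", "GGACGUACGU"))
```

A stricter cutoff or a longer region within the stated ranges can be applied by passing different `min_identity` and `region` values.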
The nucleic acid may have a length of from 10 to 250 nucleotides. The nucleic acid may have a length of at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 125, 150, 175, 200 or 250 nucleotides. The nucleic acid may be synthesized or expressed in a cell (in vitro or in vivo) using a synthetic gene described herein. The nucleic acid may be synthesized as a single strand molecule and hybridized to a substantially complementary nucleic acid to form a duplex. The nucleic acid may be introduced to a cell, tissue or organ in a single- or double-stranded form or capable of being expressed by a synthetic gene using methods well known to those skilled in the art, including as described in U.S. Patent No. 6,506,559 which is incorporated herein by reference.
Table 1: miR name and miR and hairpin sequence identification numbers
[Table 1 is reproduced as images (imgf000031_0001, imgf000032_0001) in the original publication.]
The microRNA name is the miRBase registry name (release 10).
MID-00291, MID-00466, MID-00595 and MID-00689 were cloned at Rosetta Genomics.
i. Nucleic acid complex
The nucleic acid may further comprise one or more of the following: a peptide, a protein, a RNA-DNA hybrid, an antibody, an antibody fragment, a Fab fragment, and an aptamer. The nucleic acid may also comprise a protamine-antibody fusion protein as described in Song et al. (Nature Biotechnology 2005;23:709-17) and Rossi (Nature Biotechnology 2005;23:682-4), the contents of which are incorporated herein by reference. The protamine may readily interact with the nucleic acid. The protamine may comprise the entire 51-amino acid protamine peptide or a fragment thereof. The protamine may be covalently attached to another protein, which may be a Fab. The Fab may bind to a receptor expressed on a cell surface.
ii. Pri-miRNA
The nucleic acid may comprise a sequence of a pri-miRNA or a variant thereof. The pri-miRNA sequence may comprise from 45-30,000, 50-25,000, 100-20,000, 1,000-1,500 or 80-100 nucleotides. The sequence of the pri-miRNA may comprise a pre-miRNA, miRNA and miRNA*, as set forth herein, and variants thereof. The sequence of the pri-miRNA may comprise the sequence of SEQ ID NOS: 1-75 or variants thereof. The pri-miRNA may form a hairpin structure. The hairpin may comprise first and second nucleic acid sequences that are substantially complementary. The first and second nucleic acid sequence may be from 37-50 nucleotides. The first and second nucleic acid sequence may be separated by a third sequence of from 8-12 nucleotides. The hairpin structure may have a free energy less than -25 Kcal/mole, as calculated by the Vienna algorithm, with default parameters, as described in Hofacker et al. (Monatshefte f. Chemie 1994;125:167-188), the contents of which are incorporated herein. The hairpin may comprise a terminal loop of 4-20, 8-12 or 10 nucleotides. The pri-miRNA may comprise at least 19% adenosine nucleotides, at least 16% cytosine nucleotides, at least 23% thymine nucleotides and at least 19% guanine nucleotides.
iii. Pre-miRNA
The nucleic acid may also comprise a sequence of a pre-miRNA or a variant thereof. The pre-miRNA sequence may comprise from 45-200, 60-80 or 60-70 nucleotides. The sequence of the pre-miRNA may comprise a miRNA and a miRNA* as set forth herein. The sequence of the pre-miRNA may also be that of a pri-miRNA excluding from 0-160 nucleotides from the 5' and 3' ends of the pri-miRNA. The sequence of the pre-miRNA may comprise the sequence of SEQ ID NOS: 1-75 or variants thereof.
iv. miRNA
The nucleic acid may also comprise a sequence of a miRNA (including miRNA*) or a variant thereof. The miRNA sequence may comprise from 13-33, 18-24 or 21-23 nucleotides. The miRNA may also comprise a total of at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40 nucleotides. The sequence of the miRNA may be the first 13-33 nucleotides of the pre-miRNA. The sequence of the miRNA may also be the last 13-33 nucleotides of the pre-miRNA. The sequence of the miRNA may comprise the sequence of SEQ ID NOS: 1-20, 45-54, 66-70 or variants thereof.
v. Anti-miRNA
The nucleic acid may also comprise a sequence of an anti-miRNA that is capable of blocking the activity of a miRNA or miRNA*, such as by binding to the pri-miRNA, pre-miRNA, miRNA or miRNA* (e.g., antisense or RNA silencing), or by binding to the target binding site. The anti-miRNA may comprise a total of 5-100 or 10-60 nucleotides. The anti-miRNA may also comprise a total of at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40 nucleotides. The sequence of the anti-miRNA may comprise (a) at least 5 nucleotides that are substantially identical or complementary to the 5' of a miRNA and at least 5-12 nucleotides that are substantially complementary to the flanking regions of the target site from the 5' end of the miRNA, or (b) at least 5-12 nucleotides that are substantially identical or complementary to the 3' of a miRNA and at least 5 nucleotides that are substantially complementary to the flanking region of the target site from the 3' end of the miRNA. The sequence of the anti-miRNA may comprise the complement of SEQ ID NOS: 1-75, or variants thereof.
vi. Binding Site of Target
The nucleic acid may also comprise a sequence of a target miRNA binding site, or a variant thereof. The target site sequence may comprise a total of 5-100 or 10-60 nucleotides. The target site sequence may also comprise a total of at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62 or 63 nucleotides. The target site sequence may comprise at least 5 nucleotides of the complementary sequence of SEQ ID NOS: 1-75.
d. Synthetic Gene
A synthetic gene is also provided comprising a nucleic acid described herein operably linked to a transcriptional and/or translational regulatory sequence. The synthetic gene may be capable of modifying the expression of a target gene with a binding site for a nucleic acid described herein. Expression of the target gene may be modified in a cell, tissue or organ. The synthetic gene may be synthesized or derived from naturally-occurring genes by standard recombinant techniques. The synthetic gene may also comprise terminators at the 3'-end of the transcriptional unit of the synthetic gene sequence. The synthetic gene may also comprise a selectable marker.
e. Probes
A probe is also provided comprising a nucleic acid described herein. Probes may be used for screening and diagnostic methods, as outlined below. The probe may be attached or immobilized to a solid substrate, such as a biochip.
The probe may have a length of from 8 to 500, 10 to 100 or 20 to 60 nucleotides. The probe may also have a length of at least 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140, 160, 180, 200, 220, 240, 260, 280 or 300 nucleotides. The probe may further comprise a linker sequence of from 10-60 nucleotides.
f. Biochip
A biochip is also provided. The biochip may comprise a solid substrate comprising an attached probe or plurality of probes described herein. The probes may be capable of hybridizing to a target sequence under stringent hybridization conditions. The probes may be attached at spatially defined addresses on the substrate. More than one probe per target sequence may be used, with either overlapping probes or probes to different sections of a particular target sequence. The probes may be capable of hybridizing to target sequences associated with a single disorder appreciated by those in the art. The probes may either be synthesized first, with subsequent attachment to the biochip, or may be directly synthesized on the biochip.
The solid substrate may be a material that may be modified to contain discrete individual sites appropriate for the attachment or association of the probes and is amenable to at least one detection method. Representative examples of substrates include glass and modified or functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon, etc.), polysaccharides, nylon or nitrocellulose, resins, silica or silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses and plastics. The substrates may allow optical detection without appreciably fluorescing.
The substrate may be planar, although other configurations of substrates may be used as well. For example, probes may be placed on the inside surface of a tube, for flow-through sample analysis to minimize sample volume. Similarly, the substrate may be flexible, such as a flexible foam, including closed cell foams made of particular plastics.
The biochip and the probe may be derivatized with chemical functional groups for subsequent attachment of the two. For example, the biochip may be derivatized with a chemical functional group including, but not limited to, amino groups, carboxyl groups, oxo groups or thiol groups. Using these functional groups, the probes may be attached using functional groups on the probes either directly or indirectly using a linker. The probes may be attached to the solid support by either the 5' terminus, 3' terminus, or via an internal nucleotide.
The probe may also be attached to the solid support non-covalently. For example, biotinylated oligonucleotides can be made, which may bind to surfaces covalently coated with streptavidin, resulting in attachment. Alternatively, probes may be synthesized on the surface using techniques such as photopolymerization and photolithography.
g. Diagnosis
A method of diagnosis is also provided. The method comprises detecting a differential expression level of colon cancer-associated nucleic acid in a biological sample. The sample may be derived from a patient. Diagnosis of a disease state in a patient may allow for prognosis and selection of therapeutic and follow-up strategy. Furthermore, the developmental stage of cells may be classified by determining temporally expressed colon cancer-associated nucleic acids.
In situ hybridization of labeled probes to tissue sections may be performed. When comparing the fingerprints between an individual and a standard, the skilled artisan can make a diagnosis, a prognosis, or a prediction based on the findings. It is further understood that the nucleic acids which indicate the diagnosis may differ from those which indicate the prognosis and molecular profiling of the condition of the cells may lead to distinctions between responsive or refractory conditions or may be predictive of outcomes.
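By way of illustration only, the comparison of a measured expression value to a reference may be sketched using the logistic regression and 1D/2D threshold classifiers defined above. The following Python sketch is not the classifier used in the Examples; the function names, the class labels, any coefficient values supplied by the caller, and the rule for a value exactly equal to the threshold are assumptions made for illustration.

```python
import math
from typing import Callable, Sequence


def logistic_probability(log_expression: Sequence[float],
                         coefficients: Sequence[float],
                         intercept: float) -> float:
    """Probability P of belonging to the first class.

    The log-odds ln(P / (1 - P)) is modeled as a linear combination of
    log-space expression levels; the coefficients are hypothetical inputs.
    """
    if len(log_expression) != len(coefficients):
        raise ValueError("one coefficient is required per variable")
    log_odds = intercept + sum(
        c * x for c, x in zip(coefficients, log_expression)
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


def classify_1d(value: float, threshold: float,
                above: str = "first class",
                below: str = "second class") -> str:
    """1D threshold classifier: one variable, one predetermined threshold.

    A value exactly equal to the threshold is assigned to `below`;
    this tie rule is an assumption, the text does not specify one.
    """
    return above if value > threshold else below


def classify_2d(x: float, y: float,
                score: Callable[[float, float], float],
                threshold: float,
                above: str = "first class",
                below: str = "second class") -> str:
    """2D threshold classifier: a score of two variables versus a threshold."""
    return classify_1d(score(x, y), threshold, above, below)


# Logistic output used as a classifier with the P > 0.5 rule.
p = logistic_probability([1.2, -0.4], [0.8, 1.5], -0.1)
print(classify_1d(p, 0.5))
```

The same `classify_1d` function serves both uses named in the definitions: applied to P with a 0.5 cutoff it reproduces the logistic classifier, and applied to a single expression value with a reference value it acts as a 1D threshold classifier.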
h. Biomarkers
Biomarkers are also provided. One type of cancer screening test involves the detection of a biomarker, such as a tumor marker, in a fluid or tissue obtained from a patient. Tumor markers are substances produced by cancer cells that are not typically produced by normal cells. These substances generally can be detected in the body fluids or tissues of patients with cancer. Another important use for tumor markers is for monitoring patients being treated for advanced cancer. Measuring tumor markers for this purpose can be less invasive, less time-consuming, as well as less expensive, than other complicated tests, to determine if a therapy is reducing the cancer.
A further important use for tumor markers is for determining a prognosis of survival of a cancer patient. Such prognostic methods can be used to identify surgically treated patients likely to experience cancer recurrence so that they can be offered additional therapeutic options.
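By way of illustration only, the performance of such a prognostic test against a gold standard may be summarized by the sensitivity, specificity, PPV and NPV defined above. The following Python sketch assumes a two-class setting in which class "A" is the positive class; the function name and the counts in the usage example are hypothetical and are not results reported herein.

```python
# Illustrative sketch only: binary test metrics from confusion-matrix counts.
# tp/fn: class "A" cases called "A" / "not A" by the test.
# tn/fp: class "not A" cases called "not A" / "A" by the test.


def binary_test_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return sensitivity, specificity, PPV and NPV for class "A"."""
    if min(tp, fp, tn, fn) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0 or tn + fn == 0:
        raise ValueError("each metric needs a non-zero denominator")
    return {
        # Cases called "A" out of all cases truly in class "A".
        "sensitivity": tp / (tp + fn),
        # Cases called "not A" out of all cases truly in class "not A".
        "specificity": tn / (tn + fp),
        # Cases truly in class "A" out of all cases called "A".
        "ppv": tp / (tp + fp),
        # Cases truly in class "not A" out of all cases called "not A".
        "npv": tn / (tn + fn),
    }


# Hypothetical counts, chosen only to exercise the function.
print(binary_test_metrics(tp=8, fp=5, tn=15, fn=2))
```

Unlike sensitivity and specificity, PPV and NPV depend on the proportion of each class in the tested group, so values computed on one cohort may not carry over to a population with a different recurrence rate.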
i. Kits
A kit is also provided and may comprise a nucleic acid described herein together with any or all of the following: assay reagents, buffers, probes and/or primers, and sterile saline or another pharmaceutically acceptable emulsion and suspension base. In addition, the kits may include instructional materials containing directions (e.g., protocols) for the practice of the methods described herein.
For example, the kit may be a kit for the amplification, detection, identification or quantification of a target nucleic acid sequence. The kit may comprise a poly(T) primer, a forward primer, a reverse primer, and a probe.
Having now generally described the invention, the same will be more readily understood through reference to the following examples, which are provided by way of illustration and are not intended to be limiting of the present invention.
EXAMPLES
Example 1
Materials and Methods
a. Biological samples
Colon adenocarcinoma tumor specimens (formalin-fixed, paraffin-embedded, FFPE) obtained from colon cancer patients stages I-III, without adjuvant therapy and with at least 3 years of follow-up, were used for this research. Total RNA enriched in microRNA was isolated from the FFPE tumor specimens, and all RNAs extracted were hybridized onto microarrays according to the RNA extraction and array platform protocols described below. Good prognosis was defined as no recurrence within 3 years following surgery (n=92, stage I: 46 samples, stage II: 46 samples). Bad prognosis was defined as recurrence within 3 years following surgery (n=24, stage I: 8 samples, stage II: 16 samples).
b. RNA extraction
Total RNA was isolated from seven to ten 10-μm-thick FFPE tissue sections using the miR extraction protocol developed at Rosetta Genomics. Briefly, the sample was incubated a few times in xylene at 57°C to remove excess paraffin, followed by ethanol washes. Proteins were degraded by proteinase K solution at 45°C for a few hours. The RNA was extracted with acid phenol:chloroform, followed by ethanol precipitation and DNAse digestion. Total RNA quantity and quality were checked by spectrophotometer (Nanodrop ND-1000).
c. Microarray platform
Custom microarrays were produced by printing DNA oligonucleotide probes representing 688 human microRNAs. Each probe, printed in triplicate, carried up to a 22-nt linker at the 3' end of the microRNA's complement sequence, in addition to an amine group used to couple the probes to coated glass slides. Each probe (20 μM) was dissolved in 2X SSC + 0.0035% SDS and spotted in triplicate on Schott Nexterion® Slide E-coated microarray slides using a Genomic Solutions® BioRobotics MicroGrid II according to the MicroGrid manufacturer's directions. Fifty-four negative control probes were designed using the sense sequences of different microRNAs. Two groups of positive control probes were designed to hybridize to the microarray: (i) synthetic small RNAs were spiked into the RNA before labeling to verify labeling efficiency; and (ii) probes for abundant small RNAs (e.g., small nuclear RNAs (U43, U49, U24, Z30, U6, U48, U44), 5.8s and 5s ribosomal RNA) were spotted on the array to verify RNA quality.
The slides were blocked in a solution containing 50 mM ethanolamine, 1 M Tris (pH 9.0) and 0.1% SDS for 20 min at 50°C, then thoroughly rinsed with water and spun dry.
d. Cy-dye labeling of microRNA for microarray
Five μg of total RNA were labeled by ligation (Thomson et al., Nature Methods 2004, 1:47-53) of an RNA-linker, p-rCrU-Cy/dye (Dharmacon), to the 3'-end with Cy3 or Cy5. The labeling reaction contained total RNA, spikes (0.1-20 fmoles), 300 ng RNA-linker-dye, 15% DMSO, 1x ligase buffer and 20 units of T4 RNA ligase (NEB) and proceeded at 4°C for 1 h, followed by 1 h at 37°C. The labeled RNA was mixed with 3x hybridization buffer (Ambion), heated to 95°C for 3 min and then added on top of the miRdicator™ array. Slides were hybridized 12-16 h at 42°C, followed by two washes at room temperature with 1x SSC and 0.2% SDS and a final wash with 0.1x SSC.
Arrays were scanned using an Agilent Microarray Scanner Bundle G2565BA (resolution of 10 μm at 100% power). Array images were analyzed using SpotReader software (Niles Scientific).
e. RT-PCR
Total RNA was incubated in the presence of poly (A) polymerase (Poly (A) Polymerase NEB-M0276L), NEB 10x poly A buffer (B02765) and 10 mg ATP for 1 hour at 37°C. Then, using an oligodT primer harboring a consensus sequence, reverse transcription was performed on polyadenylated RNA using Superscript II RT (Invitrogen, Carlsbad, CA). Next, the cDNA was amplified by RT-PCR; this reaction contained a microRNA-specific forward primer, a TaqMan (MGB) probe complementary to the 3' end of the specific microRNA sequence, as well as to part of the polyA adaptor sequence, and a universal reverse primer complementary to the consensus 3' sequence of the oligodT tail. Sequences of MGB probes and forward primers are shown below in Table 2. hsa-miR-214 (SEQ ID NO: 67), hsa-miR-185 (SEQ ID NO: 68), hsa-miR-141 (SEQ ID NO: 69) and hsa-miR-221 (SEQ ID NO: 70) served as miR normalizers for the PCR validation.
The cycle threshold (CT, the PCR cycle at which probe signal reaches the threshold) was determined for each microRNA. To allow comparison with results from the microarray, each CT value was subtracted from 50. This 50-CT expression value for each microRNA for each patient was compared with the signal obtained by the microarray method.
Table 2: Sequences used for RT-PCR validation
Figure imgf000040_0001
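The 50-CT transformation described for the RT-PCR data can be illustrated with a minimal sketch. The function name and the example CT values are hypothetical and are not taken from the study; only the subtraction from 50 comes from the text.

```python
def inverted_ct(ct_values, offset=50.0):
    """Convert PCR cycle thresholds to a scale that rises with abundance.

    A lower CT means more template, so (offset - CT) increases with
    expression, which makes it comparable in direction to a microarray signal.
    """
    return [offset - ct for ct in ct_values]


# Hypothetical CT values for one microRNA measured in three patients.
example_ct = [24.5, 30.0, 27.25]
expression = inverted_ct(example_ct)
```

The offset is kept as a parameter only for illustration; the text itself uses 50.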
f. Statistical analyses
Measurements of the expression of miRs on the microarray were log-transformed before all further analysis. Normalization of samples was performed by calculating a median reference vector. For each sample, the best fit to this reference vector was calculated using a 2nd degree polynomial.
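One possible reading of this normalization is sketched below. It assumes that the reference vector is the per-probe median across samples, that each sample is fitted to the reference by least squares (sample value as input, reference value as output), and that the normalized value is the fitted polynomial evaluated at the sample value. These details, and all function names, are assumptions for illustration; the text does not specify them.

```python
import statistics


def median_reference(samples):
    """Per-probe median across samples (each sample is a list of log values)."""
    return [statistics.median(column) for column in zip(*samples)]


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_quadratic(x, y):
    """Least-squares coefficients (c0, c1, c2) of y ~ c0 + c1*x + c2*x^2."""
    s = [sum(xi ** k for xi in x) for k in range(5)]
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    normal_matrix = [[s[i + j] for j in range(3)] for i in range(3)]
    return solve3(normal_matrix, t)


def normalize(samples):
    """Map every sample onto the median reference with a 2nd degree polynomial."""
    reference = median_reference(samples)
    normalized = []
    for sample in samples:
        c0, c1, c2 = fit_quadratic(sample, reference)
        normalized.append([c0 + c1 * v + c2 * v * v for v in sample])
    return normalized
```

The sketch needs at least three distinct expression values per sample, otherwise the quadratic fit is not determined.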
For analyses comparing the expression of miRNAs in two distinct groups ("good" vs. "bad" prognosis), the normalized expression values in the two groups were compared using Student's t-test, with a p-value cutoff of 0.05. FDR=Cl was used to correct for multiple hypothesis testing. For Kaplan-Meier analyses of miRs, the samples were sorted by the expression of each miR and the logrank p-value was calculated between the upper and lower tertiles. A p-value <0.05 was considered significant.
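The tertile comparison can be sketched as follows, using the standard two-group logrank statistic with a chi-square approximation on one degree of freedom. The rounding rule for the tertile size is an assumption (it happens to give 18 of 54 and 21 of 62 samples, matching the group sizes reported in Examples 3 and 4), and the function names and follow-up data are hypothetical.

```python
import math


def tertile_groups(expression):
    """Return (low, high) index lists for the lower and upper tertiles."""
    order = sorted(range(len(expression)), key=lambda i: expression[i])
    k = max(1, round(len(order) / 3))  # assumed split rule
    return order[:k], order[-k:]


def logrank_p(times_a, events_a, times_b, events_b):
    """Two-group logrank p-value; events are 1 (recurrence) or 0 (censored)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if e and x == t)
        d_b = sum(1 for x, e in zip(times_b, events_b) if e and x == t)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += n_a * n_b * d * (n - d) / (n * n * (n - 1))
    if variance == 0:
        return 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square variable with one degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical follow-up times (months) with every patient recurring.
p_value = logrank_p([1, 2, 3], [1, 1, 1], [10, 11, 12], [1, 1, 1])
```

In practice the two index lists from `tertile_groups` would select the follow-up times and event flags passed to `logrank_p`.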
In order to find miRs which would attain a high predictive value (good performance, e.g., Examples 5-7), different expression thresholds were examined. Since the prevalence of disease recurrence in this cohort is similar to the general prevalence in the population, positive predictive value (PPV) was calculated as the fraction of patients with no recurrence in the group of patients predicted to have no recurrence (according to a specific miR expression threshold). Likewise, negative predictive value (NPV) was calculated as the fraction of patients with recurrence of the disease in the group of patients predicted to have recurrence (according to a specific miR expression threshold). Sensitivity and specificity thresholds were also applied.
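These definitions can be made concrete with a short sketch in which "positive" means predicted no recurrence (good prognosis). The function name, the `high_is_good` flag and the example data are hypothetical; sensitivity and specificity follow their usual definitions, which the text does not spell out.

```python
def threshold_performance(expression, good, threshold, high_is_good=True):
    """PPV, NPV, sensitivity and specificity for one expression threshold.

    expression: expression value per patient
    good: True if the patient had no recurrence (good prognosis)
    high_is_good: predict good prognosis at or above the threshold if True,
                  below the threshold otherwise
    """
    tp = fp = tn = fn = 0
    for value, is_good in zip(expression, good):
        predicted_good = (value >= threshold) if high_is_good else (value < threshold)
        if predicted_good and is_good:
            tp += 1
        elif predicted_good and not is_good:
            fp += 1
        elif not predicted_good and not is_good:
            tn += 1
        else:
            fn += 1

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "ppv": ratio(tp, tp + fp),          # no recurrence among predicted no recurrence
        "npv": ratio(tn, tn + fn),          # recurrence among predicted recurrence
        "sensitivity": ratio(tp, tp + fn),  # good prognosis patients identified
        "specificity": ratio(tn, tn + fp),  # bad prognosis patients identified
    }


# Hypothetical expression values and outcomes for six patients.
result = threshold_performance(
    [10, 9, 8, 7, 3, 2],
    [True, True, True, False, True, False],
    threshold=8,
)
```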
In both Student's t-tests and Kaplan-Meier analyses, only miRs whose normalized median expression was >300 in at least one of the groups compared were considered.
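The expression filter described above can be written as a small predicate. This is a sketch only: the function name and example values are hypothetical, and it assumes the cutoff of 300 is applied to values on the same scale as the supplied group measurements.

```python
import statistics


def passes_expression_filter(group_a, group_b, cutoff=300.0):
    """True if the median expression exceeds the cutoff in at least one group."""
    return (
        statistics.median(group_a) > cutoff
        or statistics.median(group_b) > cutoff
    )


# Hypothetical normalized expression values for one miR in two groups.
keep = passes_expression_filter([100, 250, 400], [350, 320, 310])
```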
Example 2
Specific microRNAs are able to predict the prognosis of colon cancer patients
The statistical analysis of the microarray results and comparison of the median values of miR expression in tumor samples obtained from colon cancer patients having good prognosis (no recurrence within 3 years following surgery) (92 patients) with the median values of miR expression in tumor samples obtained from patients with bad prognosis (recurrence within 3 years following surgery) (24 patients) revealed significant differences in the expression pattern of specific miRs as shown in Table 3 and Figure 1. In the group of patients with bad prognosis the median expression values of hsa-miR-126 (SEQ ID NO: 2) were found to be above the median expression values of the patients with good prognosis, whereas the median expression values of hsa-miR-29a (SEQ ID NO: 1) were found to be below the median expression of the patients with good prognosis. Accordingly, relatively high expression values of hsa-miR-126 (SEQ ID NO: 2) were demonstrated to be indicative of poor prognosis of colon cancer, whereas relatively high expression values of hsa-miR-29a (SEQ ID NO: 1) were demonstrated to be indicative of good prognosis.
Figure 2 shows differential expression of miRs in samples obtained from stage I colon cancer patients with good prognosis (n=46) and from stage I colon cancer patients with bad prognosis (n=8). In the group of patients with bad prognosis the median expression values of hsa-miR-23a (SEQ ID NO: 5), hsa-miR-24 (SEQ ID NO: 6), MID-00291 (SEQ ID NO: 7), MID-00466 (SEQ ID NO: 8), MID-00595 (SEQ ID NO: 9) and hsa-miR-21 (SEQ ID NO: 10) were found to be above the median expression values of the patients with good prognosis, whereas the median expression values of hsa-miR-196b (SEQ ID NO: 3) and hsa-miR-125b (SEQ ID NO: 4) were found to be below the median expression of the patients with good prognosis. Accordingly, relatively high expression values of hsa-miR-23a (SEQ ID NO: 5), hsa-miR-24 (SEQ ID NO: 6), MID-00291 (SEQ ID NO: 7), MID-00466 (SEQ ID NO: 8), MID-00595 (SEQ ID NO: 9) and hsa-miR-21 (SEQ ID NO: 10) were demonstrated to be indicative of poor prognosis of colon cancer, whereas relatively high expression values of hsa-miR-196b (SEQ ID NO: 3) and hsa-miR-125b (SEQ ID NO: 4) were demonstrated to be indicative of good prognosis.
Figure 3 demonstrates differential expression of miRs in samples obtained from stage II colon cancer patients with good prognosis and from stage II colon cancer patients with bad prognosis. In the group of patients with bad prognosis the median expression value of hsa-miR-29a (SEQ ID NO: 1) was found to be below the median expression of the patients with good prognosis. Accordingly, relatively high expression values of hsa-miR-29a (SEQ ID NO: 1) were demonstrated to be indicative of good prognosis.
Example 3
miR expression patterns in patients with stage I colon cancer correlate with prognosis
The prognoses of groups of patients with stage I colon cancer, stratified according to the expression levels of individual microRNAs, were compared. For each microRNA the samples were divided into tertiles according to high (n=18), intermediate (n=18) or low (n=18) expression level of the microRNA.
Survival and disease-free survival (DFS) were compared between the two groups with high and low microRNA expression levels. The microRNAs associated with significant differences (logrank p-value<0.05) in DFS are presented in Table 3 (see stage I, Analysis, DFS).
Table 3: Up- and down-regulation of miRs in patients defined as having colon cancer with bad prognosis
Figure imgf000043_0001
Figure imgf000044_0001
The correlation between miR expression and survival time and disease-free survival in stage I patients is further indicated in Figures 4A-4D, which show Kaplan-Meier plots of survival time and disease-free survival curves plotted for each of the three expression-level groups. hsa-miR-361-3p (SEQ ID NO: 11), hsa-miR-29c* (SEQ ID NO: 12) and hsa-miR-196b (SEQ ID NO: 3) were down-regulated in samples obtained from stage I patients with bad prognosis. Accordingly, relatively high expression values of these miRs were demonstrated to be indicative of good prognosis. In contrast, hsa-miR-194 (SEQ ID NO: 13) was up-regulated in samples obtained from stage I patients with bad prognosis. Accordingly, relatively high expression values of this miR were demonstrated to be indicative of bad prognosis.
Example 4
miR expression patterns in patients with stage II colon cancer correlate with prognosis
The prognosis of groups of patients with stage II colon cancer, stratified according to the expression levels of individual microRNAs, was compared. For each microRNA the samples were divided into tertiles according to high (n=21), intermediate (n=20) or low (n=21) expression level of the microRNA.
Survival and disease-free survival (DFS) were compared between the two groups with high and low microRNA expression levels. The microRNAs associated with significant differences (logrank p-value<0.05) in DFS are presented in Table 3 (see stage II, Analysis, DFS).
The correlation between miR expression and survival time and disease-free survival in stage II patients is further indicated in Figures 5A-5D, which show Kaplan-Meier plots of disease-free survival curves plotted for each of the three expression-level groups. hsa-miR-29a (SEQ ID NO: 1) and hsa-miR-155 (SEQ ID NO: 14) were down-regulated in samples obtained from stage II patients with bad prognosis. Accordingly, relatively high expression values of these miRs were demonstrated to be indicative of good prognosis. In contrast, hsa-miR-498 (SEQ ID NO: 15) and hsa-miR-527 (SEQ ID NO: 16) were up-regulated in samples obtained from stage II patients with bad prognosis. Accordingly, relatively high expression values of these miRs were demonstrated to be indicative of bad prognosis.
Example 5
Using miR expression performance to predict a subgroup of good prognosis patients with stage I colon cancer and a subgroup of bad prognosis patients with stage I colon cancer
miR performance is defined by the existence of an expression cutoff that classifies the good and bad prognosis groups such that one group consists entirely of good prognosis patients and this group contains at least 40% of the good prognosis patients (i.e., ppv=1, sensitivity >0.4). As shown in Figure 6 and Table 3, hsa-miR-103 (SEQ ID NO: 17) and miR-27a (SEQ ID NO: 19) were down-regulated in samples obtained from stage I patients with good prognosis. Accordingly, expression thresholds for these miRs were demonstrated to be indicative of bad prognosis. In contrast, hsa-miR-423-3p (SEQ ID NO: 18) was up-regulated in samples obtained from stage I patients with good prognosis (Figure 6B) and down-regulated in bad prognosis (Table 3). For hsa-miR-125b (SEQ ID NO: 4), an expression cutoff could be found that divided the population of stage I patients into two groups such that the group with higher expression of this miR consists of at least 50% good prognosis patients and contains at least 50% of the good prognosis patients (i.e., ppv >0.5, sensitivity >0.5) (Table 3).
As shown in Figure 7 and Table 3, hsa-miR-196b (SEQ ID NO: 3) was down-regulated in samples obtained from stage I patients with bad prognosis. Accordingly, expression thresholds for this miR were demonstrated to be indicative of good prognosis.
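The notion of miR performance used in this example, namely the existence of a cutoff that meets a minimum PPV and a minimum sensitivity, can be sketched as a scan over candidate cutoffs. The function name, the choice of observed expression values as candidate cutoffs, and the example data are assumptions for illustration only.

```python
def find_cutoffs(expression, good, min_ppv, min_sensitivity, high_is_good=True):
    """Return (cutoff, ppv, sensitivity) for every cutoff meeting both criteria.

    The 'predicted good' group is the set of patients at or above the cutoff
    when high_is_good is True, and at or below it otherwise.
    """
    total_good = sum(good)
    hits = []
    for cutoff in sorted(set(expression)):
        if high_is_good:
            selected = [g for v, g in zip(expression, good) if v >= cutoff]
        else:
            selected = [g for v, g in zip(expression, good) if v <= cutoff]
        if not selected or total_good == 0:
            continue
        ppv = sum(selected) / len(selected)      # good prognosis within the group
        sensitivity = sum(selected) / total_good  # share of all good prognosis patients
        if ppv >= min_ppv and sensitivity >= min_sensitivity:
            hits.append((cutoff, ppv, sensitivity))
    return hits


# Hypothetical data: six patients, four with good prognosis.
hits = find_cutoffs(
    [10, 9, 8, 7, 3, 2],
    [True, True, True, False, True, False],
    min_ppv=1.0,
    min_sensitivity=0.4,
)
```

A miR would count as performing well under this reading when the returned list is not empty.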
Example 6
Using miR expression performance to predict a subgroup of good prognosis patients with stage II colon cancer and a subgroup of bad prognosis patients with stage II colon cancer
miR performance is defined by the existence of an expression cutoff that classifies the good and bad prognosis groups such that more than 80% of one group are good prognosis patients and this group contains at least 30% of the good prognosis patients (i.e., ppv >0.8, sensitivity >0.3). As shown in Figure 8 and Table 3, hsa-miR-29a (SEQ ID NO: 1) was down-regulated in samples obtained from stage II patients with bad prognosis. Accordingly, relatively high expression values of this miR are indicative of good prognosis. hsa-miR-29a could also be used to identify a subgroup of bad prognosis patients with stage II colon cancer (npv >0.6, specificity >0.3).
Example 7
Using miR expression performance to predict a subgroup of good prognosis patients with stage I and II colon cancer and a subgroup of bad prognosis patients with stage I and II colon cancer
miR performance is defined by the existence of an expression cutoff that classifies the good and bad prognosis groups such that more than 90% of one group are good prognosis patients and this group contains at least 30% of the good prognosis patients (i.e., ppv >0.9, sensitivity >0.3). As shown in Figure 9 and Table 3, hsa-miR-27a (SEQ ID NO: 19), hsa-miR-451 (SEQ ID NO: 45), hsa-miR-768-3p (SEQ ID NO: 46), hsa-miR-199a-5p (SEQ ID NO: 47) and hsa-miR-10a (SEQ ID NO: 51) were down-regulated in samples obtained from stage I and II patients with good prognosis. Accordingly, relatively low expression values of these miRs were demonstrated to be indicative of good prognosis.
In contrast, hsa-miR-423-3p (SEQ ID NO: 18), hsa-miR-378 (SEQ ID NO: 48), hsa-miR-612 (SEQ ID NO: 49), hsa-miR-200b (SEQ ID NO: 20), hsa-miR-429 (SEQ ID NO: 50), MID-00689 (SEQ ID NO: 52) and hsa-miR-30d (SEQ ID NO: 53) were up-regulated in samples obtained from stage I and II patients with good prognosis. Accordingly, relatively high expression values of these miRs were demonstrated to be indicative of good prognosis. hsa-miR-29a (SEQ ID NO: 1) could be used to identify a subgroup of bad prognosis patients, with npv >0.8 and specificity >0.2. Low expression of this miR is indicative of bad prognosis in the combined stage I and stage II patients.
Example 8
Specific microRNAs are able to predict the prognosis of colon cancer patients
Comparison of the median values of miR expression in tumor samples obtained from stage I colon cancer patients having good prognosis (no recurrence within 3 years following surgery) (46 patients) with the median values of miR expression in tumor samples obtained from patients with bad prognosis (recurrence within 3 years following surgery) (8 patients) revealed significant differences in the expression pattern of specific miRs as shown in Figure 10. In the group of patients with bad prognosis the median expression values of hsa-let-7b (SEQ ID NO: 66) and hsa-miR-21 (SEQ ID NO: 10) were found to be above the median expression values of the patients with good prognosis, whereas the median expression values of hsa-miR-196b (SEQ ID NO: 3) and hsa-miR-125b (SEQ ID NO: 4) were found to be below the median expression of the patients with good prognosis. Accordingly, relatively high expression values of hsa-let-7b (SEQ ID NO: 66) and hsa-miR-21 (SEQ ID NO: 10) were demonstrated to be indicative of poor prognosis of colon cancer, whereas relatively high expression values of hsa-miR-196b (SEQ ID NO: 3) and hsa-miR-125b (SEQ ID NO: 4) were demonstrated to be indicative of good prognosis.
Differential expression of miRs in samples obtained from stage I colon cancer patients with good prognosis (n=7) and from stage I colon cancer patients with bad prognosis (n=8) is shown in Figure 11A (as measured by microarray) and in Figure 11B (as measured by PCR). In the group of patients with bad prognosis the median expression values of hsa-miR-126 (SEQ ID NO: 2) and hsa-miR-21 (SEQ ID NO: 10) were found to be above the median expression values of the patients with good prognosis, whereas the median expression values of hsa-miR-196b (SEQ ID NO: 3) and hsa-miR-125b (SEQ ID NO: 4) were found to be below the median expression of the patients with good prognosis. Accordingly, relatively high expression values of hsa-miR-126 (SEQ ID NO: 2) and hsa-miR-21 (SEQ ID NO: 10) were demonstrated to be indicative of poor prognosis of colon cancer, whereas relatively high expression values of hsa-miR-196b (SEQ ID NO: 3) and hsa-miR-125b (SEQ ID NO: 4) were demonstrated to be indicative of good prognosis.
Example 9
PCR validation of microarray-determined miR expression patterns in patients with stage I colon cancer
The miR expression patterns of the statistically significant miRs in patients with stage I colon cancer were measured by microarray and subsequently validated by PCR. The distribution of expression, as measured by microarray, in tumor samples obtained from stage I colon cancer patients with good prognosis (no recurrence within 3 years following surgery) or bad prognosis (recurrence within 3 years following surgery) is shown in Figure 12. Logrank p-value<0.05 and fold-change values for the microarray results are presented in Table 4. Figure 13 presents the distribution of expression, as measured by PCR, in the same patient group. Logrank p-value<0.05 and fold-change values for the PCR results are presented in Table 5.
Table 4: Distribution of miR expression in stage I colon cancer patients as tested by microarray
Figure imgf000048_0001
Neither up- nor down-regulated in bad prognosis: hsa-let-7b 0.21 1.1 66 71
Table 5: Distribution of miR expression in stage I colon cancer patients as tested by PCR
Figure imgf000049_0001
A Kaplan-Meier plot for persistence (survival) of disease-free status of colon cancer patients (stage I) grouped by expression levels of hsa-miR-196b, as measured by microarray, is shown in Figure 14A. Figure 14B presents the Kaplan-Meier plot for the same group of colon cancer patients (stage I), as measured by PCR.
Example 10
PCR validation of microarray-determined miR expression patterns in patients with stage II colon cancer
Comparison of the median values of miR expression in tumor samples obtained from stage II colon cancer patients having good prognosis (no recurrence within 3 years following surgery) (8 patients) with the median values of miR expression in tumor samples obtained from patients with bad prognosis (recurrence within 3 years following surgery) (7 patients) revealed significant differences in the expression pattern of specific miRs as shown in Figure 15A, which illustrates differential expression as measured by microarray. Figure 15B illustrates differential expression of the same set of miRs as in Figure 15A, as measured by PCR. In the group of patients with bad prognosis the median expression values of hsa-miR-155 (SEQ ID NO: 14) were found to be above the median expression values of the patients with good prognosis, whereas the median expression values of hsa-miR-29b (SEQ ID NO: 54) and hsa-miR-29a (SEQ ID NO: 1) were found to be below the median expression of the patients with good prognosis. Accordingly, relatively high expression values of hsa-miR-29a (SEQ ID NO: 1) were demonstrated to be indicative of good prognosis.
The miR expression patterns of the statistically significant miRs in patients with stage II colon cancer were measured by microarray and subsequently validated by PCR. The distribution of expression, as measured by microarray, in tumor samples obtained from stage II colon cancer patients with good prognosis (no recurrence within 3 years following surgery) or bad prognosis (recurrence within 3 years following surgery) is shown in Figure 16. Logrank p-value<0.05 and fold-change values for the microarray results are presented in Table 6. Figure 17 presents the distribution of expression, as measured by PCR, in the same patient group. Logrank p-value<0.05 and fold-change values for the PCR results are presented in Table 7.
Table 6: Distribution of miR expression in stage II colon cancer patients as tested by microarray
Figure imgf000050_0001
Table 7: Distribution of miR expression in stage II colon cancer patients as tested by PCR
Figure imgf000051_0001
A Kaplan-Meier plot for persistence (survival) of disease-free status of colon cancer patients (stage II) grouped by expression levels of hsa-miR-29a, as measured by microarray, is shown in Figure 18A. Figure 18B presents the Kaplan-Meier plot for the same group of colon cancer patients (stage II), as measured by PCR.
The foregoing description of the specific embodiments will so fully reveal the general nature of the invention that others can, by applying current knowledge, readily modify and/or adapt for various applications such specific embodiments without undue experimentation and without departing from the generic concept, and, therefore, such adaptations and modifications should and are intended to be comprehended within the meaning and range of equivalents of the disclosed embodiments. Although the invention has been described in conjunction with specific embodiments thereof, it is evident that many alternatives, modifications and variations will be apparent to those skilled in the art. Accordingly, it is intended to embrace all such alternatives, modifications and variations that fall within the spirit and broad scope of the appended claims.
It should be understood that the detailed description and specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.

Claims

1. A method of determining the prognosis of colon cancer in a subject comprising:
(a) obtaining a biological sample from the subject;
(b) determining the expression levels in said sample of nucleic acid sequences selected from the group consisting of SEQ ID NOS: 1-66 and 71 or a sequence having at least about 80% identity thereto; and
(c) comparing said expression levels to a reference value, whereby an altered expression level of the nucleic acid sequence is indicative of the prognosis of said subject.
2. The method of claim 1, wherein said altered expression level is a change in a score based on a combination of expression levels of said nucleic acid sequences.
3. The method of claim 1, wherein said nucleic acid sequence is selected from the group consisting of SEQ ID NOS: 2, 5-10, 13, 15-17, 19, 22, 26-32, 35, 36, 38-41, 43, 45-47, 51, 55-57, 61, 66 and 71, and sequences at least about 80% identical thereto, and said expression levels above said reference value are indicative of poor prognosis in said subject.
4. The method of claim 1, wherein said nucleic acid sequence is selected from the group consisting of SEQ ID NOS: 1, 3, 4, 11, 12, 14, 18, 20, 21, 23-25, 33, 34, 37, 42, 44, 48-50, 52-54, 58-60 and 62-65, and sequences at least about 80% identical thereto, and said expression levels below said reference value are indicative of poor prognosis in said subject.
5. The method of claim 1, wherein the subject is a human.
6. The method of claim 1, wherein said colon cancer is colon adenocarcinoma.
7. The method of claim 1, wherein said prognosis is prediction of colon cancer risk of recurrence.
8. The method of claim 1, wherein said biological sample is selected from the group consisting of bodily fluid, a cell line, a tissue sample, a biopsy sample, a needle biopsy sample, a fine needle biopsy (FNA) sample, a surgically removed sample, and a sample obtained by tissue-sampling procedures such as colonoscopy or laparoscopic methods.
9. The method of claim 8, wherein said tissue is a fresh, frozen, fixed, wax-embedded or formalin-fixed paraffin-embedded (FFPE) tissue.
10. The method of claim 9, wherein said tissue is a colon tissue.
11. The method of claim 10, wherein said colon tissue is colon adenocarcinoma.
12. The method of claim 1, wherein the expression level is determined by a method selected from the group consisting of nucleic acid hybridization, nucleic acid amplification, and a combination thereof.
13. The method of claim 12, wherein the nucleic acid hybridization is performed using a solid-phase nucleic acid biochip array or in situ hybridization.
14. The method of claim 12, wherein the nucleic acid amplification is performed using real-time PCR.
15. The method of claim 14, wherein the PCR method comprises forward and reverse primers.
16. The method of claim 15, wherein said forward primer comprises a sequence selected from the group consisting of SEQ ID NOS: 88-99 and sequences at least about 80% identical thereto.
17. The method of claim 15, wherein the reverse primer comprises SEQ ID NO: 100, a fragment thereof and a sequence at least about 80% identical thereto.
18. The method of claim 14, wherein the real-time PCR method further comprises a probe.
19. The method of claim 18, wherein the probe comprises a nucleic acid sequence that is complementary to a sequence selected from SEQ ID NO: 1-66 and 71; to a fragment thereof or to a sequence at least about 80% identical thereto.
20. The method of claim 19, wherein the probe comprises a sequence selected from the group consisting of SEQ ID NOS: 76-87, a fragment thereof and a sequence at least about 80% identical thereto.
21. A kit for determining a prognosis of a subject with colon cancer, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from SEQ ID NO: 1-66 and 71; to a fragment thereof or to a sequence at least about 80% identical thereto.
22. The kit of claim 21, wherein said probe comprises a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 76-87, a fragment thereof and a sequence at least about 80% identical thereto.
23. The kit of claim 21, wherein the kit further comprises forward and reverse primers.
24. The kit of claim 23, wherein said forward primer comprises a sequence selected from the group consisting of SEQ ID NOS: 88-99 and sequences at least about 80% identical thereto.
25. The kit of claim 23, wherein the reverse primer comprises SEQ ID NO: 100, a fragment thereof and a sequence at least about 80% identical thereto.
26. The kit of claim 21, wherein the kit comprises reagents for performing in situ hybridization analysis.
PCT/IL2009/001075 2008-11-20 2009-11-15 Compositions and methods for the prognosis of colon cancer Ceased WO2010058393A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US11630108P 2008-11-20 2008-11-20
US61/116,301 2008-11-20

Publications (2)

Publication Number Publication Date
WO2010058393A2 true WO2010058393A2 (en) 2010-05-27
WO2010058393A3 WO2010058393A3 (en) 2010-08-12

Family

ID=42198596

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IL2009/001075 Ceased WO2010058393A2 (en) 2008-11-20 2009-11-15 Compositions and methods for the prognosis of colon cancer

Country Status (1)

Country Link
WO (1) WO2010058393A2 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012072750A1 (en) * 2010-12-01 2012-06-07 INSERM (Institut National de la Santé et de la Recherche Médicale) Method for predicting the outcome of colon cancer by analysing mirna expression
CN103384829A (en) * 2011-01-13 2013-11-06 财团法人工业技术研究院 Biomarkers for predicting recurrence of colorectal cancer
EP2714934A4 (en) * 2011-05-24 2015-06-10 Rosetta Genomics Ltd METHODS AND COMPOSITIONS FOR DETERMINING HEART FAILURE OR RISK OF CARDIAC INSUFFICIENCY
EP2785872A4 (en) * 2011-12-01 2015-12-09 Univ Ohio State MATERIALS AND METHODS RELATED TO NSAID CHEMIO-PREVENTION IN COLORECTAL CANCER

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2505668A3 (en) * 2006-01-05 2013-01-09 The Ohio State University Research Foundation MicroRNA-based methods for the diagnosis of colon, lung, and pancreatic cancer
EP2038432B1 (en) * 2006-06-30 2017-02-08 Rosetta Genomics Ltd Method for detecting and quantifying a target nucleic acid generated by rt-pcr of mirna

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012072750A1 (en) * 2010-12-01 2012-06-07 INSERM (Institut National de la Santé et de la Recherche Médicale) Method for predicting the outcome of colon cancer by analysing mirna expression
JP2014502157A (en) * 2010-12-01 2014-01-30 アンスティチュ ナショナル ドゥ ラ サンテ エ ドゥ ラ ルシェルシュ メディカル Method for predicting colorectal cancer outcome by analyzing miRNA expression
EP2942403A1 (en) * 2010-12-01 2015-11-11 INSERM (Institut National de la Santé et de la Recherche Médicale) Method for predicting the outcome of a cancer by analysing mirna expression
JP2016182125A (en) * 2010-12-01 2016-10-20 アンスティチュ ナショナル ドゥ ラ サンテ エ ドゥ ラ ルシェルシュ メディカル METHOD FOR PREDICTING OUTCOME OF COLON CANCER BY ANALYSING miRNA EXPRESSION
EP3214185A1 (en) * 2010-12-01 2017-09-06 INSERM (Institut National de la Santé et de la Recherche Médicale) Method for predicting the outcome of colon cancer by analysing mirna expression
CN103384829A (en) * 2011-01-13 2013-11-06 财团法人工业技术研究院 Biomarkers for predicting recurrence of colorectal cancer
EP2619587A4 (en) * 2011-01-13 2013-11-13 Ind Tech Res Inst BIOMARKERS FOR RECURRENCE PREDICTION OF COLORECTAL CANCER
JP2014503221A (en) * 2011-01-13 2014-02-13 インダストリアル テクノロジー リサーチ インスティテュート Biomarkers for predicting recurrence of colorectal cancer
EP2778237A1 (en) * 2011-01-13 2014-09-17 Industrial Technology Research Institute Biomarkers for recurrence prediction of colorectal cancer
EP2714934A4 (en) * 2011-05-24 2015-06-10 Rosetta Genomics Ltd METHODS AND COMPOSITIONS FOR DETERMINING HEART FAILURE OR RISK OF CARDIAC INSUFFICIENCY
US9631236B2 (en) 2011-05-24 2017-04-25 Rosetta Genomics Ltd. Methods and compositions for determining heart failure or a risk of heart failure
EP2785872A4 (en) * 2011-12-01 2015-12-09 Univ Ohio State MATERIALS AND METHODS RELATED TO NSAID CHEMOPREVENTION IN COLORECTAL CANCER

Also Published As

Publication number Publication date
WO2010058393A3 (en) 2010-08-12

Similar Documents

Publication Publication Date Title
US9334540B2 (en) Methods and compositions for diagnosing complications of pregnancy
WO2010018563A2 (en) Compositions and methods for the prognosis of lymphoma
EP2800820B1 (en) Methods and kits for detecting subjects having pancreatic cancer
US20150099665A1 (en) Methods for distinguishing between specific types of lung cancers
US9988690B2 (en) Compositions and methods for prognosis of ovarian cancer
KR20100084619A (en) Diagnosis and prognosis of specific cancers by means of differential detection of micro-RNAs/miRNAs
US20220267863A1 (en) Compositions and methods for determining the prognosis of bladder urothelial cancer
EP2691545B1 (en) Methods for lung cancer classification
US9068232B2 (en) Gene expression signature for classification of kidney tumors
US9765334B2 (en) Compositions and methods for prognosis of gastric cancer
US9834821B2 (en) Diagnosis and prognosis of various types of cancers
WO2010004562A2 (en) Methods and compositions for detecting colorectal cancer
WO2010058393A2 (en) Compositions and methods for the prognosis of colon cancer
US9340823B2 (en) Gene expression signature for classification of kidney tumors
WO2010070637A2 (en) Method for distinguishing between adrenal tumors
WO2011039757A2 (en) Compositions and methods for prognosis of renal cancer
WO2010018585A2 (en) Compositions and methods for prognosis of melanoma
HK1204016B (en) Methods and kits for detecting subjects having pancreatic cancer

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 09827261

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 09827261

Country of ref document: EP

Kind code of ref document: A2