WO2019229431A1 - Method of predicting survival rates for oropharyngeal cancer patients - Google Patents
Method of predicting survival rates for oropharyngeal cancer patients
- Publication number
- WO2019229431A1 (PCT/GB2019/051464)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- biomarker
- score
- riskscore
- high risk
- tils
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/118—Prognosis of disease development
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/158—Expression markers
Definitions
- the invention relates to a method of predicting treatment selection and/or survival rates for cancer patients, particularly those with oropharyngeal squamous cell carcinoma.
- chemoradiotherapy or surgery (which may be followed by radiotherapy or another adjuvant treatment). Selection is usually guided by surgical resectability, clinician preference and patient choice.
- prognosis methods for patients with OPSCC that use classifiers including some of p16, HPV status, clinical stage, smoking history, alcohol consumption and comorbidities are described in “Human papillomavirus and survival of patients with oropharyngeal cancer” by Ang et al in N Engl J Med 2010;363:24-35; and “Tumor stage, human papillomavirus and smoking status affect the survival of patients with oropharyngeal cancer: an Italian validation study” by Granata et al in Ann Oncol.
- WO2016/094330 describes computer-implemented machine learning methods for assessing a likelihood that a patient has a disease.
- US2008/0133141 describes methods for scoring one or more biomarkers in a test sample and determining a subject’s risk of developing a medical condition.
- We describe a method for predicting a level of risk for a patient with oropharyngeal squamous cell carcinoma, the method comprising: providing a tissue sample from the patient; determining the presence of at least four biomarkers in said sample; for each of said at least four biomarkers, determining a score indicative of a level of the biomarker present within the tissue sample; calculating a riskscore based on the determined scores, wherein the riskscore is calculated by summing weighted biomarker scores, wherein the biomarker scores are based on the determined scores and each biomarker score has an associated weight; and comparing the riskscore to a threshold to predict whether the patient is high risk; wherein the four biomarkers are selected from the group consisting of SURVIVIN, PLK1, p16, high risk HPV and tumour infiltrating lymphocytes (TILs).
- TILs tumour infiltrating lymphocytes
- a method for predicting a level of risk for a patient with oropharyngeal squamous cell carcinoma comprising: providing a tissue sample from the patient; determining the presence of at least four biomarkers in said sample; for each of said at least four biomarkers, determining a score indicative of a level of the biomarker present within the tissue sample; calculating a riskscore based on the determined scores; and comparing the riskscore to a threshold to predict whether the patient is high risk; wherein the four biomarkers are selected from the group consisting of SURVIVIN, PLK1, p16, high risk HPV and tumour infiltrating lymphocytes (TILs).
- the weighted sum may be a linear equation known as the Cox proportional Hazards model, thus in general terms, the riskscore may be expressed as :
- $\text{riskscore} = b_1 x_{1i} + b_2 x_{2i} + \dots + b_n x_{ni}$
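The weighted sum above can be sketched in a few lines of Python. This is only an illustration of the general linear form: the function name `riskscore` and its list arguments are hypothetical, and the weights would come from a fitted model rather than being chosen by hand.

```python
def riskscore(weights, scores):
    """Return b_1*x_1 + ... + b_n*x_n for one patient.

    weights: the coefficients b_1..b_n (one per biomarker score).
    scores:  the patient's biomarker scores x_1..x_n, in the same order.
    """
    if len(weights) != len(scores):
        raise ValueError("need exactly one weight per biomarker score")
    return sum(b * x for b, x in zip(weights, scores))
```

A negative weight lowers the riskscore as the corresponding biomarker score increases, while a positive weight raises it.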
- a biomarker may be defined as a biological molecule or cell found in blood, other body fluids, or tissues that is a sign of a normal or abnormal process, or a condition or disease.
- the values for each biomarker score (i.e. covariate) and weight for each biomarker may be determined by training the model on a set of training data comprising information about the values of a plurality of biomarkers in a set of patients and the associated outcomes for each patient.
- the plurality of biomarkers comprises at least the biomarkers listed above and may further comprise additional biomarkers such as cyclin D1, EGFR external, BCL2, COX2 and HIF1 alpha.
- the method may further comprise selecting the plurality of biomarkers to be used when training the model.
- the biomarkers which are selected to be included when training the model may influence the effectiveness of the prediction using the riskscore.
- the weighting for the biomarkers which are included in the riskscore may be different and/or the biomarkers which are ultimately included in the riskscore may be different.
- the threshold may be the median riskscore for the training data. Appropriate selection of the threshold may also influence the effectiveness of the prediction using the riskscore. When the threshold is based on the training data, it will be appreciated that appropriate selection of the dataset which is used for training is important to provide a useful threshold.
- the H-score is a known method of assessing the extent of a biomarker within a sample.
- the scaled intensity score may be the determined score for at least one of SURVIVIN and PLK1.
- adjustments to the scaled intensity score may be made.
- the biomarker score may be based on the scaled intensity score which has been adjusted by subtracting an adjustment factor. In other words, the biomarker score may be an adjusted determined score.
- determining a score indicative of a level of the biomarker may comprise awarding a first value when the level is above a threshold and a second value when the level is below the threshold.
- the determined score is binary - either having the first value or the second value - the first value may be considered to show a high level of the biomarker and the second value a low level.
- the first or second value may be the determined score for at least one of p16 and high risk HPV.
- the biomarker score may also be an adjusted determined score and the adjustment may provide biomarker scores which are based on the determined score and optimise the model.
- determining a score indicative of a level of the biomarker may comprise awarding a first value when the level is above an upper threshold, a second value when the level is below the upper threshold but above a lower threshold and a third value when the level is below the lower threshold.
- the first value may be considered to show a high level of the biomarker, the second value an intermediate level and the third value a low level.
- This may be an appropriate scoring method for TILs.
- the first, second or third value may be the determined score for TILs.
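The two categorical scoring schemes described above (binary for p16 and high risk HPV, three-level for TILs) can be sketched as follows. The numeric values returned (0, 1, 2) and the treatment of a level exactly equal to a threshold are assumptions made for illustration; the text only specifies behaviour above and below each threshold.

```python
def binary_score(level, threshold, first_value=1, second_value=0):
    """First value when the level is above the threshold, else second value.

    Assumption: a level exactly equal to the threshold gets the second value.
    """
    return first_value if level > threshold else second_value


def three_level_score(level, lower, upper,
                      first_value=2, second_value=1, third_value=0):
    """First value above the upper threshold, second value between the two
    thresholds, third value below the lower threshold.

    Assumption: levels exactly on a threshold fall into the lower category.
    """
    if lower >= upper:
        raise ValueError("lower threshold must be below upper threshold")
    if level > upper:
        return first_value
    if level > lower:
        return second_value
    return third_value
```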
- the individual weighted component for that biomarker may need to be separated into a plurality of components.
- the weighted biomarker score for the biomarker TILs may comprise a sum of a first weighted biomarker score when TILs equals the second value and a second weighted biomarker score when TILs equals the third value, e.g.
- $\gamma S_{TILs} = \gamma_2 S_{TILs2} + \gamma_3 S_{TILs3}$
- the biomarker score included in the riskscore calculation may also be an adjusted determined score.
- the four biomarkers may be SURVIVIN, p16, high risk HPV and tumour infiltrating lymphocytes (TILs).
- the method may include calculating the riskscore which is an overall survival riskscore whereby the method is for predicting a level of risk for the patient in terms of their overall survival outcome.
- a high risk patient may thus be at high risk of not surviving within a time frame of interest, e.g. five years.
- a low risk patient will have a better chance of surviving than a high risk patient within the same time frame.
- the method may further comprise determining a treatment. When the patient is classified as high risk, the treatment may be surgical. When the patient is not classified as high risk, the treatment may be a surgical or a non-surgical treatment.
- as explained below, such a method is particularly useful because, before the work by the inventors, no predictive classifiers had been validated to select specific curative treatment regimens for individual patients with head and neck cancer.
- a method for assessing a patient with oropharyngeal squamous cell carcinoma for an appropriate treatment comprising: providing a tissue sample from the patient; determining the presence of at least four biomarkers in said sample; for each of said at least four biomarkers, determining a score indicative of a level of the biomarker present within the tissue sample; calculating a riskscore based on the determined scores, wherein the riskscore is calculated by summing weighted biomarker scores, wherein the biomarker scores are based on the determined scores and each biomarker score has an associated weight; comparing the riskscore to a threshold to predict whether the patient is high risk; and determining a treatment; wherein the four biomarkers are selected from the group consisting of SURVIVIN, PLK1, p16, high risk HPV and tumour infiltrating lymphocytes (TILs).
- the overall survival riskscore may be considered a primary riskscore and may for example be calculated by using a weighted sum of the scores for each biomarker, i.e.
- $\text{riskscore} = \alpha S_{p16} + \beta S_{HPV} + \gamma S_{TILs} + \delta S_{sur}$, where:
- α is the weighting applied to $S_{p16}$, which is the biomarker score for the biomarker p16
- β is the weighting applied to $S_{HPV}$, which is the biomarker score for the biomarker HPV
- γ is the weighting applied to $S_{TILs}$, which is the biomarker score for the biomarker TILs
- δ is the weighting applied to $S_{sur}$, which is the biomarker score for the biomarker SURVIVIN.
- the values of the weightings may be determined using statistical analysis, for example using the Cox proportional hazards model above.
- the weighting δ may have a value ranging from 0.2 to 0.9, more preferably from 0.35 to 0.5, and as an example 0.2071518.
- the weighting α may have a value ranging from -2.2 to -0.4, more preferably from -1.4 to -0.4, and as an example -0.9652818.
- the weighting β may have a value ranging from -3 to -0.5, more preferably from -1.85 to -1.05, and as an example -1.3976193.
- the weighting γ₂ may have a value ranging from -1.55 to -0.05, more preferably from -0.95 to -0.55, and as an example -0.6318398.
- the weighting γ₃ may have a value ranging from -2.7 to 0.05, more preferably from -1.7 to -1.05, and as an example -1.2751045.
- the weightings applied to p16, HPV and TILs are preferably negative because as described below the presence of these biomarkers is indicative of a low risk patient, i.e. indicative of a positive survival rate.
- the weighting applied to SURVIVIN is preferably positive to reflect that the presence of this biomarker is indicative of a high risk patient. All references to ranges herein are inclusive of the endpoints.
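As a rough sketch, the overall survival riskscore can be computed from the example weightings quoted above. The function name and argument names are hypothetical, the TILs term is split into its two weighted components, and the inputs are assumed to be the already adjusted biomarker scores. The constants are the example values only, not a validated clinical model.

```python
# Example weightings quoted in the text (illustrative only).
W_P16 = -0.9652818      # weighting for the p16 biomarker score
W_HPV = -1.3976193      # weighting for the high risk HPV biomarker score
W_TILS_2 = -0.6318398   # weighting for the first TILs component
W_TILS_3 = -1.2751045   # weighting for the second TILs component
W_SUR = 0.2071518       # weighting for the SURVIVIN biomarker score


def overall_survival_riskscore(s_p16, s_hpv, s_tils2, s_tils3, s_sur):
    """Sum of weighted biomarker scores for the overall survival model.

    All arguments are assumed to be adjusted biomarker scores; how each
    one is derived from the pathologist's determined score is described
    separately in the text.
    """
    return (W_P16 * s_p16
            + W_HPV * s_hpv
            + W_TILS_2 * s_tils2
            + W_TILS_3 * s_tils3
            + W_SUR * s_sur)
```

Only the SURVIVIN weighting is positive, so in this sketch a higher SURVIVIN score raises the riskscore while higher scores for the other biomarkers lower it.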
- the four biomarkers may be SURVIVIN, p16, PLK1 and tumour infiltrating lymphocytes (TILs).
- calculating the riskscore comprises calculating a recurrent free survival riskscore whereby the method is for predicting a recurrent free survival outcome.
- a high risk patient may thus be at high risk of not surviving without any recurrence of the cancer within a time frame of interest which may be the same as the time frame used above.
- a low risk patient will have a better chance of surviving without a repetition of the illness than a high risk patient.
- the recurrent free survival riskscore may be termed a secondary riskscore and can be calculated using the following model, which uses four biomarkers, three of which overlap with the previous model and one of which is new:
- $\text{riskscore} = \alpha S_{p16} + \gamma S_{TILs} + \delta S_{sur} + \varepsilon S_{PLK1}$, where:
- α is the weighting applied to $S_{p16}$, which is the biomarker score for the biomarker p16
- γ is the weighting applied to $S_{TILs}$, which is the biomarker score for the biomarker TILs
- δ is the weighting applied to $S_{sur}$, which is the biomarker score for the biomarker SURVIVIN
- ε is the weighting applied to $S_{PLK1}$, which is the biomarker score for the new biomarker PLK1.
- the ranges of the weightings given above may also apply to this model.
- the weightings applied to p16 and TILs are preferably negative. This is also true for PLK1 .
- the weighting applied to SURVIVIN is preferably positive.
- Said high risk HPV may be selected from one or more of HPV-16, -18, -31, -33, -35, -39, -45, -51, -52, -56, -58 and -66.
- the presence of said high risk HPV may be determined using High risk HPV in-situ hybridization (HR- HPV - ISH).
- the presence of p16, SURVIVIN and/or PLK1 is determined using immunohistochemistry.
- p16 may also be known as CDKN2A.
- a computer device comprising at least one processor; and instructions that, when executed by the at least one processor cause the computer device to perform any of the determining, calculating and comparing steps of the methods described above.
- a tangible non-transient computer-readable storage medium having recorded thereon instructions which, when implemented by a computer device, cause the computer device to be arranged as described above and/or which cause the computer device to perform any of the relevant steps of the methods as described above.
- a kit comprising a microarray for the tissue sample and/or one or more reagents to determine the presence of at least one biomarker selected from SURVIVIN, PLK1 , p16, high risk HPV and TILs and optionally the computer device.
- the kit comprises reagents to determine the presence of all of SURVIVIN, PLK1 , p16, high risk HPV and TILs.
- Suitable reagents are known to the skilled person and include reagents for staining tissue to assess the level of the biomarker as described herein.
- the kit may be used for predicting a level of risk for a patient with oropharyngeal squamous cell carcinoma and/or determining a treatment for a patient with oropharyngeal squamous cell carcinoma.
- Figure 1a is a flowchart showing the steps of the prediction method;
- Figure 1b is a schematic box diagram showing the components of the system for implementing the prediction method;
- Figures 2a to 2l are photomicrographs showing examples of stained tumours;
- Figure 3 outlines the nature of the data which was used to develop the models used in the prediction method of Figure 1a;
- Figure 4 is a flowchart showing the steps of the training method;
- Figure 5 is a chart showing the H-score allocated to various biomarkers in a cohort of test samples;
- Figure 6 is a box plot of the coefficients fitted to each biomarker when training the model
- Figures 7a and 7b are calibration plots for the overall survival model using the training cohort data and the validation data respectively;
- Figures 7c and 7d are calibration plots for the recurrence free survival model using the training cohort data and the validation data respectively;
- Figures 8a and 8b are calibration plots for the overall survival model for the combined biomarkers and clinical factors using the training cohort data and the validation data respectively;
- Figures 9a to 9c show the probability of survival for low and high risk patients over a period of 5 years using the overall survival models having clinical factors only, biomarkers only and combined factors respectively where the data is from the validation cohort;
- Figure 9d shows the probability of survival for low and high risk patients over a period of 5 years using the overall survival model having biomarkers only where the data is from the training cohort;
- Figures 10a to 10c show the probability of survival for low, intermediate and high risk patients over a period of 5 years using the overall survival models having clinical factors only, biomarkers only and combined factors respectively where the data is from the validation cohort;
- Figures 11a to 11c show the probability of survival for the patients shown in Figures 9a to 9c separated into patients who did and did not have surgery where the data is from the validation cohort;
- Figure 11d shows the probability of survival for low and high risk patients over a period of 5 years using the overall survival model having biomarkers only where the data is from the training cohort and is separated into those who have or have not had surgery;
- Figures 12a and 12b show the probability of survival for low and high risk patients over a period of 5 years using the recurrence free survival models having biomarkers only and combined factors respectively where the data is from the validation cohort;
- Figures 12c and 12d show the probability of survival for low, intermediate and high risk patients over a period of 5 years using the recurrence free survival models having biomarkers only and combined factors respectively where the data is from the validation cohort;
- Figures 12e and 12f show the probability of survival for the patients shown in Figures 12a and 12b separated into patients who did and did not have surgery where the data is from the validation cohort;
- Figures 13a and 13b show the probability of survival for low and high risk patients over a period of 5 years using the recurrence free survival models having clinical markers only where the data is from the validation cohort; and
- Figures 13c and 13d show the probability of survival for the patients shown in Figures 13a and 13b separated into patients who did and did not have surgery where the data is from the validation cohort.
- Figure 1a shows the steps which may be carried out in a prediction method.
- the first step is to provide a tissue sample (step S100).
- the tissue sample may be obtained using any appropriate method, e.g. from a donor sample obtained using a biopsy. A tissue sample may thus be processed and fixed.
- the tissue sample may then be stained with appropriate reagents (step S102), for example haematoxylin and eosin for assessing the level of tumour infiltrating lymphocytes (TILs).
- Figures 2a to 2l are photomicrographs showing examples of stained tumours for each of the biomarkers p16, HPV, SURVIVIN, TILs and PLK-1. More information on the genes p16, SURVIVIN and PLK-1 is shown in the table below. HPV types are further described below.
- the staining is used to determine the presence of the biomarker and to assist in the next step which is to determine the score for each biomarker (step S104).
- the scoring was done by pathologists who had attended a training and calibration meeting and had undergone certification by scoring three hundred test samples. Preferably more than one, e.g. at least three pathologists, independently determine the score. It will be appreciated that other methods of scoring, e.g. automated methods, could also be used.
- Figure 2a shows a tissue sample with a tumour showing strong and diffuse nuclear and cytoplasmic staining for p16.
- Figure 2b shows a tissue sample with a tumour showing weak and diffuse cytoplasmic staining for p16.
- Figure 2c shows a tissue sample with a tumour with no staining for p16.
- the scoring for p16 is binary and thus either a positive or negative score is determined. A positive score is awarded if the tissue sample shows strong and diffuse nuclear and cytoplasmic staining present in >70% of the tumour.
- This threshold was selected based on a clinically validated cut off described in “Validation of methods for oropharyngeal cancer HPV status determination in US cooperative group trials” by Jordan et al published in Am J Surg Pathol. 2012;36(7):945-54. As outlined in “Human papillomavirus and survival of patients with oropharyngeal cancer” by Ang et al published in N Engl J Med 2010;363:24-35, this threshold can be equated with an H-score defined by:
- High risk HPV in-situ hybridization may also be scored in a binary fashion, i.e. positive or negative.
- High risk HPV DNA in situ hybridization may be stained using a Ventana INFORMVIII Family 16 probe to detect HPV-16, -18, -31, -33, -35, -39, -45, -51, -52, -56, -58, -66.
- Figure 2d shows a tissue sample with a High-risk HPV tumour showing diffuse nuclear and cytoplasmic staining.
- Figure 2e shows a tissue sample with a High-risk HPV tumour showing punctate nuclear staining.
- Figure 2f shows a tissue sample with a High-risk HPV tumour with no staining.
- the samples in Figures 2d and 2e are thus awarded a positive score and the sample shown in Figure 2f is awarded a negative score.
- the scoring for HPV-ISH may thus be as summarised in the table below:
- the SURVIVIN immunohistochemistry was assessed by assigning an intensity score selected from 0 for no staining, 1 for weak staining, 2 for moderate staining and 3 for strong staining together with a percentage of malignant cells stained at each intensity.
- the percentages may be rounded up or down to the nearest 5% to simplify the calculations.
- the H-score may then be calculated by using a weighted sum of the percentages, e.g.:
- $\text{H-score} = 0 \times a\% + 1 \times b\% + 2 \times c\% + 3 \times d\%$, where:
- a% is the percentage of cells with no staining
- b% is the percentage of cells with weak staining
- c% is the percentage of cells with moderate staining
- d% is the percentage of cells with strong staining.
- the value of the H-score will thus be between 0 and 300.
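The H-score calculation can be sketched directly from the weighted sum of percentages. The function name `h_score` and the check that the four percentages total 100 are assumptions added for illustration.

```python
def h_score(pct_none, pct_weak, pct_moderate, pct_strong):
    """Weighted sum of staining percentages, giving a value from 0 to 300.

    Each argument is the percentage (0-100) of malignant cells stained at
    that intensity. Assumption: the four percentages must total 100.
    """
    percentages = (pct_none, pct_weak, pct_moderate, pct_strong)
    if any(p < 0 for p in percentages):
        raise ValueError("percentages must be non-negative")
    if abs(sum(percentages) - 100) > 1e-9:
        raise ValueError("percentages must sum to 100")
    return 0 * pct_none + 1 * pct_weak + 2 * pct_moderate + 3 * pct_strong
```

For example, a sample with 25% of cells at each intensity gives 25 + 50 + 75 = 150.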
- Figures 2g, 2h and 2i show tissue samples with high, medium and low H-scores respectively for the SURVIVIN immunohistochemistry.
- the scoring for SURVIVIN may thus be as summarised in the table below:
- once the H-score has been calculated for the SURVIVIN immunohistochemistry, it may be used to calculate a Z-score.
- the first step is to scale the H-score so that the value is between 0 and 10. For the SURVIVIN immunohistochemistry this is achieved by dividing the H-score by 30 to generate an X-score.
- the next transformation allows the input score to be comparable to scores arising from different clinical cohorts and distributions. This is achieved by calculating an X-score for each sample in the cohort of samples. For example, in a cohort of 356 samples, it was found that the H-score ranged from 0 to 192.5 as shown in the upper part of Figure 2m. Thus the X-score ranged from 0 to 6.42.
- the mean X-score μ and the standard deviation σ of the X-scores for the cohort, together with the new value, are then calculated.
- the Z-score for each sample i is calculated as: $Z_i = (X_i - \mu) / \sigma$
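The scaling and standardisation steps can be sketched as below. The function names are hypothetical, the Z-score is taken to be the usual standardisation (X-score minus cohort mean, divided by cohort standard deviation), and the sample standard deviation from Python's `statistics.stdev` is an assumption, since the text does not say whether the sample or population form is used.

```python
from statistics import mean, stdev


def x_score(h):
    """Scale an H-score (0 to 300) to an X-score (0 to 10)."""
    return h / 30


def z_scores(cohort_h_scores):
    """Convert a cohort of H-scores (including any new sample) to Z-scores.

    Assumption: sigma is the sample standard deviation of the X-scores.
    """
    xs = [x_score(h) for h in cohort_h_scores]
    mu = mean(xs)
    sigma = stdev(xs)
    if sigma == 0:
        raise ValueError("all X-scores are identical; Z-score is undefined")
    return [(x - mu) / sigma for x in xs]
```

With the cohort maximum H-score of 192.5 quoted above, `x_score(192.5)` rounds to 6.42, which agrees with the stated X-score range.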
- the method of calculating an H-intensity score and conversion to a Z-score may also be followed for an additional biomarker PLK-1 where this biomarker is incorporated, e.g. as explained in more detail below.
- the H-score ranged from 0 to 165 as illustrated in the lower portion of Figure 2m.
- the X-score ranged from 0 to 5.5.
- the mean X-score μ and the standard deviation σ of the X-scores for the cohort, including the latest value, are then calculated so that the Z-score can be calculated as above.
- the level of tumour infiltrating lymphocytes was determined, for example, as described in “Tumour-infiltrating lymphocytes predict for outcome in HPV-positive oropharyngeal cancer” by Ward et al published in Br J Cancer. 2014;110(2):489-500.
- the scanning magnification was increased (e.g. to 2.5 times objective) and the samples were assigned to one of the following three categories: high, moderate or low as shown in the table below:
- Figures 2j, 2k and 2l show tissue samples with high, moderate and low scores for tumour infiltrating lymphocytes (TILs).
- although the scoring may be considered to be subjective, research shows that the scoring is consistent, including for antibodies such as PLK1 and Survivin which are described by continuous scoring.
- the table below shows that there is a high correlation between pathologists on the antibodies and stains used.
- although the individual scores by the different pathologists may vary, the variation is not high:
- the next step in the method is to determine the riskscore using the determined scores for each of the biomarkers (step S106).
- the riskscore is determined by inputting the determined scores into one of two models which have been developed as explained in more detail below.
- the first model determines a primary riskscore which may be termed an overall survival (OS) riskscore and which may be calculated as the time from diagnosis until death or censored at the last clinical contact.
- the second model determines a secondary riskscore which may be termed a recurrence free survival riskscore and which may be calculated as the time from diagnosis until the date when loco-regional recurrence or distant metastasis was first confirmed by radiology and/or biopsy. For consistency, survival time was truncated, for example at 5 years.
- the primary riskscore may for example be calculated by using a weighted sum of the scores for each biomarker, i.e.
- Each of the other biomarkers is treated as a factor using the lowest category as a baseline group, i.e. a baseline score is assigned to the baseline group and a second score is assigned to the other group.
- the magnitude of the baseline and second scores sums to 1 when the Cox proportional hazards model is used as described in more detail below and it is the baseline or second score which is used as the adjusted score in the calculation of the riskscore.
- p16 has a baseline score of -0.6603053 and a second score of 0.3396947.
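The mapping from a binary p16 result to its adjusted score can be sketched as a simple lookup using the two values quoted above. The function name is hypothetical, and treating the negative result as the baseline group follows the statement that the lowest category is used as the baseline.

```python
P16_BASELINE_SCORE = -0.6603053  # baseline group (p16 negative)
P16_SECOND_SCORE = 0.3396947     # other group (p16 positive)


def adjusted_p16_score(is_positive):
    """Return the adjusted p16 score used in the riskscore calculation."""
    return P16_SECOND_SCORE if is_positive else P16_BASELINE_SCORE
```

The magnitudes of the two scores sum to 1, as the text requires: 0.6603053 + 0.3396947 = 1.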
- the biomarker HPV is treated in a similar manner to p16 and the values are shown below.
- the table below shows examples of determined scores, scores adjusted for the riskscore calculation, weightings for each biomarker score and the overall contribution to the riskscore for each biomarker:
- for TILs, there are three determined scores (high, moderate and low) as explained above.
- the weighting of the score for TILs can be separated into two components:
- $\text{riskscore} = -0.9652818 \times S_{p16} + \dots$
- the model was developed as a Cox proportional hazards model and thus the outcome is interpreted as a hazards ratio which represents an incremental increase in the hazard (e.g. death for the first model).
- Positive coefficients such as those for negative scores for p16 and High risk HPV in-situ hybridization and for low and moderate scores for TILs indicate that these scores are associated with a poor outcome (i.e. low overall survival riskscore).
- negative coefficients such as those for positive scores for p16 and High risk HPV in-situ hybridization and for a high score for TILs indicate that these scores are associated with a good outcome (i.e. high overall survival riskscore).
- the model is validated in more detail as explained below but as an indication, the expression of the model is in line with the results in the table below which show the mean or median 3 and 5 year overall survival (OS) rates for the subjects with different combinations of biomarker signatures.
- the table also indicates the overall survival rates based on the nature of the treatment:
- α is the weighting applied to $S_{p16}$, which is the adjusted score for the biomarker p16
- γ is the weighting applied to $S_{TILs}$, which is the adjusted score for the biomarker TILs
- δ is the weighting applied to $S_{sur}$, which is the adjusted score for the biomarker SURVIVIN
- ε is the weighting applied to $S_{PLK1}$, which is the adjusted score for the new biomarker PLK1.
- the final value of the determined score (i.e. the Z-score) for the SURVIVIN immunohistochemistry may be adjusted by subtracting an adjustment factor, e.g. 0.11409913, before the weighting is applied.
- the final value of the determined score for PLK1 may be adjusted by subtracting an adjustment factor, e.g. 0.07201744 before summing using the weighting.
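The adjustment of the continuous scores can be sketched as a subtraction followed by the weighting. The function names are hypothetical, the adjustment factors are the example values quoted above, and the weight passed to `weighted_contribution` is whatever coefficient the fitted model supplies.

```python
SURVIVIN_ADJUSTMENT = 0.11409913  # example adjustment factor for SURVIVIN
PLK1_ADJUSTMENT = 0.07201744      # example adjustment factor for PLK1


def adjusted_score(z_score, adjustment):
    """Subtract the adjustment factor from the determined Z-score."""
    return z_score - adjustment


def weighted_contribution(z_score, adjustment, weight):
    """Contribution of one continuous biomarker to the riskscore."""
    return weight * adjusted_score(z_score, adjustment)
```

A sample whose Z-score equals the adjustment factor therefore contributes nothing to the riskscore for that biomarker.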
- the table below shows examples of determined scores, scores adjusted for the riskscore calculation, weightings for each biomarker score and the overall contribution to the riskscore for each biomarker:
- the weighting of the score for TILs can be separated into two components:
- the riskscore is compared to a threshold (step S108). If the riskscore is equal to or above the threshold, there is deemed to be a high risk that the patient will not survive and thus the patient is classified as a high risk patient (step S110). Conversely if the riskscore is below the threshold, there is deemed to be a low risk that the patient will not survive and thus the patient is classified as a low risk patient (step S112).
- the threshold may for example be the median riskscore of the data which was used to train the model, for example 0.952636 which has been determined as explained below.
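The thresholding step can be sketched as follows. The function names are hypothetical; the default threshold is the example median quoted above, and `median_threshold` shows how such a value might be derived from the riskscores of a training cohort.

```python
from statistics import median

EXAMPLE_THRESHOLD = 0.952636  # example median riskscore of the training data


def median_threshold(training_riskscores):
    """Derive a threshold as the median riskscore of the training cohort."""
    return median(training_riskscores)


def classify(riskscore, threshold=EXAMPLE_THRESHOLD):
    """High risk when the riskscore is equal to or above the threshold."""
    return "high risk" if riskscore >= threshold else "low risk"
```

Note that a riskscore exactly equal to the threshold is classified as high risk, matching step S108 as described.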
- the riskscore may be compared to an upper and a lower threshold. If the riskscore is equal to or above the upper threshold, the patient is classified as a high risk patient. If the riskscore is below the lower threshold, the patient is classified as a low risk patient. If the riskscore is between the two thresholds, the patient is classified as an intermediate risk patient.
- the upper and lower thresholds may be determined as the tertiles of the riskscores determined from the training cohort as explained below.
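The three-tier classification can be sketched in the same way. The function names are hypothetical, and computing the tertiles with `statistics.quantiles` (Python 3.8+, default interpolation method) is an assumption, since the text does not say how the tertile cut points were calculated.

```python
from statistics import quantiles


def tertile_thresholds(training_riskscores):
    """Return (lower, upper) cut points splitting the cohort into thirds."""
    lower, upper = quantiles(training_riskscores, n=3)
    return lower, upper


def classify_three_tier(riskscore, lower, upper):
    """High risk at or above the upper threshold, low risk below the lower
    threshold, intermediate risk otherwise."""
    if riskscore >= upper:
        return "high risk"
    if riskscore < lower:
        return "low risk"
    return "intermediate risk"
```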
- the riskscore may optionally be used to decide on the most appropriate treatment. For example, for a high risk patient, surgery is recommended. Furthermore, the surgery should be combined with either postoperative radiotherapy or postoperative chemoradiotherapy depending on the post-operative pathological conditions. Such treatment results in a better overall survival rate than chemotherapy alone. By contrast, for a low risk patient, the treatment can be selected from either primary chemotherapy or the combined surgical approach specified above. Both options are equally effective in such cases.
- the trained pathologists provide highly correlated scores. Nevertheless, the impact of variation in scoring may be reduced further using the method described above.
- the method of scaling the H-intensity score, conversion to a Z-score and correction by a scalar may help to reduce the risk of any discrepancies occurring based on a different score being awarded.
- the scaled H scores, the Z scores and adjusted Z scores are shown below for two original differing values. It will be appreciated that the division by 30 and transformation into a Z score reduces the impact of the different classifications:
- the threshold which is also determined from the training data defines a low risk patient as one having a riskscore below 0.952636 and thus in both cases, the classification is as a high risk patient.
- the riskscore calculation thus minimises the effect of an incorrect classification by a pathologist of one of the biomarkers.
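The effect of the scaling steps on two differing pathologist scores can be illustrated with a short sketch. Only the division by 30, the conversion to a Z-score and the correction by a scalar come from the text; the cohort mean, standard deviation and adjustment value used below are hypothetical placeholders, and the conventional 0 to 300 H-score range is assumed.

```python
def adjusted_z_score(h_score: float, mean: float, sd: float, adjustment: float) -> float:
    """Scale an H-score by 30, standardise it, then subtract an adjustment factor."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    scaled = h_score / 30.0       # assumed H-score range 0..300 becomes 0..10
    z = (scaled - mean) / sd      # Z-score relative to assumed cohort statistics
    return z - adjustment


# Hypothetical cohort statistics and adjustment (NOT taken from the source).
MEAN, SD, ADJUSTMENT = 5.0, 2.5, 0.0

score_a = adjusted_z_score(150, MEAN, SD, ADJUSTMENT)  # first pathologist
score_b = adjusted_z_score(180, MEAN, SD, ADJUSTMENT)  # second pathologist
gap = score_b - score_a  # a 30-point H-score disagreement shrinks to 0.4 here
```

Under these assumed values a 30-point disagreement in the raw score becomes a difference of 0.4 in the adjusted score, which is then further scaled by the biomarker weight before it reaches the riskscore.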
- A schematic of an associated system for performing the method is shown in Figure 1b.
- the system comprises a computing device 10 which could be a handheld device which is portable for a clinician to transport from patient to patient and an app could be loaded onto the device for calculation of the riskscore.
- the computing device 10 comprises the standard components such as a processing unit or processor 20, a user interface unit 22 for allowing a user to input information, e.g. determined scores, and a memory 24 for storing the code to perform the calculation and/or the threshold for comparing the calculated riskscore.
- the user interface may display information or alternatively, there may be a display 24 for displaying information to a user, e.g. a calculated riskscore and/or a suggestion for treatment as described above and a communications module 28 for communicating with other devices and/or accessing the cloud 40, e.g. to process the riskscore.
- the tissue sample 30 is also shown schematically.
- This schematic system may be constructed, partially or wholly, using dedicated special-purpose hardware.
- Terms such as ‘module’ or ‘unit’ used herein may include, but are not limited to, a hardware device, such as circuitry in the form of discrete or integrated components, a Field Programmable Gate Array (FPGA) or Application Specific Integrated Circuit (ASIC), which performs certain tasks or provides the associated functionality.
- the described elements may be configured to reside on a tangible, persistent, addressable storage medium and may be configured to execute on one or more processors.
- These functional elements may in some embodiments include, by way of example, components, such as software components, object-oriented software components, class components and task components, processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuitry, data, databases, data structures, tables, arrays, and variables.
- Figure 3 illustrates some of the nature of the data which was used to train and then validate the model above.
- data for 985 subjects was used.
- the data includes information from a retrospective longitudinal cohort of consecutive patients 18 years of age or more, who had OPSCC treated with curative intent between 1 January 1999 and 31 December 2012, with either chemo/radiotherapy or surgery with or without radiotherapy/ chemoradiotherapy.
- 600 cases from nine secondary care head and neck cancer treatment centres in the UK and Tru were used.
- An independent cohort of 385 consecutive OPSCC patients undergoing curative treatment between 2002 and 2011 was used for validation.
- the characteristics of the training and validation cohorts are well matched as outlined above. However, there were differences in the proportion of treatments received in each set.
- the model was developed and validated as outlined in Figure 4.
- Both of the initial first and second models, i.e. to determine overall survival and recurrence free survival, included only biomarkers, and in the first step S200 of Figure 4 the various biomarkers which could be included in the models are identified.
- a biomarker is a biological molecule or cell found in blood, other body fluids, or tissues that is a sign of a normal or abnormal process, or a condition or disease.
- the biomarkers which are considered are listed below together with an indication of the methods for creating the appropriate tissue samples (step S202) from which they can be scored. It is noted that TILs, although not listed in the table below, is also included as a biomarker as described above:
- IVD In Vitro Diagnostic Device with CE marking
- RUO Research Use Only which refers to a reagent which was optimised for the study
- RTU Ready to Use (prediluted)
- DAB 3,3’-diaminobenzidine.
- the next step is to determine the score for each biomarker in each sample.
- the biomarkers BCL2, COX2, Cyclin D1, EGFR-external, HIF1α, PLK1 and Survivin are treated as continuous variables and thus their score is determined as a Z-score as described above.
- the mean, median and distribution of the scores is summarised in the table below and the full distribution is shown in more detail in Figure 5.
- although biomarker p16 is shown in Figure 5 and in the table above, it is treated, like two other biomarkers (high risk HPV DNA ISH and TILs), as a factor using the lowest category as a baseline group. The score for each of these factors is calculated as described above.
- $h(t) = h_0(t) \times \exp(b_1 x_1 + b_2 x_2 + \dots + b_n x_n)$
- $t$ represents the survival time
- $h(t)$ is the hazard function determined by a set of $n$ covariates $(x_1, x_2, \dots, x_n)$
- the set $(b_1, b_2, \dots, b_n)$ are weights (or coefficients) for each covariate
- the term $h_0$ is called the baseline hazard, which corresponds to the value of the hazard if all the $x_i$ are equal to zero (the quantity $\exp(0)$ equals 1).
- The ‘t’ in $h(t)$ reminds us that the hazard may vary over time. However, the time variance can be removed so that the model can be rewritten in linear form by taking the log of the hazard ratio for patient $i$ to the reference group, and this may be written as:
- $\log\left(\frac{h_i(t)}{h_0(t)}\right) = b_1 x_{1i} + b_2 x_{2i} + \dots + b_n x_{ni}$
- This linear equation is known as the Cox Proportional Hazards model with a set of $n$ covariates $(x_{1i}, x_{2i}, \dots, x_{ni})$ for each patient $i$ and a set $(b_1, b_2, \dots, b_n)$ of weights which optimise the model for all patients.
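The relationship between the weighted sum and the hazard can be made concrete with a small sketch. The function names are illustrative assumptions; the arithmetic is the standard Cox proportional hazards form, in which the exponential of the linear predictor gives the hazard of a patient relative to the baseline.

```python
import math
from typing import Sequence


def linear_predictor(weights: Sequence[float], covariates: Sequence[float]) -> float:
    """Weighted sum b_1*x_1i + ... + b_n*x_ni for one patient."""
    if len(weights) != len(covariates):
        raise ValueError("weights and covariates must have the same length")
    return sum(b * x for b, x in zip(weights, covariates))


def hazard_ratio_to_baseline(weights: Sequence[float], covariates: Sequence[float]) -> float:
    """h_i(t) / h_0(t) = exp(linear predictor); independent of t under proportional hazards."""
    return math.exp(linear_predictor(weights, covariates))
```

When every covariate is zero the ratio is exactly 1, which matches the definition of the baseline hazard.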
- univariate analysis, i.e. considering each variable in turn
- the coefficient was calculated together with the lower and upper limits for the 95% confidence interval around the coefficient (CI95L and CI95U respectively).
- the P-value is a measure of the statistical significance of the variable and is calculated either using the Wald-test or the Log-rank test.
- the Q-value is an adjusted P-value using the Benjamini & Hochberg method.
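A minimal sketch of the Benjamini and Hochberg adjustment is given below for illustration. The function name and the example P-values are assumptions and are not results from the study.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values (Q-values) in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that the adjusted
    # values never increase as the raw P-value decreases.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


# Illustrative P-values only (not from the source).
example_q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```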
- The next step was to build a multivariate model which combined a plurality of variables (step S208) using the Cox proportional hazards model above.
- the stability of the variables was then tested by performing repeated iterations of bootstrapping on the training cohort.
- the Cox proportional hazards model is fitted with backwards refinement on each subset to generate a new set of coefficients and possibly a new selection of variables (Step S210).
- the variable selection was guided by the Akaike Information Criterion which is a known estimator of the relative quality of statistical models.
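The stability check can be sketched as below. This is an outline under stated assumptions, not the procedure actually run by the inventors: `select_variables` is a hypothetical callable standing in for "fit the Cox model with backwards refinement and return the retained variable names", and the number of iterations and the seed are arbitrary. The `aic` helper uses the standard definition of the Akaike Information Criterion.

```python
import random
from collections import Counter
from typing import Callable, Dict, List, Sequence, TypeVar

T = TypeVar("T")


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion; lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


def selection_frequency(
    cohort: Sequence[T],
    select_variables: Callable[[List[T]], Sequence[str]],
    n_iterations: int = 100,
    seed: int = 0,
) -> Dict[str, float]:
    """Fraction of bootstrap resamples in which each variable is retained."""
    rng = random.Random(seed)
    counts: Counter = Counter()
    for _ in range(n_iterations):
        # Resample the cohort with replacement, keeping the original size.
        resample = [rng.choice(cohort) for _ in cohort]
        counts.update(set(select_variables(resample)))
    return {name: count / n_iterations for name, count in counts.items()}
```

A variable that is retained in nearly every resample would be regarded as stable; the cut-off for "nearly every" is not stated in this section.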
- the next step (S214) is to validate the model and as explained above this was based on 385 independently collated cases.
- the validation repeated steps S100 to S106 as described in Figure 1 a above to generate a riskscore based on each model for each patient in the validation cohort.
- the overall survival model is prognostic and also shows a trend towards prediction of improved survival following surgical treatment for both high-risk patients and low risk patients.
- the recurrence free survival model is demonstrated below to be prognostic but not predictive for any specific treatment.
- each of the biomarkers-only models (the overall survival model and the recurrence free survival model) was compared with the same models developed using clinical factors only and using a combination of biomarkers and clinical factors.
- the clinical factors were all treated as factors using the lowest category as baseline group and were gathered where possible from patient records.
- the T-category describes the primary tumour with category 0 meaning that there is no evidence of a primary tumour and categories 1 to 4 awarded depending on the size and amount of spread into nearby tissues.
- the N-category describes whether the cancer has spread into nearby lymph nodes with category 0 meaning that the nearby lymph nodes do not contain cancer and categories 1 to 3 are awarded depending on the size, location and number of nearby lymph nodes which are affected.
- the following table sets out the information for the optional univariate analysis using the Cox proportional hazards model for overall survival as described above in step S206 of Figure 4:
- Figures 9a to 9c compare the performance of each of the overall survival models created using the clinical markers only, the biomarkers only and a combination of clinical and biomarkers.
- the cohort is divided into two risk groups - low and high - by comparing the riskscore calculated by the model to a threshold equal to the median riskscore calculated for the training cohort.
- the high risk group contains patients having a calculated riskscore above or equal to the threshold.
- the low risk group contains patients having a calculated riskscore below the threshold.
- Figures 9a to 9c also show the Hazard Ratio (HR) for the high-risk group which uses the low-risk group as the baseline.
- the lower and upper limits to the 95% confidence interval for the HR are shown in brackets.
- the value P is calculated using the Wald test. For each model the numbers of patients in each group at the beginning of each year is shown below each graph and reproduced in the table below:
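For illustration, a hazard ratio, its 95% confidence interval and a Wald P-value can be derived from a single fitted coefficient and its standard error as sketched below. This uses the usual normal approximation; the function name is an assumption, and the source does not state which software produced the reported values.

```python
import math
from typing import Tuple


def hazard_ratio_summary(coef: float, se: float, z_crit: float = 1.959964) -> Tuple[float, float, float, float]:
    """Return (HR, CI95 lower, CI95 upper, two-sided Wald P) for one Cox coefficient."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    hr = math.exp(coef)
    lower = math.exp(coef - z_crit * se)
    upper = math.exp(coef + z_crit * se)
    z = coef / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return hr, lower, upper, p_value
```

A coefficient of zero corresponds to a hazard ratio of 1, i.e. no difference between the high-risk group and the low-risk baseline.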
- each model successfully stratifies the two groups and is thus prognostic.
- the prognostic performance for the models using the biomarkers only and the combined factors are better than the prognostic performance for the model using clinical factors only.
- the performance of the model on the validation cohort can also be compared with the performance of the model on the training cohort.
- Figure 9d thus shows the results of the predicted overall survival model using biomarkers only using the data from the patients in the training cohort.
- the model thus also stratifies the high and low risk groups for the patients in the training cohort. It will be appreciated that this comparison can also be done for the other models.
- Figures 10a to 10c also compare the performance of each of the overall survival models created using the clinical markers only, the biomarkers only and a combination of clinical and biomarkers.
- the patients are divided into three sets of risk groups - high risk, intermediate risk and low risk - by comparing the riskscore calculated by the model to thresholds calculated for the training cohort.
- the high risk group contains patients having a calculated riskscore above or equal to a threshold which is the value of the riskscore at the lower end of the upper tertile of riskscores for the training group.
- the low risk group contains patients having a calculated riskscore below the threshold which is the value of the riskscore at the upper end of the lower tertile of riskscores for the training group.
- the intermediate risk group contains patients having a calculated riskscore between the first and second thresholds.
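Deriving the thresholds from the training cohort can be sketched as follows. The function name is hypothetical and the riskscores in the usage line are made up; note also that quantile conventions differ between tools, and the default "exclusive" method of Python's `statistics.quantiles` is used here only because the source does not say which convention was applied.

```python
import statistics
from typing import Dict, Sequence


def training_thresholds(training_riskscores: Sequence[float]) -> Dict[str, float]:
    """Median (two-group split) and tertile cut points (three-group split)."""
    scores = sorted(training_riskscores)
    lower_tertile, upper_tertile = statistics.quantiles(scores, n=3)
    return {
        "median": statistics.median(scores),
        "lower": lower_tertile,
        "upper": upper_tertile,
    }


# Made-up riskscores for illustration only.
thresholds = training_thresholds([5, 1, 4, 2, 3])
```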
- Figures 10a to 10c also show the Hazard Ratios (HR) for the intermediate and high-risk groups which use the low-risk group as the baseline. The lower and upper limits to the 95% confidence interval for the HR are shown in brackets. The value P is calculated using the Wald test and/or log-rank test. For each model the numbers of patients in each group at the beginning of each year is shown below each graph. As shown in the graphs, each model successfully stratifies the high, intermediate and low risk groups for the patients in the validation cohort and is thus prognostic for the overall survival model.
- Figures 11a to 11c also compare the performance of each of the overall survival models created using the clinical markers only, the biomarkers only and a combination of clinical and biomarkers.
- the patients are divided into four sets of risk groups.
- First the patients are divided into high risk and low risk in the same manner as for Figures 9a to 9c.
- Each of the high risk and low risk groups are then divided into those who had surgery (S1) and those who have not had surgery (S2).
- the clinical only model is not able to stratify the four groups and is thus not predictive for improved survival following surgery.
- the biomarker only model was able to clearly stratify the high risk group and show improved survival rates for those who underwent surgery.
- the biomarker only model is predictive of improved survival following surgical treatments for high risk patients.
- the biomarker and clinical factors model is also weakly predictive of improved survival following surgical treatments for high risk patients.
- Figures 12a and 12b compare the performance of each of the recurrence free survival models created using the biomarkers only and a combination of clinical and biomarkers.
- the cohort is divided into two risk groups - low and high.
- Figures 12c and 12d compare the performance of each of the recurrence free survival models created using the biomarkers only and a combination of clinical and biomarkers.
- the cohort is divided into three risk groups - low, intermediate and high. As shown each of the models is prognostic of recurrence free survival by successfully stratifying each of the three groups.
- Figures 12e and 12f compare the performance of each of the recurrence free survival models created using the biomarkers only and a combination of clinical and biomarkers.
- the patients are divided into four sets of risk groups: high risk with and without surgery or low risk with and without surgery.
- the model using biomarkers only is weakly predictive of improved survival for the high risk group after surgery but is not predictive for the low risk group.
- the model using a combination of factors does not stratify either the high risk or the low risk group and is thus not predictive of improved survival following surgery.
- Figures 13a and 13b compare the performance of each of the recurrence free survival models and, as described above, the cohort is divided into two risk groups - low and high. As shown in Figure 13a, the recurrence free survival model using TNM7 staging did not stratify the two groups whereas, as shown in Figure 13b, the recurrence free survival model using TNM8 staging does stratify the two groups and is thus prognostic of survival rates.
- Figures 13c and 13d compare the performance of each of the recurrence free survival models using clinical only factors developed with TNM7 staging and TNM8 staging respectively. As described above, the low and high risk groups are divided into two groups depending on whether or not they had surgery. As shown neither recurrence free survival model stratified the four groups and thus neither model is predictive of improved survival rates after surgery.
- Figures 7a to 13d confirm that the biomarker only models described above outperform the model using clinical only factors as well as the model using a combination of both biomarker and clinical factors.
Abstract
We describe a method for predicting a level of risk for a patient with oropharyngeal squamous cell carcinoma and use of SURVIVIN, p16, high risk HPV and tumour infiltrating lymphocytes (TILs) in a method for predicting a survival outcome for a patient with oropharyngeal squamous cell carcinoma or a method for determining a treatment for a patient with oropharyngeal squamous cell carcinoma. The method for predicting a level of risk comprises providing a tissue sample from the patient and determining the presence of at least four biomarkers in said sample. For each of the at least four biomarkers, the method further comprises determining a score indicative of a level of the biomarker within the tissue sample, calculating a riskscore based on the determined scores, wherein the riskscore is calculated by summing weighted biomarker scores, wherein the biomarker scores are based on the determined scores and each biomarker score has an associated weight; and comparing the riskscore to a threshold to predict whether the patient is high risk. The four biomarkers are selected from the group consisting of SURVIVIN, PLK1, p16, high risk HPV and tumour infiltrating lymphocytes (TILs).
Description
METHOD OF PREDICTING SURVIVAL RATES FOR OROPHARYNGEAL CANCER PATIENTS
FIELD
[01] The invention relates to a method of predicting treatment selection and/or survival rates for cancer patients, particularly those with oropharyngeal squamous cell carcinoma.
BACKGROUND
[02] Over the past two decades, there has been a remarkable change in the epidemiology and aetiology of oropharyngeal squamous cell carcinoma (OPSCC), which is now one of the most rapidly rising cancers in the Western world. This increasing incidence has been attributed to oncogenic human papillomavirus (HPV) infection (for example as evidenced in “Human papillomavirus and rising oropharyngeal cancer incidence in the United States” by Chaturvedi et al published in J Clin Oncol 2011;29:4294-4301). Patients with HPV-related OPSCC appear to have better survival outcomes than those with HPV-negative cancers at the same site (for example as evidenced in “Tumor stage, human papillomavirus and smoking status affect the survival of patients with oropharyngeal cancer: an Italian validation study” by Granata et al published in Ann Oncol. 2012;23(7):1832-7 or “Human Papillomavirus and Survival of Patients with Oropharyngeal Cancer” by Ang et al published in NEJM 2010;363:24-35).
[03] There are two effective treatments: chemoradiotherapy or surgery (which may be followed by radiotherapy or another adjuvant treatment). Selection is usually guided by surgical resectability, clinician preference and patient choice.
[04] Methods for determining the prognosis of a subject with OPSCC have been developed, for example as described in US9809857 to Washington University which uses miRNAs. EP2780714 describes methods and kits for the evaluation of risk of head and neck squamous cell carcinoma.
[05] Other examples of prognosis methods for patients with OPSCC, which use classifiers including some of p16, HPV status, clinical stage, smoking history, alcohol consumption and comorbidities, are described in “Human papillomavirus and survival of patients with oropharyngeal cancer” by Ang et al in N Engl J Med 2010;363:24-35; “Tumor stage, human papillomavirus and smoking status affect the survival of patients with oropharyngeal cancer: an Italian validation study” by Granata et al in Ann Oncol. 2012;23(7):1832-7; “Human papillomavirus detection and comorbidity: critical issues in selection of patients with oropharyngeal cancer for treatment De-escalation trials” by Rietbergen et al in Ann Oncol. 2013;24(11):2740-5; “Different prognostic models for different patient populations: validation of a new prognostic model for patients with oropharyngeal cancer in Western Europe” by Rietbergen et al in Br J Cancer. 2015;112(11):1733-6; “Refining American Joint Committee on Cancer/Union for International Cancer Control TNM stage and prognostic groups for human papillomavirus-related oropharyngeal carcinomas” by Huang et al in J Clin Oncol. 2015 Mar 10;33(8):836-45 and “Development and validation of a staging system for HPV-related oropharyngeal cancer by the International Collaboration on Oropharyngeal cancer Network for Staging (ICON-S): a multicentre cohort study” by O'Sullivan et al in Lancet Oncol. 2016;17(4):440-51.
[06] “Prognostic biomarkers of survival in oropharyngeal squamous cell carcinoma: systematic review and meta-analysis” by Rainsbury et al in Head Neck. 2013;35(7):1048-55 describes how several other biomarkers also encode prognostic information for OPSCC. However, the included studies were found to be generally under-powered, with uncertainties around validation and reproducibility, differing scoring methods and lack of consensus over ‘cut off’ points. “Recurrent Squamous-Cell Carcinoma of the Head and Neck” by Ferris et al in New England Journal of Medicine. 2016 Nov 10;375(19):1856-67 describes other biomarkers which have not yet been validated, for immunotherapy in the palliative treatment of patients with recurrent and/or metastatic head and neck cancer.
[07] “Human papillomavirus-associated oropharyngeal cancer: defining risk groups and clinical trials” by Bhatia et al in J Clin Oncol. 2015;33(29):3243-50 also explains how to stratify and recruit patients into clinical trials.
[08] WO2016/094330 describes computer-implemented machine learning methods for assessing a likelihood that a patient has a disease. US2008/0133141 describes methods for scoring one or more biomarkers in a test sample and determining a subject’s risk of developing a medical condition.
[09] The present applicant has recognised the need for additional modelling to assist clinicians when deciding on treatment.
SUMMARY
[10] According to the present invention there is provided an apparatus and method as set forth in the appended claims. Other features of the invention will be apparent from the dependent claims, and the description which follows.
[11] We describe a method for predicting a level of risk for a patient with oropharyngeal squamous cell carcinoma, the method comprising: providing a tissue sample from the patient; determining the presence of at least four biomarkers in said sample; for each of said at least four biomarkers, determining a score indicative of a level of the biomarker present within the tissue sample; calculating a riskscore based on the determined scores, wherein the riskscore is calculated by summing weighted biomarker scores, wherein the biomarker scores are based on the determined scores and each biomarker score has an associated weight; and comparing the riskscore to a threshold to predict whether the patient is high risk; wherein the four biomarkers are selected from the group consisting of SURVIVIN, PLK1, p16, high risk HPV and tumour infiltrating lymphocytes (TILs).
[12] We also describe a method for predicting a level of risk for a patient with oropharyngeal squamous cell carcinoma, the method comprising: providing a tissue sample from the patient; determining the presence of at least four biomarkers in said sample; for each of said at least four biomarkers, determining a score indicative of a level of the biomarker present within the tissue sample; calculating a riskscore based on the determined scores; and comparing the riskscore to a threshold to predict whether the patient is high risk; wherein the four biomarkers are selected from the group consisting of SURVIVIN, PLK1, p16, high risk HPV and tumour infiltrating lymphocytes (TILs).
[13] Where a weighted sum is used, the weighted sum may be a linear equation known as the Cox proportional hazards model; thus, in general terms, the riskscore may be expressed as:
$\text{riskscore} = b_1 x_{1i} + b_2 x_{2i} + \dots + b_n x_{ni}$
where there is a set of $n$ covariates $(x_{1i}, x_{2i}, \dots, x_{ni})$ for each patient $i$ and a set $(b_1, b_2, \dots, b_n)$ of weights which optimise the model for all patients. The covariates are the biomarker scores for the four selected biomarkers and the weights are the weights associated with each covariate for each biomarker. A biomarker may be defined as a biological molecule or cell found in blood, other body fluids, or tissues that is a sign of a normal or abnormal process, or a condition or disease.
[14] The values for each biomarker score (i.e. covariate) and weight for each biomarker may be determined by training the model on a set of training data comprising information about the values of a plurality of biomarkers in a set of patients and the associated outcomes for each patient. The plurality of biomarkers comprises at least the biomarkers listed above and may further comprise additional biomarkers such as cyclin D1, EGFR external, BCL2, COX2 and HIF1 alpha. The method may further comprise selecting the plurality of biomarkers to be used when training the model. The biomarkers which are selected to be included when training the model may influence the effectiveness of the prediction using the riskscore. For example, when different biomarkers are used to train the model, the weighting for the biomarkers which are included in the riskscore may be different and/or the biomarkers which are ultimately included in the riskscore may be different. Although combinations of biomarkers have previously been suggested for other illnesses, the particular combination which is used in the riskscore for this particular tumour (OPC) and the particular combination which is used in the training are not known.
[15] The threshold may be the median riskscore for the training data. Appropriate selection of the threshold may also influence the effectiveness of the prediction using the riskscore. When the threshold is based on the training data, it will be appreciated that appropriate selection of the dataset which is used for training is important to provide a useful threshold.
[16] The training may be done using the same initial steps as the prediction method, e.g. by providing a tissue sample and determining a score for each biomarker. These determined scores may then be used to build a model which is subsequently refined. Once the model is stable, the model may be output, i.e. the weightings for the biomarkers to be included in the model may be output.
[17] Determining a score indicative of a level of the biomarker may comprise determining a scaled intensity score, for example a Z-score which is a scaled version of an H-score (histoscore). The H-score is a known method of assessing the extent of a biomarker within a sample. This may be particularly appropriate where the level of the biomarker can be treated as a continuous variable. For example, the scaled intensity score may be the determined score for at least one of SURVIVIN and PLK1. Before the determined score is included as the biomarker score in the calculation of the riskscore, adjustments to the scaled intensity score may be made. For example, the biomarker score may be based on the scaled intensity score which has been adjusted by subtracting an adjustment factor. In other words, the biomarker score may be an adjusted determined score.
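For readers unfamiliar with the histoscore, a brief sketch follows. The source does not spell out the formula, so the conventional definition is assumed here (percentage of cells at each staining intensity, weighted 1 for weak, 2 for moderate and 3 for strong, giving a range of 0 to 300); the function name and argument names are illustrative.

```python
def h_score(pct_weak: float, pct_moderate: float, pct_strong: float) -> float:
    """Conventional H-score: 1*weak% + 2*moderate% + 3*strong% (range 0..300)."""
    percentages = (pct_weak, pct_moderate, pct_strong)
    if min(percentages) < 0 or sum(percentages) > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    return 1 * pct_weak + 2 * pct_moderate + 3 * pct_strong
```

For example, a sample with 20% weakly, 30% moderately and 10% strongly stained cells would receive an H-score of 110 under this definition, before any scaling or conversion to a Z-score.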
[18] Alternatively, determining a score indicative of a level of the biomarker may comprise awarding a first value when the level is above a threshold and a second value when the level is below the threshold. In other words, the determined score is binary - either having the first value or the second value - the first value may be considered to show a high level of the biomarker and the second value a low level. This may be appropriate for biomarkers which are treated as factors having a lowest category used as the baseline. For example, the first or second value may be the determined score for at least one of p16 and high risk HPV. The biomarker score may also be an adjusted determined score and the adjustment may provide biomarker scores which are based on the determined score and optimise the model.
[19] It may be necessary to include more than two categories of score but still wish to avoid a fully continuous scoring system. Accordingly, determining a score indicative of a level of the biomarker may comprise awarding a first value when the level is above an upper threshold, a second value when the level is below the upper threshold but above a lower threshold and a third value when the level is below the lower threshold. In other words, the first value may be considered to show a high level of the biomarker, the second value an intermediate level and the third value a low level. This may be an appropriate scoring method for TILs. Thus, the first, second or third value may be the determined score for TILs.
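The two categorical scoring schemes can be sketched as below. The function names are assumptions, the returned integers 1, 2 and 3 simply stand for the "first", "second" and "third" values of the text, and the treatment of a level exactly equal to a threshold (counted as not above it) is an assumption because the text leaves that case open.

```python
def binary_category(level: float, threshold: float) -> int:
    """Return 1 (first value) if the level is above the threshold, else 2 (second value)."""
    return 1 if level > threshold else 2


def three_level_category(level: float, lower: float, upper: float) -> int:
    """Return 1 above the upper threshold, 2 between the thresholds, 3 otherwise."""
    if lower > upper:
        raise ValueError("lower threshold must not exceed upper threshold")
    if level > upper:
        return 1
    if level > lower:
        return 2
    return 3
```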
[20] Where the biomarker is classified into three (or more) categories for determined score, the individual weighted component for that biomarker may need to be separated into a plurality of components. For example, the weighted biomarker score for the biomarker TILs may comprise a sum of a first weighted biomarker score when TILs equals the second value and a second weighted biomarker score when TILs equals the third value, e.g.
$\gamma S_{TILs} = \gamma_2 S_{TILs2} + \gamma_3 S_{TILs3}$
where $\gamma$ is the weight for the biomarker score of TILs ($S_{TILs}$), $\gamma_2$ is the weight when TILs equals the second value and $\gamma_3$ is the weight when TILs equals the third value. In other words, the biomarker score included in the riskscore calculation may also be an adjusted determined score.
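One way to read the two-component TILs term is as indicator (dummy) coding, sketched below. This coding is an assumption: the text states only that one weight applies when TILs equals the second value and another when it equals the third value, so the remaining category is taken here to contribute zero. The function name is illustrative.

```python
def tils_contribution(tils_value: int, gamma2: float, gamma3: float) -> float:
    """Weighted TILs term using indicator coding of the second and third values."""
    if tils_value not in (1, 2, 3):
        raise ValueError("tils_value must be 1, 2 or 3")
    s_tils2 = 1.0 if tils_value == 2 else 0.0  # indicator for the second value
    s_tils3 = 1.0 if tils_value == 3 else 0.0  # indicator for the third value
    return gamma2 * s_tils2 + gamma3 * s_tils3
```

Only one of the two indicators can be non-zero for a given patient, so the sum always reduces to a single weight or to zero.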
[21] The four biomarkers may be SURVIVIN, p16, high risk HPV and tumour infiltrating lymphocytes (TILs). When only these four biomarkers are selected, the method may include calculating the riskscore which is an overall survival riskscore, whereby the method is for predicting a level of risk for the patient in terms of their overall survival outcome. A high risk patient may thus be at high risk of not surviving within a time frame of interest, e.g. five years. By contrast, a low risk patient will have a better chance of surviving than a high risk patient within the same time frame. The method may further comprise determining a treatment. When the patient is classified as high risk, the treatment may be surgical. When the patient is not classified as high risk, e.g. is classified as low risk, the treatment may be a surgical or a non-surgical treatment. As explained below, such a method is particularly useful because, before the work by the inventors, no predictive classifiers had been validated to select specific curative treatment regimens for individual patients with head and neck cancer.
[22] Thus, we also describe a method for assessing a patient with oropharyngeal squamous cell carcinoma for an appropriate treatment, the method comprising: providing a tissue sample from the patient; determining the presence of at least four biomarkers in said sample; for each of said at least four biomarkers, determining a score indicative of a level of the biomarker present within the tissue sample; calculating a riskscore based on the determined scores, wherein the riskscore is calculated by summing weighted biomarker scores, wherein the biomarker scores are based on the determined scores and each biomarker score has an associated weight; comparing the riskscore to a threshold to predict whether the patient is high risk; and determining a treatment; wherein the four biomarkers are selected from the group consisting of SURVIVIN, PLK1, p16, high risk HPV and tumour infiltrating lymphocytes (TILs).
[23] The overall survival riskscore may be considered a primary riskscore and may for example be calculated by using a weighted sum of the scores for each biomarker, i.e.
$$\text{primary riskscore} = \alpha S_{p16} + \beta S_{HPV} + \gamma S_{TILs} + \delta S_{sur}$$
where $\alpha$ is the weighting applied to $S_{p16}$ which is the biomarker score for the biomarker p16, $\beta$ is the weighting applied to $S_{HPV}$ which is the biomarker score for the biomarker HPV, $\gamma$ is the weighting applied to $S_{TILs}$ which is the biomarker score for the biomarker TILs and $\delta$ is the weighting applied to $S_{sur}$ which is the biomarker score for the biomarker SURVIVIN.
[24] The values of the weightings may be determined using statistical analysis, for example using the Cox proportional hazards model above. The weighting $\delta$ (applied to SURVIVIN) may have a value ranging between 0.2 and 0.9, more preferably between 0.35 and 0.5 and as an example 0.2071518. The weighting $\alpha$ (applied to p16) may have a value ranging between -2.2 and -0.4, more preferably between -1.4 and -0.4 and as an example -0.9652818. The weighting $\beta$ (applied to HPV) may have a value ranging between -3 and -0.5, more preferably between -1.85 and -1.05 and as an example -1.3976193. The weighting $\gamma_2$ (applied to the moderate TILs component) may have a value ranging between -1.55 and -0.05, more preferably between -0.95 and -0.55 and as an example -0.6318398. The weighting $\gamma_3$ (applied to the high TILs component) may have a value ranging between -2.7 and 0.05, more preferably between -1.7 and -1.05 and as an example -1.2751045. The weightings applied to p16, HPV and TILs are preferably negative because, as described below, the presence of these biomarkers is indicative of a low risk patient, i.e. indicative of a positive survival rate. By contrast, the weighting applied to SURVIVIN is preferably positive to reflect that the presence of this biomarker is indicative of a high risk patient. All references to ranges herein are inclusive of the endpoints.
[25] Alternatively, the four biomarkers may be SURVIVIN, p16, PLK1 and tumour infiltrating lymphocytes (TILs). When these four biomarkers are used, calculating the riskscore comprises calculating a recurrence free survival riskscore, whereby the method is for predicting a recurrence free survival outcome. In this model, a high risk patient may thus be at high risk of not surviving without any recurrence of the cancer within a time frame of interest, which may be the same as the time frame used above. By contrast, a low risk patient will have a better chance of surviving without a recurrence of the illness than a high risk patient. The recurrence free survival riskscore may be termed a secondary riskscore and can be calculated using the following model which uses four biomarkers, three of which overlap with the previous model and one of which is new:
$$\text{secondary riskscore} = \alpha S_{p16} + \gamma S_{TILs} + \delta S_{sur} + \varepsilon S_{PLK1}$$
[26] where $\alpha$ is the weighting applied to $S_{p16}$ which is the biomarker score for the biomarker p16, $\gamma$ is the weighting applied to $S_{TILs}$ which is the biomarker score for the biomarker TILs, $\delta$ is the weighting applied to $S_{sur}$ which is the biomarker score for the biomarker SURVIVIN and $\varepsilon$ is the weighting applied to $S_{PLK1}$ which is the biomarker score for the new biomarker PLK1. The ranges of the weightings given above may also apply to this model. The specific values for each weighting may be $\alpha$ = -1.4105715, $\delta$ = 0.2835425, $\varepsilon$ = -0.2540380, $\gamma_2$ = -0.4883246, $\gamma_3$ = -1.1510593. As before, the weightings applied to p16 and TILs are preferably negative. This is also true for PLK1. By contrast, the weighting applied to SURVIVIN is preferably positive.
[27] Said high risk HPV may be selected from one or more of HPV-16, -18, -31, -33, -35, -39, -45, -51, -52, -56, -58, -66. The presence of said high risk HPV may be determined using High risk HPV in-situ hybridization (HR-HPV-ISH). The presence of p16, SURVIVIN and/or PLK1 may be determined using immunohistochemistry. p16 may also be known as CDKN2A.
[28] We also describe use of SURVIVIN, PLK1, p16, high risk HPV and tumour infiltrating lymphocytes (TILs) as biomarkers in a method for predicting a survival outcome for a patient with oropharyngeal squamous cell carcinoma, the method comprising: providing a tissue sample from the patient; determining the presence of at least SURVIVIN, PLK1, p16, high risk HPV and tumour infiltrating lymphocytes (TILs) in said sample; for each of said at least four biomarkers, determining a score indicative of a level of the biomarker present within the tissue sample; calculating a riskscore based on the determined scores, wherein the riskscore is calculated by summing weighted biomarker scores, wherein the biomarker scores are based on the determined scores and each biomarker score has an associated weight; and comparing the riskscore to a threshold to predict whether the patient is high risk.
[29] We also describe use of SURVIVIN, PLK1, p16, high risk HPV and tumour infiltrating lymphocytes (TILs) as biomarkers in a method for determining a treatment for a patient with oropharyngeal squamous cell carcinoma, the method comprising: providing a tissue sample from the patient; determining the presence of at least SURVIVIN, PLK1, p16, high risk HPV and tumour infiltrating lymphocytes (TILs) in said sample; for each of said at least four biomarkers, determining a score indicative of a level of the biomarker present within the tissue sample; calculating a riskscore based on the determined scores, wherein the riskscore is calculated by summing weighted biomarker scores, wherein the biomarker scores are based on the determined scores and each biomarker score has an associated weight; and comparing the riskscore to a threshold to predict whether the patient is high risk and determine the treatment.
[30] There may also be a computer device comprising at least one processor; and instructions that, when executed by the at least one processor, cause the computer device to perform any of the determining, calculating and comparing steps of the methods described above. There may also be a tangible non-transient computer-readable storage medium having recorded thereon instructions which, when implemented by a computer device, cause the computer device to be arranged as described above and/or which cause the computer device to perform any of the relevant steps of the methods as described above. There may also be a kit comprising a microarray for the tissue sample and/or one or more reagents to determine the presence of at least one biomarker selected from SURVIVIN, PLK1, p16, high risk HPV and TILs and optionally the computer device. For example, the kit comprises reagents to determine the presence of all of SURVIVIN, PLK1, p16, high risk HPV and TILs. Suitable reagents are known to the skilled person and include reagents for staining tissue to assess the level of the biomarker as described herein. The kit may be used for predicting a level of risk for a patient with oropharyngeal squamous cell carcinoma and/or determining a treatment for a patient with oropharyngeal squamous cell carcinoma.
[31] Although a few preferred embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that various changes and modifications might be made without departing from the scope of the invention, as defined in the appended claims.
BRIEF DESCRIPTION OF DRAWINGS
[32] For a better understanding of the invention, and to show how embodiments of the same may be carried into effect, reference will now be made, by way of example only, to the accompanying diagrammatic drawings in which:
[33] Figure 1a is a flowchart showing the steps of the prediction method;
[34] Figure 1b is a schematic box diagram showing the components of the system for implementing the prediction method;
[35] Figures 2a to 2l are photomicrographs showing examples of stained tumours;
[36] Figure 3 outlines the nature of the data which was used to develop the models used in the prediction method of Figure 1a;
[01] Figure 4 is a flowchart showing the steps of the training method;
[37] Figure 5 is a chart showing the H-score allocated to various biomarkers in a cohort of test samples;
[38] Figure 6 is a box plot of the coefficients fitted to each biomarker when training the model;
[39] Figures 7a and 7b are calibration plots for the overall survival model using the training cohort data and the validation data respectively;
[40] Figures 7c and 7d are calibration plots for the recurrence free survival model using the training cohort data and the validation data respectively;
[41] Figures 8a and 8b are calibration plots for the overall survival model for the combined biomarkers and clinical factors using the training cohort data and the validation data respectively;
[42] Figures 9a to 9c show the probability of survival for low and high risk patients over a period of 5 years using the overall survival models having clinical factors only, biomarkers only and combined factors respectively where the data is from the validation cohort;
[43] Figure 9d shows the probability of survival for low and high risk patients over a period of 5 years using the overall survival model having biomarkers only where the data is from the training cohort;
[44] Figures 10a to 10c show the probability of survival for low, intermediate and high risk patients over a period of 5 years using the overall survival models having clinical factors only, biomarkers only and combined factors respectively where the data is from the validation cohort;
[45] Figures 11a to 11c show the probability of survival for the patients shown in Figures 9a to 9c separated into patients who did and did not have surgery where the data is from the validation cohort;
[46] Figure 11d shows the probability of survival for low and high risk patients over a period of 5 years using the overall survival model having biomarkers only where the data is from the training cohort and is separated into those who have or have not had surgery;
[47] Figures 12a and 12b show the probability of survival for low and high risk patients over a period of 5 years using the recurrence free survival models having biomarkers only and combined factors respectively where the data is from the validation cohort;
[48] Figures 12c and 12d show the probability of survival for low, intermediate and high risk patients over a period of 5 years using the recurrence free survival models having biomarkers only and combined factors respectively where the data is from the validation cohort;
[49] Figures 12e and 12f show the probability of survival for the patients shown in Figures 12a and 12b separated into patients who did and did not have surgery where the data is from the validation cohort;
[50] Figures 13a and 13b show the probability of survival for low and high risk patients over a period of 5 years using the recurrence free survival models having clinical markers only where the data is from the validation cohort; and
[51] Figures 13c and 13d show the probability of survival for the patients shown in Figures 13a and 13b separated into patients who did and did not have surgery where the data is from the validation cohort.
DESCRIPTION OF EMBODIMENTS
Prediction Methods
[52] Figure 1a shows the steps which may be carried out in a prediction method. The first step is to provide a tissue sample (step S100). The tissue sample may be obtained using any appropriate method, e.g. from a donor sample obtained using a biopsy. A tissue sample may thus be processed and fixed.
[53] The tissue sample may then be stained with appropriate reagents (step S102), for example haematoxylin and eosin for assessing the level of tumour infiltrating lymphocytes (TILs). Figures 2a to 2l are photomicrographs showing examples of stained tumours for each of the four biomarkers (p16, HPV, SURVIVIN and TILs) and PLK-1. More information on the genes p16, SURVIVIN and PLK-1 is shown in the table below. HPV types are further described below.
[54] The staining is used to determine the presence of the biomarker and to assist in the next step which is to determine the score for each biomarker (step S104). In this example, the scoring was done by pathologists who had attended a training and calibration meeting and had undergone certification by scoring three hundred test samples. Preferably more than one, e.g. at least three pathologists, independently determine the score. It will be appreciated that other methods of scoring, e.g. automated methods, could also be used.
[55] Figure 2a shows a tissue sample with a tumour showing strong and diffuse nuclear and cytoplasmic staining for p16. Figure 2b shows a tissue sample with a tumour showing weak and diffuse cytoplasmic staining for p16. Figure 2c shows a tissue sample with a tumour with no staining for p16. In this example, the scoring for p16 is binary and thus either a positive or negative score is determined. A positive score is awarded if the tissue sample shows strong and diffuse nuclear and cytoplasmic staining present in >70% of the tumour. This threshold was selected based on a clinically validated cut off described in "Validation of methods for oropharyngeal cancer HPV status determination in US cooperative group trials" by Jordan et al published in Am J Surg Pathol. 2012;36(7):945-54. As outlined in "Human papillomavirus and survival of patients with oropharyngeal cancer" by Ang et al published in N Engl J Med 2010;363:24-35, this threshold can be equated with an H-score defined by:
H-score > 2 x intensity x 70%
H-score > 140%
The sample in Figure 2a is thus awarded a positive score and the samples shown in Figures 2b and 2c are awarded a negative score. The scoring for p16 may thus be as summarised in the table below:
In a similar manner, High risk HPV in-situ hybridization (HR-HPV-ISH) may also be scored in a binary fashion, i.e. positive or negative. For example, High risk HPV DNA in situ hybridization may be stained using a Ventana INFORMVIII Family 16 probe to detect HPV-16, -18, -31, -33, -35, -39, -45, -51, -52, -56, -58, -66. Figure 2d shows a tissue sample with a High-risk HPV tumour showing diffuse nuclear and cytoplasmic staining. Figure 2e shows a tissue sample with a High-risk HPV tumour showing punctate nuclear staining. Figure 2f shows a tissue sample with a High-risk HPV tumour with no staining. The samples in Figures 2d and 2e are thus awarded a positive score and the sample shown in Figure 2f is awarded a negative score. The scoring for HPV-ISH may thus be as summarised in the table below:
[56] By contrast, the SURVIVIN immunohistochemistry was assessed by assigning an intensity score selected from 0 for no staining, 1 for weak staining, 2 for moderate staining and 3 for strong staining, together with a percentage of malignant cells stained at each intensity. The percentages may be rounded up or down to the nearest 5% to simplify the calculations. The H-score may then be calculated by using a weighted sum of the percentages, e.g.:
H-score= 0 x a% + 1 x b% + 2 x c% + 3 x d%
where a% is the percentage of cells with no staining; b% is the percentage of cells with weak staining; c% is the percentage of cells with moderate staining; and d% is the percentage of cells with strong staining. The value of the H-score will thus be between 0 and 300. Such a method is described in more detail in "Validation of methods for oropharyngeal cancer HPV status determination in US cooperative group trials" by Jordan et al published in Am J Surg Pathol. 2012;36(7):945-54. As examples, Figures 2g, 2h and 2i show tissue samples with high, medium and low H scores respectively for the SURVIVIN immunohistochemistry. The scoring for SURVIVIN may thus be as summarised in the table below:
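The H-score calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `h_score` and the dictionary input format are assumptions made for the example and are not part of the described method.

```python
def h_score(pct_by_intensity):
    """Weighted sum of the percentages of malignant cells at each intensity.

    pct_by_intensity maps an intensity (0 none, 1 weak, 2 moderate, 3 strong)
    to the percentage of malignant cells stained at that intensity.
    The input format is an assumption made for this sketch.
    """
    if set(pct_by_intensity) - {0, 1, 2, 3}:
        raise ValueError("intensity must be 0, 1, 2 or 3")
    if abs(sum(pct_by_intensity.values()) - 100) > 1e-9:
        raise ValueError("percentages must sum to 100")
    # H-score = 0*a% + 1*b% + 2*c% + 3*d%, so it lies between 0 and 300
    return sum(intensity * pct for intensity, pct in pct_by_intensity.items())


# Hypothetical sample: 20% unstained, 30% weak, 30% moderate, 20% strong
example = h_score({0: 20, 1: 30, 2: 30, 3: 20})
```

The bounds follow directly from the weights: a sample with no staining scores 0 and a sample with 100% strong staining scores 300.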
[57] Once the H score has been calculated for the SURVIVIN immunohistochemistry, this may be used to calculate a Z-score. The first step is to scale the H-score so that the value is between 0 and 10. For the SURVIVIN immunohistochemistry this is achieved by dividing the H-score by 30 to generate an X-score. The next transformation allows the input score to be comparable to scores arising from different clinical cohorts and distributions. This is achieved by calculating an X-score for each sample in the cohort of samples. For example, in a cohort of 356 samples, it was found that the H-score ranged from 0 to 192.5 as shown in the upper part of Figure 2m. Thus the X-score ranged from 0 to 6.42. The mean X-score m and the standard deviation s for the X-scores for the cohort, together with the new value, are then calculated. The Z-score for each sample i is calculated as:
$$\text{zscore}_i = \frac{\text{xscore}_i - m}{s}$$
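The scaling and standardisation steps above might be sketched as follows. This is a minimal illustration, not the inventors' implementation: the function names are hypothetical, the cohort values are invented, and the use of the sample standard deviation (`statistics.stdev`) is an assumption, since the text does not say whether the sample or population form is used.

```python
from statistics import mean, stdev


def x_score(h):
    """Scale an H-score (0 to 300) to an X-score (0 to 10) by dividing by 30."""
    return h / 30.0


def z_score(new_h, cohort_h):
    """Z-score of a new sample against a reference cohort of H-scores.

    The new sample is added to the cohort before the mean m and the
    standard deviation s are computed, as described in the text.
    Assumption: s is the sample standard deviation.
    """
    xs = [x_score(h) for h in cohort_h] + [x_score(new_h)]
    m = mean(xs)
    s = stdev(xs)
    return (x_score(new_h) - m) / s


# Hypothetical reference cohort of H-scores and one new sample
z = z_score(180, [0, 60, 120])
```

Because the mean and standard deviation include the new sample, the Z-score of a given H-score depends on the reference cohort it is compared against.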
[58] The method of calculating an H-intensity score and conversion to a Z-score may also be followed for an additional biomarker PLK-1 where this biomarker is incorporated, e.g. as explained in more detail below. For example, in a cohort of 335 samples, it was found that the H-score ranged from 0 to 165 as illustrated in the lower portion of Figure 2m. Thus the X-score ranged from 0 to 5.5. The mean X-score m and the standard deviation s for the X-scores for the cohort, including the latest value, are then calculated so that the Z-score can be calculated as above.
[59] The score for tumour infiltrating lymphocytes (TILs) was determined for example as described in "Tumour-infiltrating lymphocytes predict for outcome in HPV-positive oropharyngeal cancer" by Ward et al published in Br J Cancer. 2014;110(2):489-500. The scanning magnification was increased (e.g. to 2.5 times objective) and the samples were assigned to one of the following three categories: high, moderate or low as shown in the table below:
As examples, Figures 2j, 2k and 2l show tissue samples with high, moderate and low scores for tumour infiltrating lymphocytes (TILs).
[60] As set out above, trained pathologists (or automated assessment) may be used to determine the score of each antibody. Although the scoring may be considered to be subjective, in fact research shows that the scoring is consistent, including for the antibodies such as PLK1 and Survivin which are described by continuous scoring. For example, the table below shows that there is a high correlation between pathologists on the antibodies and stains used. Although the individual scores by the different pathologists may vary, the variation is not high:
[61] Returning to Figure 1a, the next step in the method is to determine the riskscore using the determined scores for each of the biomarkers (step S106). The riskscore is determined by inputting the determined scores into one of two models which have been developed as explained in more detail below. The first model determines a primary riskscore which may be termed an overall survival (OS) riskscore and which may be calculated as the time from diagnosis until death or censored at the last clinical contact. The second model determines a secondary riskscore which may be termed a recurrence free survival riskscore and which may be calculated as the time from diagnosis until the date when loco-regional recurrence or distant metastasis was first confirmed by radiology and/or biopsy. For consistency, survival time was truncated, for example at 5 years.
[62] The primary riskscore may for example be calculated by using a weighted sum of the scores for each biomarker, i.e.
$$\text{primary riskscore} = \alpha S_{p16} + \beta S_{HPV} + \gamma S_{TILs} + \delta S_{sur}$$
where $\alpha$ is the weighting applied to $S_{p16}$ which is the adjusted score for the biomarker p16, $\beta$ is the weighting applied to $S_{HPV}$ which is the adjusted score for the biomarker HPV, $\gamma$ is the weighting applied to $S_{TILs}$ which is the adjusted score for the biomarker TILs and $\delta$ is the weighting applied to $S_{sur}$ which is the adjusted score for the biomarker SURVIVIN.
[63] Survivin is treated as a continuous variable. Before using the determined score in the calculation of the riskscore, the final value of the determined score for the SURVIVIN immunohistochemistry is adjusted by subtracting an adjustment factor, e.g. 0.1319303, before the weighting is applied.
[64] Each of the other biomarkers is treated as a factor using the lowest category as a baseline group, i.e. a baseline score is assigned to the baseline group and a second score is assigned to the other group. The magnitudes of the baseline and second scores sum to 1 when the Cox proportional hazards model is used as described in more detail below, and it is the baseline or second score which is used as the adjusted score in the calculation of the riskscore. For example for p16, there are two options: a positive score for strong evidence of p16 (simplified to p16=1) and a negative score for weak or no evidence of p16 (simplified to p16=0, i.e. the baseline value). Thus, p16 has a baseline score of -0.6603053 and a second score of 0.3396947. The biomarker HPV is treated in a similar manner to p16 and the values are shown below. The table below shows examples of determined scores, scores adjusted for the riskscore calculation, weightings for each biomarker score and the overall contribution to the riskscore for each biomarker:
[65] For TILs, there are three determined scores - high, moderate and low - as explained above. When using the Cox proportional hazards model as described below, it is necessary to include TILs twice in the model - a first time where the baseline group is for a score which is not moderate, i.e. not TILs = 2 and a second time where the baseline group is for a score which is not high, i.e. not TILs = 3. In other words, the weighting of the score for TILs can be separated into two components:
$$\gamma S_{TILs} = \gamma_2 S_{TILs2} + \gamma_3 S_{TILs3}$$
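One way to read the two-component treatment of TILs is as a pair of indicator variables (moderate, high) that are each shifted by a constant. The sketch below illustrates that reading. The two offsets are taken from the example values given in this description for the overall survival model; interpreting them as centring constants, and the function name `tils_adjusted`, are assumptions made for illustration.

```python
# Offsets taken from the example adjusted scores in this description.
# Assumption: they act as centring constants for the two indicators.
TILS2_OFFSET = 0.4885496  # applied to the "moderate" (TILs = 2) indicator
TILS3_OFFSET = 0.3206107  # applied to the "high" (TILs = 3) indicator


def tils_adjusted(tils):
    """Return the pair (S_TILs2, S_TILs3) for a TILs category.

    tils is 1 (low), 2 (moderate) or 3 (high).
    """
    if tils not in (1, 2, 3):
        raise ValueError("TILs category must be 1, 2 or 3")
    s_tils2 = (1.0 if tils == 2 else 0.0) - TILS2_OFFSET
    s_tils3 = (1.0 if tils == 3 else 0.0) - TILS3_OFFSET
    return s_tils2, s_tils3
```

Under this reading, a low TILs score takes the baseline value in both components, a moderate score is raised only in the first component, and a high score is raised only in the second.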
[66] The calculation of the riskscore can thus be expressed as:
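$$
\begin{aligned}
\text{riskscore} ={}& -0.9652818 \times \begin{cases} -0.6603053, & p16 = 0 \\ \phantom{-}0.3396947, & p16 = 1 \end{cases} \\
&+ 0.2071518 \times (\text{Survivin z-score} - 0.1319303) \\
&+ (-1.3976193) \times \begin{cases} -0.3435115, & \text{HPVISHHR} = 0 \\ \phantom{-}0.6564885, & \text{HPVISHHR} = 1 \end{cases} \\
&+ (-0.6318398) \times \begin{cases} -0.4885496, & \text{TILS} = 1 \\ \phantom{-}0.5114504, & \text{TILS} = 2 \\ -0.4885496, & \text{TILS} = 3 \end{cases} \\
&+ (-1.2751045) \times \begin{cases} -0.3206107, & \text{TILS} = 1 \\ -0.3206107, & \text{TILS} = 2 \\ \phantom{-}0.6793893, & \text{TILS} = 3 \end{cases}
\end{aligned}
$$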
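The expression above can be evaluated with a short lookup-based function, sketched below using the example weightings and adjusted scores given in this description. The function and variable names are hypothetical. The negative sign on the HPV baseline score is an assumption: it is inferred from the statement that a negative HPV score makes a positive contribution to the riskscore, and by analogy with p16.

```python
# Adjusted scores per category (example values from this description)
P16 = {0: -0.6603053, 1: 0.3396947}
HPV = {0: -0.3435115, 1: 0.6564885}  # sign of the baseline is an assumption
TILS2 = {1: -0.4885496, 2: 0.5114504, 3: -0.4885496}
TILS3 = {1: -0.3206107, 2: -0.3206107, 3: 0.6793893}

# Example weightings for the overall survival model
W_P16, W_HPV = -0.9652818, -1.3976193
W_TILS2, W_TILS3 = -0.6318398, -1.2751045
W_SUR, SUR_ADJUST = 0.2071518, 0.1319303


def primary_riskscore(p16, hpv, tils, survivin_z):
    """Weighted sum of adjusted biomarker scores for the first model.

    p16 and hpv are 0 (negative) or 1 (positive); tils is 1, 2 or 3;
    survivin_z is the Survivin Z-score before the adjustment factor.
    """
    return (
        W_P16 * P16[p16]
        + W_SUR * (survivin_z - SUR_ADJUST)
        + W_HPV * HPV[hpv]
        + W_TILS2 * TILS2[tils]
        + W_TILS3 * TILS3[tils]
    )


# Hypothetical patients with the same Survivin Z-score
favourable = primary_riskscore(p16=1, hpv=1, tils=3, survivin_z=0.1319303)
unfavourable = primary_riskscore(p16=0, hpv=0, tils=1, survivin_z=0.1319303)
```

With these values, the p16-positive, HPV-positive, high-TILs profile yields a lower weighted sum than the p16-negative, HPV-negative, low-TILs profile, and a higher Survivin Z-score raises the sum in either case.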
[67] As explained in more detail below, the model was developed as a Cox proportional hazards model and thus the outcome is interpreted as a hazards ratio which represents an incremental increase in the hazard (e.g. death for the first model). Positive coefficients such as those for negative scores for p16 and High risk HPV in-situ hybridization and for low and moderate scores for TILs indicate that these scores are associated with a poor outcome (i.e. low overall survival riskscore). Similarly, negative coefficients such as those for positive scores for p16 and High risk HPV in-situ hybridization and for a high score for TILs indicate that these scores are associated with a good outcome (i.e. high overall survival riskscore). The model is validated in more detail as explained below but as an indication, the expression of the model is in line with the results in the table below which show the mean or median 3 and 5 year overall survival (OS) rates for the subjects with different combinations of biomarker signatures. The table also indicates the overall survival rates based on the nature of the treatment:
[68] As an alternative, or in addition to the primary riskscore, a secondary riskscore can be calculated using the following model which uses four biomarkers, three of which overlap with the previous model and one of which is new:
$$\text{secondary riskscore} = \alpha S_{p16} + \gamma S_{TILs} + \delta S_{sur} + \varepsilon S_{PLK1}$$
where $\alpha$ is the weighting applied to $S_{p16}$ which is the adjusted score for the biomarker p16, $\gamma$ is the weighting applied to $S_{TILs}$ which is the adjusted score for the biomarker TILs, $\delta$ is the weighting applied to $S_{sur}$ which is the adjusted score for the biomarker SURVIVIN and $\varepsilon$ is the weighting applied to $S_{PLK1}$ which is the adjusted score for the new biomarker PLK1.
[69] As in the previous model, SURVIVIN and PLK1 are treated as continuous variables using their Z-scores. In other words, for each biomarker a scaled H-score is calculated by dividing the H-score by 30 to generate an X-score. The new X-score for this sample is added to the X-scores for all samples in the reference cohort of samples. The mean X-score m and the standard deviation s for the X-scores for the cohort together with the new addition are then calculated. The Z-score for each sample i is calculated as:
$$\text{zscore}_i = \frac{\text{xscore}_i - m}{s}$$
[70] The final value of the determined score (i.e. the Z-score) for the SURVIVIN immunohistochemistry may be adjusted by subtracting an adjustment factor, e.g. 0.11409913, before the weighting is applied. Similarly, the final value of the determined score for PLK1 may be adjusted by subtracting an adjustment factor, e.g. 0.07201744, before summing using the weighting. The table below shows examples of determined scores, scores adjusted for the riskscore calculation, weightings for each biomarker score and the overall contribution to the riskscore for each biomarker:
[71] As in the first model, the weighting of the score for TILs can be separated into two components:
$$\gamma S_{TILs} = \gamma_2 S_{TILs2} + \gamma_3 S_{TILs3}$$
[72] The model can thus be expressed in detail as:
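$$
\begin{aligned}
\text{riskscore} ={}& -1.4105715 \times \begin{cases} -0.63793103, & p16 = 0 \\ \phantom{-}0.3620690, & p16 = 1 \end{cases} \\
&+ (-0.2540380) \times (\text{PLK1 z-score} - 0.07201744) \\
&+ 0.2835425 \times (\text{Survivin z-score} - 0.11409913) \\
&+ (-0.4883246) \times \begin{cases} -0.46120690, & \text{TILS} = 1 \\ \phantom{-}0.53879310, & \text{TILS} = 2 \\ -0.46120690, & \text{TILS} = 3 \end{cases} \\
&+ (-1.1510593) \times \begin{cases} -0.32327586, & \text{TILS} = 1 \\ -0.32327586, & \text{TILS} = 2 \\ \phantom{-}0.67672410, & \text{TILS} = 3 \end{cases}
\end{aligned}
$$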
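As a hedged illustration, the second model can be written as a list of (weighting, adjusted score) terms, which makes the contribution of each biomarker easy to inspect. The names below are hypothetical; the numbers are the example values given in this description.

```python
def secondary_terms(p16, tils, survivin_z, plk1_z):
    """Return the per-biomarker contributions for the second model.

    p16 is 0 or 1; tils is 1, 2 or 3; survivin_z and plk1_z are Z-scores
    before their adjustment factors are subtracted.
    """
    p16_adj = {0: -0.63793103, 1: 0.3620690}[p16]
    tils2_adj = {1: -0.46120690, 2: 0.53879310, 3: -0.46120690}[tils]
    tils3_adj = {1: -0.32327586, 2: -0.32327586, 3: 0.67672410}[tils]
    return {
        "p16": -1.4105715 * p16_adj,
        "PLK1": -0.2540380 * (plk1_z - 0.07201744),
        "SURVIVIN": 0.2835425 * (survivin_z - 0.11409913),
        "TILs (moderate)": -0.4883246 * tils2_adj,
        "TILs (high)": -1.1510593 * tils3_adj,
    }


def secondary_riskscore(p16, tils, survivin_z, plk1_z):
    """Sum of the weighted contributions."""
    return sum(secondary_terms(p16, tils, survivin_z, plk1_z).values())


# Hypothetical patient whose Z-scores equal the adjustment factors,
# so the two continuous terms contribute nothing
score = secondary_riskscore(1, 3, survivin_z=0.11409913, plk1_z=0.07201744)
```

Returning the individual terms rather than only the total is a design choice for this sketch; it mirrors the per-biomarker contribution column referred to in the text.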
[73] Returning to Figure 1a, once the riskscore has been determined, it is compared to a threshold (step S108). If the riskscore is equal to or above the threshold, there is deemed to be a high risk that the patient will not survive and thus the patient is classified as a high risk patient (step S110). Conversely, if the riskscore is below the threshold, there is deemed to be a low risk that the patient will not survive and thus the patient is classified as a low risk patient (step S112). The threshold may for example be the median riskscore of the data which was used to train the model, for example 0.952636 which has been determined as explained below.
[74] As an alternative to step S108, the riskscore may be compared to an upper and a lower threshold. If the riskscore is equal to or above the upper threshold, the patient is classified as a high risk patient. If the riskscore is below the lower threshold, the patient is classified as a low risk patient. If the riskscore is between the two thresholds, the patient is classified as an intermediate risk patient. The upper and lower thresholds may be determined as the tertiles of the riskscores determined from the training cohort as explained below.
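The two classification schemes above might be sketched as follows. The function names and the list of training riskscores are invented for illustration, and `statistics.quantiles` is used here as one possible way of obtaining tertile cut points; the text does not specify how the tertiles are computed.

```python
from statistics import median, quantiles


def classify_two_groups(riskscore, threshold):
    """High risk if the riskscore is equal to or above the threshold."""
    return "high" if riskscore >= threshold else "low"


def classify_three_groups(riskscore, lower, upper):
    """High at or above the upper threshold, low below the lower one."""
    if riskscore >= upper:
        return "high"
    if riskscore < lower:
        return "low"
    return "intermediate"


# Hypothetical training riskscores (not data from this description)
training_riskscores = [0.2, 0.5, 0.9, 1.1, 1.6, 2.4]
single_threshold = median(training_riskscores)
lower_cut, upper_cut = quantiles(training_riskscores, n=3)
```

A riskscore exactly equal to the lower threshold falls into the intermediate group in this sketch, since only scores below that threshold are classified as low risk.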
[75] Once the riskscore has been determined, this may optionally be used to decide on the most appropriate treatment. For example, for a high risk patient, surgery is recommended. Furthermore, the surgery should be combined with either postoperative radiotherapy or postoperative chemoradiotherapy depending on the post-operative pathological conditions. Such treatment results in a better overall survival rate than chemotherapy alone. By contrast, for a low risk patient, the treatment can be selected from either primary chemotherapy or the combined surgical approach specified above. Both options are equally effective in such cases.
[76] As set out above, the trained pathologists provide highly correlated scores. Nevertheless, the impact of variation in scoring may be reduced further using the method described above. For example for Survivin, the method of scaling the H-intensity score, conversion to a Z-score and correction by a scalar may help to reduce the risk of any discrepancies occurring based on a different score being awarded. For example, the scaled H scores, the Z scores and adjusted Z scores (subtracting an adjustment factor, e.g. 0.1319303) are shown below for two original differing values. It will be appreciated that the division by 30 and transformation into a Z score reduces the impact of the different classifications:
[77] Furthermore, the table below shows the impact of the different scores in the overall riskscore. The other scores are kept the same as p16=1 (i.e. positive), HPV-ISH=1 (i.e. positive) and TILs=3 (i.e. high). The riskscores calculated using the weights above (which have been derived from training data) are shown below:
The threshold which is also determined from the training data defines a low risk patient as one having a riskscore below 0.952636 and thus in both cases, the classification is as a high risk patient. The riskscore calculation thus minimises the effect of an incorrect classification by a pathologist of one of the biomarkers.
[78] A schematic of an associated system for performing the method is shown in Figure 1b. The system comprises a computing device 10 which could be a handheld device which is portable for a clinician to transport from patient to patient and an app could be loaded onto the device for calculation of the riskscore. The computing device 10 comprises the standard components such as a processing unit or processor 20, a user interface unit 22 for allowing a user to input information, e.g. determined scores, and a memory 24 for storing the code to perform the calculation and/or the threshold for comparing the calculated riskscore. The user interface may display information or alternatively, there may be a display 24 for displaying information to a user, e.g. a calculated riskscore and/or a suggestion for treatment as described above and a communications module 28 for communicating with other devices and/or accessing the cloud 40, e.g. to process the riskscore. The tissue sample 30 is also shown schematically.
[79] This schematic system may be constructed, partially or wholly, using dedicated special-purpose hardware. Terms such as 'module' or 'unit' used herein may include, but are not limited to, a hardware device, such as circuitry in the form of discrete or integrated components, a Field Programmable Gate Array (FPGA) or Application Specific Integrated Circuit (ASIC), which performs certain tasks or provides the associated functionality. In some embodiments, the described elements may be configured to reside on a tangible, persistent, addressable storage medium and may be configured to execute on one or more processors. These functional elements may in some embodiments include, by way of example, components, such as software components, object-oriented software components, class components and task components, processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuitry, data, databases, data structures, tables, arrays, and variables. Although the example embodiments have been described with reference to the components discussed herein, such functional elements may be combined into fewer elements or separated into additional elements.
Training methods
[80] Figure 3 illustrates the nature of the data which was used to train and then validate the model above. In total, data for 985 subjects was used. The data includes information from a retrospective longitudinal cohort of consecutive patients 18 years of age or more, who had OPSCC treated with curative intent between 1 January 1999 and 31 December 2012, with either chemo/radiotherapy or surgery with or without radiotherapy/chemoradiotherapy. For the training cohort, 600 cases from nine secondary care head and neck cancer treatment centres in the UK and Poland were used. An independent cohort of 385 consecutive OPSCC patients undergoing curative treatment between 2002 and 2011 was used for validation. These 385 cases were collated as part of the HPV UK Prevalence study ("HPV-Related Oropharynx Cancer in the United Kingdom: An Evolution in the Understanding of Disease Etiology" by Schache et al published in Cancer Res. 2016 Nov 15;76(22):6598-6606. doi: 10.1158/0008-5472.CAN-16-0633. Epub 2016 Aug 28). These cases were collated from 3 different centres which were selected to ensure a mix of geographic locations, centre size and institutional treatment protocols. In other words, approximately 60% of the data was used to train the model and approximately 40% of the data was used to independently validate the model.
[81 ] It will be appreciated that a full set of data was not available for every patient included in the total of 985. For example, data on survival was available on 809 subjects, of whom 295 had died. Similarly, data was only available on recurrence for 756 subjects, of whom 207 cases had documented recurrence of their cancer. The median overall survival of the cohort was 8.8 years (95% confidence interval = 6.86-10.47) and median follow-up was 5.01 years (95% confidence interval = 4.69-5.18). 624 subjects were male and 225 were female, with a median age of 57 years (range 19-91 years). 501 subjects had T1 -T2 disease, and 373 had N2b-N3 disease. 439 received surgery+/- radiotherapy, 824 received radiotherapy: 440 received primary radical radiotherapy or chemoradiotherapy and 384 adjuvant radiotherapy. 474 received chemoradiotherapy, either as primary or adjuvant treatment.
[82] The characteristics of the training and validation cohorts are well matched as outlined above. However, there were differences in the proportion of treatments received in each set. The validation set comprised more cases receiving surgery (54.5% versus 45.1 %, p=0.029), fewer cases receiving radiotherapy (87% versus 94.5%, p<0.001) and fewer patients having chemotherapy (42.0% versus 67.5% p<0.001).
[83] The model was developed and validated as outlined in Figure 4. Both of the initial first and second models (i.e. to determine overall survival and recurrence free survival) included only biomarkers and in the first step S200 of Figure 4, the various biomarkers which could be included in the models are identified. A biomarker is a biological molecule or cell found in blood, other body fluids, or tissues that is a sign of a normal or abnormal process, or a condition or disease. The biomarkers which are considered are listed below together with an indication of the methods for creating the appropriate tissue samples (step S202) from which they can be scored. It is noted that TILs, although not listed in the table below, is also included as a biomarker as described above:
In the table above, IVD = In Vitro Diagnostic Device with CE marking, RUO = Research Use Only which refers to a reagent which was optimised for the study, RTU = Ready to Use (prediluted) and DAB = 3,3’-diaminobenzidine.
[84] The next step is to determine the score for each biomarker in each sample. When building the model, the biomarkers BCL2, COX2, Cyclin D1, EGFR-external, HIF1α, PLK1 and Survivin are treated as continuous variables and thus their score is determined as a Z-score as described above. The mean, median and distribution of the scores are summarised in the table below and the full distribution is shown in more detail in Figure 5.
Although the biomarker p16 is shown in Figure 5 and in the table above, like two other biomarkers - high risk HPV DNA ISH, and TILs - it is treated as a factor using the lowest category as a baseline group. The score for each of these factors is calculated as described above.
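By way of illustration only, the two kinds of score described above might be prepared as in the following sketch. The function names and the sample values are hypothetical, and the use of the population standard deviation is an assumption, since the text here does not state which estimator was used for the Z-score.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Centre and scale raw scores: z = (x - mean) / standard deviation.

    Assumption: the population standard deviation is used.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all scores are identical; the Z-score is undefined")
    return [(v - mu) / sigma for v in values]

def encode_factor(level, levels):
    """Dummy-code a categorical score.

    levels[0] is the baseline (lowest) category and maps to all zeros;
    every other category gets its own 0/1 indicator.
    """
    if level not in levels:
        raise ValueError(f"unknown category: {level!r}")
    return [1 if level == other else 0 for other in levels[1:]]

# Made-up intensity scores for one continuous biomarker
survivin_z = z_scores([2.0, 4.0, 6.0])

# A three-level factor (e.g. TILs) with category 1 as the baseline
tils_indicators = encode_factor(3, levels=[1, 2, 3])
```

Each indicator produced by `encode_factor` would then receive its own weight in the model, which is why a three-level factor contributes two coefficients.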
[85] Each of these variables has an unknown impact on the overall survival for each patient. The next step is to analyse the variables statistically to develop a model. For the subsequent analysis, the Cox model, which is expressed by the hazard function denoted by h(t), is used. The hazard function can be interpreted as the risk of dying at time t. It can be estimated as follows:
$h(t) = h_0(t) \times \exp(b_1 x_1 + b_2 x_2 + \dots + b_n x_n)$ where t represents the survival time, h(t) is the hazard function determined by a set of n covariates ($x_1, x_2, \dots, x_n$), the set ($b_1, b_2, \dots, b_n$) are weights (or coefficients) for each covariate and the term $h_0$ is called the baseline hazard, which corresponds to the value of the hazard if all the $x_i$ are equal to zero (the quantity exp(0) equals 1). The ‘t’ in h(t) reminds us that the hazard may vary over time. However, the time variance can be removed so that the model can be rewritten in linear form by taking the log of the hazard ratio for patient i to the reference group and this may be written as:
$\log\left(\frac{h_i(t)}{h_0(t)}\right) = b_1 x_{1i} + b_2 x_{2i} + \dots + b_n x_{ni}$
This linear equation is known as the Cox Proportional Hazards model with a set of n covariates ($x_{1i}, x_{2i}, \dots, x_{ni}$) for each patient i and a set ($b_1, b_2, \dots, b_n$) of weights which optimise the model for all patients.
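As a minimal sketch of the relationship just described, the following shows how the linear combination of weighted covariates relates to the hazard relative to the baseline. The function names and the numerical values are illustrative assumptions and are not taken from the fitted models.

```python
import math

def linear_predictor(weights, covariates):
    """Sum of weight * covariate, i.e. the log of the hazard ratio to baseline."""
    if len(weights) != len(covariates):
        raise ValueError("weights and covariates must have the same length")
    return sum(b * x for b, x in zip(weights, covariates))

def relative_hazard(weights, covariates):
    """Hazard of a patient divided by the baseline hazard h0(t).

    The time-dependent baseline cancels out, so the result does not depend on t.
    """
    return math.exp(linear_predictor(weights, covariates))

# Hypothetical weights and covariate values for one patient
example_weights = [0.5, -1.0]
example_covariates = [2.0, 1.0]
ratio = relative_hazard(example_weights, example_covariates)
```

When every covariate is zero the linear predictor is zero and the relative hazard is exp(0) = 1, which matches the definition of the baseline hazard given above.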
[86] As an optional step (S206), univariate analysis (i.e. considering each variable in turn) was carried out using the model above. For each variable, the coefficient was calculated together with the lower and upper limits for the 95% confidence interval around the coefficient (CI95L and CI95U respectively). The P-value is a measure of the statistical significance of the variable and is calculated either using the Wald test or the log-rank test. The Q-value is an adjusted P-value using the Benjamini & Hochberg method. These results are shown below:
[87] The table shows statistically significant associations (P<0.05) between OS (the primary outcome) and the following biomarker factors: high risk HPV-DNA ISH, p16 = 1, BCL2, Cyclin D1, EGFR external and TILs of 2 or 3. It is also noted that the same biomarkers, except BCL2, are also significantly associated with the secondary outcome of recurrence free survival as shown in the table below:
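For illustration, the Benjamini & Hochberg adjustment mentioned above can be sketched as follows. This is a generic implementation of the published step-up procedure rather than the code used in the study, and the P-values in the example are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values (Q-values) in input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that each Q-value is the
    # minimum of p * m / rank over its own rank and all higher ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values

# Made-up univariate P-values for four variables
q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

The running minimum keeps the adjusted values monotone, so a variable with a smaller P-value never ends up with a larger Q-value than one with a larger P-value.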
[88] The next step was to build a multivariate model which combined a plurality of variables (step S208) using the Cox proportional hazards model above. The stability of the variables was then tested by performing repeated iterations of bootstrapping on the training cohort. On each iteration, the Cox proportional hazards model is fitted with backwards refinement on each subset to generate a new set of coefficients and possibly a new selection of variables (Step S210). The variable selection was guided by the Akaike Information Criterion which is a known estimator of the relative quality of statistical models. Once the model was deemed stable (at step S212), e.g. by performing sufficient iterations (1000 in this example), the model was output (Step S214).
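The stability check described above can be sketched as a bootstrap loop that records how often each variable is retained. In this sketch `select_variables` is a hypothetical callback standing in for the Cox fit with AIC-guided backwards refinement, which is not reproduced here; the function names and the data layout are likewise assumptions.

```python
import random
from collections import Counter

def aic(log_likelihood, n_parameters):
    """Akaike Information Criterion; the model with the lower value is preferred."""
    return 2 * n_parameters - 2 * log_likelihood

def bootstrap_inclusion(patients, select_variables, iterations=1000, seed=0):
    """Fraction of bootstrap iterations in which each variable is selected.

    patients         -- list of per-patient records (any structure)
    select_variables -- hypothetical callback: takes a resampled cohort and
                        returns the names of the variables kept by the model
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(iterations):
        # Resample with replacement to the same size as the training cohort
        resample = rng.choices(patients, k=len(patients))
        counts.update(set(select_variables(resample)))
    return {name: counts[name] / iterations for name in counts}
```

A variable with an inclusion rate close to 1 is selected in almost every resample, which is the sense in which the output model can be regarded as stable.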
[89] For each bootstrap iteration, coefficients of the resulting variables selected by the model are displayed in box plots alongside a percentage inclusion rate for the number of iterations so far. Figure 6 shows an example of such a box plot displaying the range of fitted coefficients including 25th percentile (Q1), median and 75th percentile (Q3). The upper whisker of the box plots indicates min(max(x), Q3 + 1.5 × IQR) and the lower whisker of the box plots indicates max(min(x), Q1 - 1.5 × IQR), where IQR = Q3 - Q1. Figure 6 confirms that the variables which have been selected in the output model for overall survival set out above are included in over 75% of all the iterations of the model. Multivariate modelling of the secondary outcome (i.e. recurrence free survival) showed that p16, PLK1, SURVIVIN and TILs were statistically significant but that high-risk HPV DNA ISH was not.
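The whisker definitions above can be checked with a short sketch such as the following. The quartile convention (the "inclusive" method of the Python standard library) is an assumption, as the text does not say how Q1 and Q3 were computed, and the sample values are made up.

```python
from statistics import quantiles

def box_plot_summary(x):
    """Quartiles and whisker positions for a list of fitted coefficients."""
    q1, median, q3 = quantiles(x, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Upper whisker: min(max(x), Q3 + 1.5 * IQR)
        "upper_whisker": min(max(x), q3 + 1.5 * iqr),
        # Lower whisker: max(min(x), Q1 - 1.5 * IQR)
        "lower_whisker": max(min(x), q1 - 1.5 * iqr),
    }

# Made-up coefficients with one extreme value
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 100])
```

With these definitions a whisker never extends beyond the most extreme observed value, and it is capped at 1.5 × IQR from the box when an extreme value is present.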
[90] The preferred values of the optimal weightings and adjusted scores for each of the biomarkers which have been determined based on the training data are shown in the equation above. However, it will be appreciated that a different set of data may have given a different set of values. Figure 6 gives an indication of suitable ranges for these coefficients (i.e. weightings):
[91] The final outputs for each model are shown below in the standard format for a Cox proportional hazards model where coefficient shows the fitted coefficients of the model, exp coeff is the exponent of the coefficients, i.e. the hazard ratio, se coeff is the standard error of the coefficients, z is the Wald statistic z-score, pr is the probability value and the statistical significance of each variable is highlighted with stars (three being the maximum):
Overall survival model:
Recurrence free survival model:
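To make the columns of this output format concrete, the following sketch derives the hazard ratio, Wald z-score and two-sided probability value from a coefficient and its standard error. The star cut-offs follow the convention commonly used in R summaries and are an assumption here; the input values are made up and are not the fitted results.

```python
import math

def cox_summary_row(coef, se):
    """Derive the reported columns from a fitted coefficient and its standard error."""
    z = coef / se  # Wald statistic
    # Two-sided P-value under the standard normal: 2 * (1 - Phi(|z|))
    p = math.erfc(abs(z) / math.sqrt(2))
    return {"coef": coef, "exp_coef": math.exp(coef), "se_coef": se, "z": z, "p": p}

def significance_stars(p):
    """Assumed convention: *** below 0.001, ** below 0.01, * below 0.05."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""

# Hypothetical coefficient and standard error
row = cox_summary_row(coef=-1.2, se=0.3)
stars = significance_stars(row["p"])
```

A negative coefficient gives a hazard ratio below 1, i.e. a lower hazard than the baseline group for that variable.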
[92] The next step (S214) is to validate the model and as explained above this was based on 385 independently collated cases. The validation repeated steps S100 to S106 as described in Figure 1a above to generate a riskscore based on each model for each patient in the validation cohort. As demonstrated below, the overall survival model is prognostic and also shows a trend towards prediction of improved survival following surgical treatment for both high-risk patients and low risk patients. The recurrence free survival model is demonstrated below to be prognostic but not predictive for any specific treatment.
[93] In one part of the validation process, calibration plots, using bootstrapping, for both the overall survival and recurrence free models using both the training and validation sets of patients were generated and are shown in Figures 7a to 7d respectively. Visual inspection confirmed that all the models appear to be reasonably calibrated following the x=y line. Hazard regression (hare) from R package polspline (v1.1.12) was used for the estimation of survival probabilities. Calibration analyses were performed using R package rms (v4.5-0). These calibration tests reflect agreement between the outcome predictions and the observed outcomes.
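As an illustration of how a riskscore might be generated for one patient, the sketch below uses the approximate recurrence free survival weights recited in claim 25. The function name is hypothetical, the continuous scores are assumed to be already scaled and adjusted, and coding TILs as categories 1, 2 and 3 with category 1 as the baseline is an assumption.

```python
# Approximate weights recited in claim 25 for the recurrence free survival model
RFS_WEIGHTS = {"p16": -1.41, "PLK1": -0.25, "SURVIVIN": 0.28}

# Assumed coding: TILs category 1 is the baseline; categories 2 and 3 take
# the first and second TILs weightings respectively.
TILS_WEIGHTS = {1: 0.0, 2: -0.49, 3: -1.15}

def rfs_riskscore(p16, plk1, survivin, tils):
    """Weighted sum of biomarker scores for one patient.

    p16      -- 0 or 1 (factor score)
    plk1     -- adjusted continuous score
    survivin -- adjusted continuous score
    tils     -- category 1, 2 or 3
    """
    if tils not in TILS_WEIGHTS:
        raise ValueError(f"unknown TILs category: {tils!r}")
    return (RFS_WEIGHTS["p16"] * p16
            + RFS_WEIGHTS["PLK1"] * plk1
            + RFS_WEIGHTS["SURVIVIN"] * survivin
            + TILS_WEIGHTS[tils])

# Hypothetical patient: p16 positive, TILs category 3
score = rfs_riskscore(p16=1, plk1=0.4, survivin=-0.2, tils=3)
```

The resulting value is then compared with a threshold derived from the training cohort to assign the patient to a risk group.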
[94] As a separate part of the validation process, the biomarker-only versions of the overall survival model and the recurrence free survival model were each compared with the same models developed using clinical factors only and using a combination of biomarkers and clinical factors. The clinical factors were all treated as factors using the lowest category as baseline group and were gathered where possible from patient records. The full set of clinical factors was identified as age, gender, T-category, N-category, smoking status which was classified as 0 = never smoked, 1 = previous smoker and 2 = current smoker, surgery, radiotherapy and chemotherapy. The T-category describes the primary tumour with category 0 meaning that there is no evidence of a primary tumour and categories 1 to 4 awarded depending on the size and amount of spread into nearby tissues. The N-category describes whether the cancer has spread into nearby lymph nodes with category 0 meaning that the nearby lymph nodes do not contain cancer and categories 1 to 3 awarded depending on the size, location and number of nearby lymph nodes which are affected. The following table sets out the information for the optional univariate analysis using the Cox proportional hazards model for overall survival as described above in step S206 of Figure 4:
In the table and subsequent analysis, T=2 is T category 3 and 4 (i.e. high risk) and T=1 is T category 1 and 2 (i.e. low risk). Similarly, N=2 is N category 2c to 3 (i.e. high risk) and N=1 is N category 0 to 2b (i.e. low risk). A similar univariate analysis can be conducted for the recurrence free survival and the results are output below:
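The grouping of the T and N categories into low and high risk levels described above can be sketched as a simple lookup. The function names are hypothetical, and the ordered list of N subcategories (0, 1, 2a, 2b, 2c, 3) is an assumption based on the ranges quoted in the text.

```python
# Assumed ordering of N subcategories, from least to most advanced
N_ORDER = ["0", "1", "2a", "2b", "2c", "3"]

def t_group(t_category):
    """T=1 for T categories 1 and 2 (low risk); T=2 for 3 and 4 (high risk)."""
    if t_category in (1, 2):
        return 1
    if t_category in (3, 4):
        return 2
    raise ValueError(f"T category not covered by the grouping: {t_category!r}")

def n_group(n_category):
    """N=1 for N categories 0 to 2b (low risk); N=2 for 2c to 3 (high risk)."""
    if n_category not in N_ORDER:
        raise ValueError(f"unknown N category: {n_category!r}")
    position = N_ORDER.index(n_category)
    return 1 if position <= N_ORDER.index("2b") else 2

# Hypothetical patient staged T3 N2b
groups = (t_group(3), n_group("2b"))
```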
[95] As described in relation to steps S208 to S214 of Figure 4, the model for overall survival with only clinical factors was created, refined and output using a multivariate Cox proportional hazards model with backwards refinement and variable selection guided by Akaike Information Criterion. The output model included the factors T2, N2, smoking status = 1, smoking status = 2 and age at diagnosis. This model was generated using TNM 7 staging. Coefficients for each of the factors for the overall survival model are set out below:
[96] The same model for overall survival with only clinical factors was created, refined and output using TNM 8 staging. It included the factors T2, N2, N1, smoking status = 1, smoking status = 2 and age at diagnosis, with the N category refined to include an indication of p16. Coefficients for each of the factors for the overall survival model are set out below:
[97] Corresponding recurrence free survival models using only clinical factors with TNM 7 and TNM 8 staging respectively were also created, refined and output.
[98] Combined clinical factors and biomarker models were also generated. As described in relation to steps S208 to S214 of Figure 4, the model for overall survival with combined clinical factors and biomarkers was created, refined and output using a multivariate Cox proportional hazards model with backwards refinement and variable selection guided by Akaike Information Criterion. The output model included p16, high risk HPV DNA in-situ hybridisation, BCL2, HIF1 alpha, Survivin, TILs, T category, N category and Smoking status. Coefficients for each of the factors for the overall survival model are set out below:
[99] The information for the recurrence free survival model is set out below and as shown this does not include HPV, BCL2 or HIF1α but includes PLK1:
[100] Another stage of the validation of the models is to see which of the three types of models performs best. However, before the different types of the models are compared, the models could be validated by using calibration tests as described above in relation to Figures 7a to 7d. It will be appreciated that these calibration tests can be performed for all models but merely as an example, the calibration plots for the overall survival model and the recurrence free survival model having the combined clinical and biomarker factors are shown in Figures 8a and 8b. Visual inspection confirms that the overall survival model is reasonably calibrated but the recurrence free survival model is not. Accordingly, this suggests that the latter model is not suitable for use.
[101] Figures 9a to 9c compare the performance of each of the overall survival models created using the clinical markers only, the biomarkers only and a combination of clinical and biomarkers. As shown in each of Figures 9a to 9c, the cohort is divided into two risk groups - low and high - by comparing the riskscore calculated by the model to a threshold equal to the median riskscore calculated for the training cohort. The high risk group contains patients having a calculated riskscore above or equal to the threshold. The low risk group contains patients having a calculated riskscore below the threshold. Figures 9a to 9c also show the Hazard Ratio (HR) for the high-risk group which uses the low-risk group as the baseline. The lower and upper limits to the 95% confidence interval for the HR are shown in brackets. The value P is calculated using the Wald test. For each model the numbers of patients in each group at the beginning of each year are shown below each graph and reproduced in the table below:
[102] Each model successfully stratifies the two groups and is thus prognostic. However, the prognostic performance of the models using the biomarkers only and the combined factors is better than the prognostic performance of the model using clinical factors only. As explained above, the performance of the model on the validation cohort can also be compared with the performance of the model on the training cohort. Figure 9d thus shows the results of the predicted overall survival model using biomarkers only using the data from the patients in the training cohort. The model thus also stratifies the high and low risk groups for the patients in the training cohort. It will be appreciated that this comparison can also be done for the other models.
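The two-group split used for Figures 9a to 9c can be sketched as follows: the threshold is the median riskscore of the training cohort and a patient at or above it is placed in the high risk group. The function names and the riskscores are made up for illustration.

```python
from statistics import median

def median_threshold(training_riskscores):
    """Threshold equal to the median riskscore of the training cohort."""
    return median(training_riskscores)

def risk_group(riskscore, threshold):
    """High risk if the riskscore is above or equal to the threshold, else low."""
    return "high" if riskscore >= threshold else "low"

# Made-up training riskscores and two validation patients
threshold = median_threshold([-2.0, -1.0, 0.0, 1.0])
validation_groups = [risk_group(s, threshold) for s in (-0.5, -0.6)]
```

Because the threshold is fixed from the training cohort, the validation cohort need not split into two groups of equal size.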
[103] Figures 10a to 10c also compare the performance of each of the overall survival models created using the clinical markers only, the biomarkers only and a combination of clinical and biomarkers. In this comparison, the patients are divided into three risk groups - high risk, intermediate risk and low risk - by comparing the riskscore calculated by the model to thresholds calculated for the training cohort. The high risk group contains patients having a calculated riskscore above or equal to a first threshold which is the value of the riskscore at the lower end of the upper median of riskscores for the training group. The low risk group contains patients having a calculated riskscore below a second threshold which is the value of the riskscore at the upper end of the lower median of riskscores for the training group. The intermediate risk group contains patients having a calculated riskscore between the first and second thresholds.
[104] Figures 10a to 10c also show the Hazard Ratios (HR) for the intermediate and high-risk groups which use the low-risk group as the baseline. The lower and upper limits to the 95% confidence interval for the HR are shown in brackets. The value P is calculated using the Wald test and/or log-rank test. For each model the numbers of patients in each group at the beginning of each year are shown below each graph. As shown in the graphs, each model successfully stratifies the high, intermediate and low risk groups for the patients in the validation cohort and is thus prognostic for the overall survival model.
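A three-group split of this kind can be sketched as below. The two thresholds are taken as inputs because their exact derivation from the training riskscores is not reproduced here; the function name and the example values are hypothetical.

```python
def three_way_risk_group(riskscore, lower_threshold, upper_threshold):
    """Classify a riskscore as low, intermediate or high.

    high         -- riskscore at or above the upper threshold
    low          -- riskscore below the lower threshold
    intermediate -- everything in between
    """
    if lower_threshold > upper_threshold:
        raise ValueError("lower_threshold must not exceed upper_threshold")
    if riskscore >= upper_threshold:
        return "high"
    if riskscore < lower_threshold:
        return "low"
    return "intermediate"

# Made-up thresholds from a training cohort
groups = [three_way_risk_group(s, -1.0, 0.5) for s in (-1.5, -1.0, 0.5)]
```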
[105] Figures 11a to 11c also compare the performance of each of the overall survival models created using the clinical markers only, the biomarkers only and a combination of clinical and biomarkers. In this comparison, the patients are divided into four risk groups. First the patients are divided into high risk and low risk in the same manner as for Figures 9a to 9c. Each of the high risk and low risk groups is then divided into those who had surgery (S1) and those who did not have surgery (S2). As shown in Figure 11a, the clinical only model is not able to stratify the four groups and is thus not predictive for improved survival following surgery. By contrast, as shown in Figure 11b, the biomarker only model was able to clearly stratify the high risk group and show improved survival rates for those who underwent surgery. Thus the biomarker only model is predictive of improved survival following surgical treatments for high risk patients. The biomarker and clinical factors model is also weakly predictive of improved survival following surgical treatments for high risk patients.
[106] Figure 11d shows the performance of the overall survival model using the training data and this shows that the model is predictive of improved survival following surgical treatments for both high risk patients (HR=0.57, 95% Confidence Interval = 0.33-1.02, P=0.058) and low risk patients (HR=0.13, 95% Confidence Interval = 0.02-1.05, P=0.056).
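The four-way grouping by risk level and surgery can be sketched as follows. The record layout (keys `id`, `riskscore` and `had_surgery`) and the function name are assumptions made for the example; S1 and S2 follow the labels used in the text.

```python
from collections import defaultdict

def four_way_groups(patients, threshold):
    """Group patient ids by (risk level, surgery status).

    patients  -- iterable of dicts with hypothetical keys
                 "id", "riskscore" and "had_surgery"
    threshold -- median riskscore of the training cohort
    """
    groups = defaultdict(list)
    for patient in patients:
        risk = "high" if patient["riskscore"] >= threshold else "low"
        surgery = "S1" if patient["had_surgery"] else "S2"
        groups[(risk, surgery)].append(patient["id"])
    return dict(groups)

# Made-up cohort of three patients
cohort = [
    {"id": "A", "riskscore": 0.4, "had_surgery": True},
    {"id": "B", "riskscore": 0.4, "had_surgery": False},
    {"id": "C", "riskscore": -1.2, "had_surgery": True},
]
grouped = four_way_groups(cohort, threshold=0.0)
```

Survival curves would then be compared within each risk level between the S1 and S2 groups to assess whether the model is predictive of benefit from surgery.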
[107] Figures 12a and 12b compare the performance of each of the recurrence free survival models created using the biomarkers only and a combination of clinical and biomarkers. In line with the arrangement in each of Figures 9a to 9c, the cohort is divided into two risk groups - low and high. Figures 12c and 12d compare the performance of each of the recurrence free survival models created using the biomarkers only and a combination of clinical and biomarkers. In line with the arrangement in each of Figures 10a to 10c, the cohort is divided into three risk groups - low, intermediate and high. As shown, each of the models is prognostic of recurrence free survival by successfully stratifying each of the three groups.
[108] Figures 12e and 12f compare the performance of each of the recurrence free survival models created using the biomarkers only and a combination of clinical and biomarkers. In line with the arrangement in each of Figures 11 a to 11 c, the patients are divided into four sets of risk groups: high risk with and without surgery or low risk with and without surgery. The model using biomarkers only is weakly predictive of improved survival for the high risk group after surgery but is not predictive for the low risk group. The model using a combination of factors does not stratify either the high risk or the low risk group and is thus not predictive of improved survival following surgery.
[109] Two recurrence free survival models using clinical only factors were developed with TNM7 staging and TNM8 staging respectively. Figures 13a and 13b compare the performance of each of the recurrence free survival models and as described above the cohort is divided into two risk groups - low and high. As shown in Figure 13a, the recurrence free survival model using TNM7 staging did not stratify the two groups whereas, as shown in Figure 13b, the recurrence free survival model using TNM8 staging does stratify the two groups and is thus prognostic of survival rates.
[110] Figures 13c and 13d compare the performance of each of the recurrence free survival models using clinical only factors developed with TNM7 staging and TNM8 staging respectively. As described above, the low and high risk groups are divided into two groups depending on whether or not they had surgery. As shown neither recurrence free survival model stratified the four groups and thus neither model is predictive of improved survival rates after surgery.
[111] In summary, Figures 7a to 13d confirm that the biomarker only models described above outperform the model using clinical only factors as well as the model using a combination of both biomarker and clinical factors.
[112] Various combinations of optional features have been described herein, and it will be appreciated that described features may be combined in any suitable combination. In particular, the features of any one example embodiment may be combined with features of any other embodiment, as appropriate, except where such combinations are mutually exclusive. Throughout this specification, the term “comprising” or “comprises” means including the component(s) specified but not to the exclusion of the presence of others.
[113] Attention is directed to all papers and documents which are filed concurrently with or previous to this specification in connection with this application and which are open to public inspection with this specification, and the contents of all such papers and documents are incorporated herein by reference.
[114] All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and/or all of the steps of any method or process so disclosed, may be combined in any combination, except combinations where at least some of such features and/or steps are mutually exclusive.
[115] Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise. Thus, unless expressly stated otherwise, each feature disclosed is one example only of a generic series of equivalent or similar features.
[116] The invention is not restricted to the details of the foregoing embodiment(s). The invention extends to any novel one, or any novel combination, of the features disclosed in this specification (including any accompanying claims, abstract and drawings), or to any novel one, or any novel combination, of the steps of any method or process so disclosed.
Claims
1. A method for predicting a level of risk for a patient with oropharyngeal squamous cell carcinoma, the method comprising:
providing a tissue sample from the patient;
determining the presence of at least four biomarkers in said sample;
for each of the at least four biomarkers, determining a score indicative of a level of the biomarker within the tissue sample,
calculating a riskscore based on the determined scores, wherein the riskscore is calculated by summing weighted biomarker scores, wherein the biomarker scores are based on the determined scores and each biomarker score has an associated weight; and
comparing the riskscore to a threshold to predict whether the patient is high risk;
wherein the four biomarkers are selected from the group consisting of SURVIVIN, PLK1, p16, high risk HPV and tumour infiltrating lymphocytes (TILs).
2. The method of claim 1, wherein the associated weight for each of the biomarker scores for PLK1, p16, high risk HPV and tumour infiltrating lymphocytes (TILs) has a negative value and the associated weight for the biomarker score for SURVIVIN has a positive value.
3. The method of claim 1 or claim 2, wherein the weighted sum for the riskscore is:
$\text{riskscore} = b_1 x_{1i} + b_2 x_{2i} + \dots + b_n x_{ni}$
where $x_{1i}, x_{2i}, \dots, x_{ni}$ are the biomarker scores for the four selected biomarkers for each patient i and $b_1, b_2, \dots, b_n$ are a set of associated weights for each biomarker score.
4. The method of claim 3, further comprising determining the weights for the weighted sum using a Cox proportional hazard model which is trained using training data comprising information on a plurality of biomarkers in a set of patients.
5. The method of claim 4, further comprising identifying the plurality of biomarkers to be used in the Cox proportional hazard model, wherein the plurality of biomarkers are selected from the group comprising SURVIVIN, PLK1, p16, high risk HPV, tumour infiltrating lymphocytes (TILs), cyclin D1, EGFR external, BCL2, COX2 and HIF1 alpha.
6. The method of claim 4 or claim 5, wherein the threshold is the median riskscore for the training data.
7. The method of any one of the preceding claims, wherein determining a score indicative of a level of the biomarker comprises determining a scaled intensity score.
8. The method of claim 7, wherein the scaled intensity score is the determined score for at least one of SURVIVIN and PLK1.
9. The method of claim 7 or claim 8, wherein the biomarker score is based on the scaled intensity score which has been adjusted by subtracting an adjustment factor.
10. The method of any one of the preceding claims, wherein determining a score indicative of a level of the biomarker comprises awarding a first value when the level is above a threshold and a second value when the level is below the threshold.
11. The method of claim 10, wherein the first or second value is the determined score for at least one of p16 and high risk HPV.
12. The method of any one of the preceding claims, wherein determining a score indicative of a level of the biomarker comprises awarding a first value when the level is above an upper threshold, a second value when the level is below the upper threshold but above a lower threshold and a third value when the level is below the lower threshold.
13. The method of claim 12, wherein the first, second or third value is the determined score for TILs.
14. The method of claim 13, wherein the weighted biomarker score for the biomarker TILs comprises a sum of a first weighted biomarker score when TILs equals the second value and a second weighted biomarker score when TILs equals the third value.
15. The method of any one of the preceding claims, wherein the four biomarkers are SURVIVIN, p16, high risk HPV and tumour infiltrating lymphocytes (TILs).
16. The method of claim 15, wherein the weighting applied to SURVIVIN is in the range 0.2 to 0.9, more preferably between 0.35 to 0.5.
17. The method of claim 15 or claim 16, wherein the weighting applied to p16 is in the range -2.2 to -0.4, more preferably between -1.4 to -0.4.
18. The method according to any one of claims 15 to 17, wherein the weighting applied to high risk HPV has a value ranging between -3 to -0.5, more preferably between -1.85 to -1.05.
19. The method according to any one of claims 15 to 18, when dependent on claim 14, wherein the first weighting applied to TILs has a value ranging between -1.55 to -0.05, more preferably between -0.95 to -0.55 and the second weighting applied to TILs has a value ranging between -2.7 to 0.05, more preferably between -1.7 to -1.05.
20. The method of any one of claims 15 to 19, wherein calculating the riskscore comprises calculating an overall survival riskscore whereby the method is for predicting whether a patient has a high risk of not achieving an overall survival outcome.
21. The method of any one of claims 15 to 20, further comprising determining a surgical treatment.
22. The method of claim 21, wherein said treatment is a surgical treatment when the patient is classified as high risk.
23. The method of any one of claims 1 to 14, wherein the four biomarkers are SURVIVIN, p16, PLK1 and tumour infiltrating lymphocytes (TILs).
24. The method of claim 23, wherein calculating the riskscore comprises calculating a recurrent free survival riskscore whereby the method is for predicting whether a patient has a high risk of not achieving a recurrent free survival outcome.
25. The method of claim 23 or claim 24, wherein the weighting applied to the biomarker score for p16 has a value of approximately -1.41, the weighting applied to the biomarker score for PLK1 has a value of approximately -0.25, the weighting applied to the biomarker score for SURVIVIN has a value of approximately 0.28, the first weighting applied to TILs is approximately -0.49 and the second weighting applied to TILs has a value of approximately -1.15.
26. The method of any one of the preceding claims wherein said high risk HPV is selected from one or more of HPV-16, -18, -31, -33, -35, -39, -45, -51, -52, -56, -58, -66.
27. The method of claim 26 wherein the presence of said high risk HPV is determined using High risk HPV in-situ hybridization (HR-HPV-ISH).
28. The method of any one of the preceding claims wherein the presence of p16, SURVIVIN and/or PLK1 is determined using immunohistochemistry.
29. Use of SURVIVIN, p16, high risk HPV and tumour infiltrating lymphocytes (TILs) as biomarkers in a method for predicting a survival outcome for a patient with oropharyngeal squamous cell carcinoma, the method comprising:
providing a tissue sample from the patient;
determining the presence of at least SURVIVIN, p16, high risk HPV and tumour infiltrating lymphocytes (TILs) in said sample;
for each of said at least four biomarkers, determining a score indicative of a level of the biomarker present within the tissue sample,
calculating a riskscore based on the determined scores, wherein the riskscore is calculated by summing weighted biomarker scores, wherein the biomarker scores are based on the determined scores and each biomarker score has an associated weight; and
comparing the riskscore to a threshold to predict whether the patient is high risk.
30. Use of SURVIVIN, p16, high risk HPV and tumour infiltrating lymphocytes (TILs) as biomarkers in a method for determining a treatment for a patient with oropharyngeal squamous cell carcinoma, the method comprising:
providing a tissue sample from the patient;
determining the presence of at least SURVIVIN, p16, high risk HPV and tumour infiltrating lymphocytes (TILs) in said sample;
for each of said at least four biomarkers, determining a score indicative of a level of the biomarker present within the tissue sample,
calculating a riskscore based on the determined scores, wherein the riskscore is calculated by summing weighted biomarker scores, wherein the biomarker scores are based on the determined scores and each biomarker score has an associated weight; and
comparing the riskscore to a threshold to predict whether the patient is high risk and determining a treatment.
31. A kit comprising a microarray for a tissue sample and/or one or more reagents to determine the presence of at least one biomarker selected from SURVIVIN, PLK1, p16, high risk HPV.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| GBGB1808839.3A GB201808839D0 (en) | 2018-05-30 | 2018-05-30 | Method of predicting survival rates for cancer patients |
| GB1808839.3 | 2018-05-30 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2019229431A1 (en) | 2019-12-05 |
Family
ID=62812233
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/GB2019/051464 Ceased WO2019229431A1 (en) | 2018-05-30 | 2019-05-29 | Method of predicting survival rates for oropharyngeal cancer patients |
Country Status (2)
| Country | Link |
|---|---|
| GB (1) | GB201808839D0 (en) |
| WO (1) | WO2019229431A1 (en) |
Patent Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20080133141A1 (en) | 2005-12-22 | 2008-06-05 | Frost Stephen J | Weighted Scoring Methods and Use Thereof in Screening |
| WO2012094744A1 (en) * | 2011-01-11 | 2012-07-19 | University Health Network | Prognostic signature for oral squamous cell carcinoma |
| WO2013074793A1 (en) * | 2011-11-15 | 2013-05-23 | University Of Miami | Methods for detecting human papillomavirus and providing prognosis for head and neck squamous cell carcinoma |
| EP2780714A1 (en) | 2011-11-15 | 2014-09-24 | University Of Miami | Methods for detecting human papillomavirus and providing prognosis for head and neck squamous cell carcinoma |
| US9809857B2 (en) | 2013-02-28 | 2017-11-07 | Washington University | Methods and signatures for oropharyngeal cancer prognosis |
| WO2016015059A1 (en) * | 2014-07-25 | 2016-01-28 | OncoGenesis Inc. | Systems and methods for early detection of cervical cancer by multiplex protein biomarkers |
| WO2016094330A2 (en) | 2014-12-08 | 2016-06-16 | 20/20 Genesystems, Inc | Methods and machine learning systems for predicting the liklihood or risk of having cancer |
Also Published As
| Publication number | Publication date |
|---|---|
| GB201808839D0 (en) | 2018-07-11 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19730446; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 19730446; Country of ref document: EP; Kind code of ref document: A1 |