HK1179660A - Blood transcriptional signature of active versus latent Mycobacterium tuberculosis infection
- Publication number
- HK1179660A, HK13106715.6A
- Authority
- HK
- Hong Kong
- Prior art keywords
- patient
- gene
- gene expression
- mycobacterium tuberculosis
- module
Description
Field of the invention
The present invention relates generally to the field of Mycobacterium tuberculosis infection, and more particularly to methods, kits and systems for diagnosing, prognosing and monitoring active Mycobacterium tuberculosis infection that presents as latent or asymptomatic, and disease progression, before, during and after treatment.
Background
Without limiting the scope of the invention, its background is described in connection with the identification and treatment of Mycobacterium tuberculosis infection.
Pulmonary tuberculosis (PTB) is a leading and growing cause of worldwide morbidity and mortality caused by Mycobacterium tuberculosis (M. tuberculosis). However, most individuals infected with Mycobacterium tuberculosis remain asymptomatic and hold the infection in a latent form, and this latent state is thought to be maintained by an active immune response (WHO; Kaufmann, SH & McMichael, AJ., Nat Med, 2005). This is supported by reports showing that treatment of patients with Crohn's disease or rheumatoid arthritis with anti-TNF antibodies improves autoimmune symptoms but, on the other hand, leads to reactivation of TB in patients previously exposed to Mycobacterium tuberculosis (Keane). The immune response to Mycobacterium tuberculosis is multifactorial and includes genetically determined host factors such as TNF, and IFN-γ and IL-12 of the Th1 axis (reviewed in Casanova, Ann Rev; Newport). However, immune cells from adult human pulmonary TB patients can produce IFN-γ, IL-12 and TNF, and IFN-γ therapy does not help to ameliorate the disease (reviewed in Reljic, 2007, J Interferon Cytokine Res, 27, 353-63), suggesting that a broader range of host immune factors is involved in protection against Mycobacterium tuberculosis and in the maintenance of latency. Thus, knowledge of the host factors induced in latent versus active TB can provide information about the immune response that can control infection by Mycobacterium tuberculosis.
Diagnosis of PTB can be difficult and problematic for a variety of reasons. First, demonstrating the presence of typical Mycobacterium tuberculosis bacilli in sputum by microscopic examination (smear positive) has a sensitivity of only 50-70%, and a positive diagnosis requires isolation of Mycobacterium tuberculosis by culture, which can take up to 8 weeks. In addition, some patients are smear negative or are unable to produce sputum, requiring additional sampling by bronchoscopy, an invasive procedure. Because of these limitations in the diagnosis of PTB, smear-negative patients are sometimes tested for tuberculin (PPD) skin reactivity (Mantoux test). However, tuberculin (PPD) skin reactivity does not distinguish between BCG vaccination, latent TB and active TB. In response to this problem, assays have been developed that demonstrate immunoreactivity to specific Mycobacterium tuberculosis antigens that are not present in BCG. However, reactivity to these Mycobacterium tuberculosis antigens, measured by the production of IFN-γ by blood cells in the interferon-gamma release assay (IGRA), does not distinguish latent disease from active disease.
Clinically, latent TB is defined by a delayed-type hypersensitivity response when the patient is challenged intradermally with PPD, together with a positive IGRA result, in the absence of clinical symptoms or signs of active disease or suggestive radiologic findings. Reactivation of latent/dormant tuberculosis (TB) presents a major health hazard with the risk of transmission to other individuals, so biomarkers that differentiate latent from active TB patients would be useful in disease control, especially since anti-mycobacterial drug treatment is difficult and can lead to serious side effects.
Most individuals infected with Mycobacterium tuberculosis remain asymptomatic, with an estimated one-third of the world's population being latently infected with the bacterium, which provides a tremendous reservoir for the spread of the disease. Of the people described as latently infected, 5-15% will develop active TB disease during their lifetime [7, 8]. Thus, latent TB patients represent a clinically heterogeneous classification, ranging from patients who will remain asymptomatic for the majority of their lives to those who will progress to disease reactivation [9]. The diagnosis of latent TB is based solely on signs of immunosensitization, generally on skin reactivity to Mycobacterium tuberculosis antigens, and the specificity of this test is compromised by positive reactions to non-pathogenic mycobacteria, including the BCG vaccine. More recent assays that measure IFN-γ secreted by blood cells in response to specific Mycobacterium tuberculosis antigens (IGRA) are less problematic but, like skin tests, do not distinguish latent disease from active disease, nor do they clearly identify patients who may progress to active disease [10]. Identifying those most at risk of reactivation will help in targeting prophylactic treatment, which is important since anti-mycobacterial drug treatment is lengthy and can lead to serious side effects. Therefore, new tools for diagnosis, therapy and vaccination are urgently needed, but efforts to develop them are limited by the incomplete understanding of the complex underlying pathogenesis of TB.
Disclosure of Invention
The invention includes methods and kits for identifying latent versus active tuberculosis (TB) patients relative to healthy controls. In one embodiment, distinct and reciprocal immune signatures in blood are analyzed using microarrays to determine, diagnose, track, and treat latent versus active TB patients. The present invention provides for the first time the ability to resolve the heterogeneity of TB infections, which can be used to determine which individuals with apparently latent TB should be given anti-mycobacterial chemotherapy because their infection is active rather than latent/asymptomatic.
In one embodiment, the invention includes a method of predicting an active Mycobacterium tuberculosis infection that appears latent/asymptomatic, the method comprising: obtaining a patient gene expression dataset from a patient suspected of being infected with Mycobacterium tuberculosis; dividing the patient's gene expression dataset into one or more gene modules associated with Mycobacterium tuberculosis infection; and comparing the patient's gene expression dataset for each of the one or more gene modules to a gene expression dataset from non-patients that is divided into the same gene modules; wherein an overall increase or decrease in gene expression in the patient's gene expression dataset for the one or more gene modules is indicative of an active Mycobacterium tuberculosis infection rather than a latent/asymptomatic Mycobacterium tuberculosis infection. In one aspect, the method further comprises the step of using the determined comparative gene product information to formulate at least one diagnostic, prognostic, or therapeutic regimen. In another aspect, the method may further comprise the step of distinguishing patients with latent TB from patients with active TB. In one aspect, the gene expression dataset for the patient is from cells in at least one of whole blood, peripheral blood mononuclear cells, or saliva. In another aspect, the patient's gene expression dataset is compared to at least 10, 20, 40, 50, 70, 80, 90, 100, 125, 150, 200, 250, 300, 350, or 393 genes selected from the genes in Table 2. In another aspect, the patient's gene expression dataset is compared to at least 10, 20, 40, 50, 70, 80, 90, 100, 125, 150, or 200 genes from modules M1.3, M2.8, M1.5, M2.6, M2.2, and M3.1. In another aspect, the gene module associated with Mycobacterium tuberculosis infection is selected from the group consisting of: module M1.3, module M2.8, module M1.5, module M2.6, module M2.2 and module M3.1.
In another aspect, the gene modules associated with Mycobacterium tuberculosis infection are selected according to the following changes: a decrease in B-cell associated genes, a decrease in T-cell associated genes, an increase in myeloid-associated genes, and an increase in neutrophil-associated transcripts and interferon-inducible genes (IFNs). In another aspect, the disease state of the patient is further determined by radiologic analysis of the patient's lungs. In another aspect, the method further comprises the steps of determining the gene expression dataset of the patient after the patient has been treated and determining whether that dataset has returned to a normal gene expression dataset, thereby determining whether the patient has been successfully treated.
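The module-level comparison described in this embodiment can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis used in the invention: the function name `module_changes`, the fold-change cut-off, the use of the control median as the baseline, the assignment of genes to modules, and all expression values are hypothetical.

```python
from statistics import median

def module_changes(patient, controls, modules, fold=2.0):
    """Return {module: (fraction_up, fraction_down)} for one patient.

    patient:  {gene: expression value}
    controls: list of {gene: expression value} dicts from non-patients
    modules:  {module name: list of genes in that module}
    fold:     illustrative fold-change cut-off (an assumption, not from the source)
    """
    result = {}
    for name, genes in modules.items():
        up = down = 0
        for gene in genes:
            # Baseline for this gene is the median across the control group.
            baseline = median(c[gene] for c in controls)
            if patient[gene] >= baseline * fold:
                up += 1
            elif patient[gene] <= baseline / fold:
                down += 1
        result[name] = (up / len(genes), down / len(genes))
    return result

# Toy data: gene names, module membership and values are invented.
modules = {"M3.1": ["g1", "g2"], "M2.8": ["g3", "g4"]}
controls = [
    {"g1": 10.0, "g2": 20.0, "g3": 40.0, "g4": 8.0},
    {"g1": 12.0, "g2": 18.0, "g3": 44.0, "g4": 8.0},
    {"g1": 11.0, "g2": 22.0, "g3": 36.0, "g4": 8.0},
]
patient = {"g1": 30.0, "g2": 50.0, "g3": 15.0, "g4": 9.0}
changes = module_changes(patient, controls, modules)
```

In this toy example every gene in the first module exceeds the cut-off and half of the genes in the second module fall below it, which is the kind of overall per-module increase or decrease the method looks for.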
In another embodiment, the invention is a method for distinguishing between active and latent Mycobacterium tuberculosis infection in a patient suspected of being infected with Mycobacterium tuberculosis, the method comprising: obtaining a first gene expression dataset from a first clinical group of patients with active Mycobacterium tuberculosis infection, a second gene expression dataset from a second clinical group of patients with latent Mycobacterium tuberculosis infection, and a third gene expression dataset from a clinical group of uninfected individuals; generating a gene cluster dataset comprising the differential expression of genes between any two of the first, second and third datasets; and determining a unique pattern of expression/representation that is indicative of latent infection, active infection, or health, wherein the patient gene expression dataset comprises at least 6, 10, 20, 40, 50, 70, 80, 90, 100, 125, 150, or 200 genes obtained from at least one of modules M1.3, M2.8, M1.5, M2.6, M2.2, and M3.1.
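The step of collecting genes that are differentially expressed between any two of the three clinical groups can be illustrated with a short sketch. It assumes log2-scale expression values and uses a simple difference of group means with a fixed cut-off as a stand-in for a proper statistical test; the function name `differential_genes`, the cut-off and the data are hypothetical.

```python
from itertools import combinations
from statistics import mean

def differential_genes(groups, min_log2_diff=1.0):
    """Return genes whose mean log2 expression differs between any two groups.

    groups:        {group name: list of {gene: log2 expression} sample dicts}
    min_log2_diff: illustrative cut-off on the difference of group means
                   (an assumption; the source does not specify a criterion here)
    """
    selected = set()
    # Compare every pair of groups: active/latent, active/healthy, latent/healthy.
    for (_, a), (_, b) in combinations(groups.items(), 2):
        for gene in a[0]:
            diff = mean(s[gene] for s in a) - mean(s[gene] for s in b)
            if abs(diff) >= min_log2_diff:
                selected.add(gene)
    return sorted(selected)

# Toy log2 expression values; gene names and numbers are invented.
groups = {
    "active":  [{"g1": 8.0, "g2": 5.0, "g3": 6.0}, {"g1": 9.0, "g2": 5.2, "g3": 6.1}],
    "latent":  [{"g1": 6.0, "g2": 5.1, "g3": 6.0}, {"g1": 6.2, "g2": 5.0, "g3": 6.2}],
    "healthy": [{"g1": 6.1, "g2": 5.0, "g3": 4.5}, {"g1": 6.0, "g2": 5.1, "g3": 4.7}],
}
cluster_genes = differential_genes(groups)
```

Here one gene separates the active group from the other two and another separates the healthy group from both infected groups, so both enter the gene cluster dataset, while the gene that is flat across all groups is left out.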
In yet another embodiment, the invention is a kit for diagnosing infection in a patient suspected of being infected with Mycobacterium tuberculosis, the kit comprising: a gene expression detector for obtaining a patient gene expression dataset from a patient, wherein the expressed genes are obtained from the patient's whole blood; and a processor capable of comparing the gene expression dataset to a pre-defined gene module dataset associated with Mycobacterium tuberculosis infection and distinguishing between infected and non-infected patients, wherein the whole blood demonstrates an overall change in the levels of polynucleotides in one or more transcriptional gene expression modules as compared to a matched non-infected patient, thereby distinguishing between active and latent Mycobacterium tuberculosis infection. In one aspect, the gene expression dataset for the patient is obtained from peripheral blood mononuclear cells. In another aspect, the patient's gene expression dataset is compared to at least 10, 20, 40, 50, 70, 80, 90, 100, 125, 150, 200, 250, 300, 350, or 393 genes selected from the genes in Table 2. In another aspect, the patient gene expression dataset is compared to at least 10, 20, 40, 50, 70, 80, 90, 100, 125, 150, or 200 genes from modules M1.3, M2.8, M1.5, M2.6, M2.2 and M3.1. In another aspect, the gene module associated with Mycobacterium tuberculosis infection is selected from the group consisting of: module M1.3, module M2.8, module M1.5, module M2.6, module M2.2 and module M3.1. In another aspect, the gene module associated with Mycobacterium tuberculosis infection is selected according to the following changes: a decrease in B-cell associated genes, a decrease in T-cell associated genes, an increase in myeloid-associated genes, and an increase in neutrophil-associated transcripts and interferon-inducible genes (IFNs). In another aspect, the gene is selected from PDL-1, CASP5, CR1, CASP5, TLR5, MAPK14, STX11, BCL6, and C5.
Another embodiment of the invention is a system for diagnosing a patient with active or latent Mycobacterium tuberculosis infection, the system comprising: a gene expression detector for obtaining a patient gene expression dataset from a patient, wherein the expressed genes are obtained from the patient's whole blood; and a processor capable of comparing the gene expression dataset to a pre-defined gene module dataset associated with Mycobacterium tuberculosis infection and of distinguishing between infected and non-infected patients, wherein the whole blood demonstrates an overall change in the levels of polynucleotides in one or more transcriptional gene expression modules as compared to a matched non-infected patient, thereby distinguishing between active and latent Mycobacterium tuberculosis infection, and wherein the gene module dataset comprises at least one of modules M1.3, M2.8, M1.5, M2.6, M2.2, and M3.1. In one aspect, the patient's gene expression dataset is compared to at least 10, 20, 40, 50, 70, 80, 90, 100, 125, 150, 200, 250, 300, 350, or 393 genes selected from the genes in Table 2. In another aspect, the patient gene expression dataset is compared to at least 10, 20, 40, 50, 70, 80, 90, 100, 125, 150, or 200 genes from modules M1.3, M2.8, M1.5, M2.6, M2.2 and M3.1. In another aspect, the gene module associated with Mycobacterium tuberculosis infection is selected from the group consisting of: module M1.3, module M2.8, module M1.5, module M2.6, module M2.2 and module M3.1. In another aspect, the gene modules associated with Mycobacterium tuberculosis infection are selected according to the following changes: a decrease in B-cell associated genes, a decrease in T-cell associated genes, an increase in myeloid-associated genes, and an increase in neutrophil-associated transcripts and interferon-inducible genes (IFNs). In another aspect, the gene is selected from PDL-1, CASP5, CR1, CASP5, TLR5, MAPK14, STX11, BCL6, and C5.
Drawings
For a more complete understanding of the features and advantages of the present invention, reference is now made to the detailed description of the invention along with the accompanying figures, and in which:
Fig. 1a to 1c. A unique whole blood transcriptional signature of active TB. Each row of the heat map represents an individual gene, and each column represents an individual participant. The relative abundance of transcripts is indicated by the color scale at the bottom of the figure (red, high; yellow, medium; blue, low). (1a) The 393 most significantly differentially expressed genes in the training set were organized by hierarchical clustering. (1b) The same list of 393 transcripts, arranged in the same gene tree, was used to analyze data from the independent test set, with hierarchical clustering by Spearman correlation with average linkage to generate the condition tree (along the upper horizontal edge of the heat map) and the study grouping (i.e. clinical phenotype) shown as color blocks at the bottom of each profile. (1c) The independent validation set recruited in South Africa was analyzed as described above.
Fig. 2a to 2c. The transcriptional signature of active TB correlates with the radiographic extent of the disease. Three independent clinicians evaluated the chest radiographs of each patient in the training set and the independent test set without knowledge of other data (Fig. 9a). (2a) The profile of the 393 transcripts is shown for each patient with active TB in the independent test set. Examples of advanced, moderate, mild and no disease are shown. (2b, 2c) The profiles were grouped according to the radiographic extent of disease, and the mean "molecular distance to health" of each group was compared using Kruskal-Wallis ANOVA, with Dunn's multiple comparison post hoc test for comparisons between groups (p < 0.0001).
Fig. 3a to 3d. The transcriptional signature of active TB diminishes with successful treatment. (3a) Seven patients with active TB (active) were resampled at 2 and 12 months after initiation of anti-mycobacterial treatment and compared to healthy controls from the independent test set (control, n = 12). (3b) Chest radiographs at 2 and 12 months after initiation of anti-mycobacterial treatment are shown for 2 of the 7 patients (labeled "4" and "7"). The profiles of these individuals are shown in the preceding panel, labeled with the same numbers. (3c) The "molecular distance to health" of each patient at each time point was calculated and compared with time after initiation of treatment using Spearman correlation. (3d) The average "molecular distance to health" at each time point was compared using the Friedman test, with Dunn's multiple comparison post hoc test for comparisons between time points. The horizontal lines indicate the median and the 5th and 95th percentiles.
Fig. 4a to 4e. The whole blood transcriptional signature of active TB reflects both significant changes in cellular composition and changes in the absolute levels of gene expression. (4a) Gene expression in active TB compared to healthy controls, mapped within a pre-defined modular framework. The density of dots represents the proportion of transcripts that were significantly differentially expressed for each module (red = increased, blue = decreased transcript abundance). Functional interpretations previously determined by unbiased literature analysis are indicated by the color-coded grid below. (4b) Whole blood from test set healthy controls (control) and active TB patients (active) was analyzed by flow cytometry for CD3+CD4+ and CD3+CD8+ T cells and CD19+CD20+ B cells. Bar = median. (4c) Whole blood from test set controls (control) and active TB patients (active) was analyzed by flow cytometry for CD14+ monocytes, CD14+CD16+ inflammatory monocytes and CD16+ neutrophils. Bar = median. (4d) The Ingenuity Pathways Analysis canonical pathway for interferon signaling is shown, with the individual gene products identified by symbols corresponding to their function (key on the right side), and transcripts over-represented in the active TB patients of the training set shaded red. (4e) Serum levels of CXCL10 in healthy controls (control) and patients with active pulmonary TB (active). Statistical comparisons were performed using the two-sided Mann-Whitney test. The horizontal line indicates the mean of each group, and the side lines indicate the 95% confidence interval.
Fig. 4f and 4g. The unique whole blood 86-gene transcriptional signature of active TB is distinct from other diseases. (4f) Comparison of the 86-gene signature in patients with TB and with other diseases, each normalized against its own controls; the patients are: TB (training, n = 13; control, n = 12), TB (SA, n = 20; control = 12), group A streptococcus (Strep; n = 23; control = 12), staphylococcus (Staph; n = 40; control = 12), Still's disease (Still's; n = 31; control = 22), adult SLE (SLE; n = 29; control = 16) and pediatric SLE (pSLE; n = 49; control = 11). (4g) Expression levels of the 86-gene signature in TB patients after 2 and 12 months of treatment.
Fig. 4 h. Gene expression for TB (test set) and different diseases (disease versus healthy control), which were mapped in a pre-defined modular framework. Dot density (red, increased; blue, decreased) indicates transcript abundance.
Fig. 5a and 5b. Interferon-inducible gene expression in active TB. Interferon-inducible gene (5a) transcript abundance in whole blood samples from active TB patients; and (5b) expression in isolated blood leukocyte populations from test set blood. Gene abundance/expression is shown relative to the median of healthy controls (labeled as in Fig. 1). The numbers shown for the test set and for the separated cell populations correspond to individual patients.
Fig. 6a to 6d. PDL1 (CD274) is over-abundant in whole blood of active TB patients, mainly due to its overexpression by neutrophils. (6a) The abundance of PDL1 (normalized to the median of all samples) in active TB patients (active) and healthy controls (control) (or latent, South Africa). The geometric mean fluorescence intensity (MFI) of PDL1 in whole blood leukocytes from representative patients and controls is also shown. Each MFI plot is connected to the corresponding PDL1 expression profile by an arrow. The graphs show the pooled MFI data from 11 active TB patients and 11 healthy controls (error bars = mean ± 95% CI). (6b) The MFI of PDL1 in the different cell subsets (blue) was compared to that of total leukocytes (red) and the isotype control of total cells (green). Controls and patients are shown. The graphs show pooled MFI data from the same number of active TB patients and healthy controls (error bars = mean ± 95% CI). (6c) PDL1 expression in enriched cell subpopulations is shown for 4 controls and 7 active TB patients, normalized to the median of all samples. (6d) PDL1 abundance in whole blood of 7 active TB patients (active) is shown at 0, 2 and 12 months after the start of anti-mycobacterial treatment, compared to 12 healthy controls (control) of the test set.
Fig. 7a to 7c. Composition of the training, test and validation sets. Not only were the individual cohorts recruited independently, but all stages of RNA processing and microarray analysis were also performed completely independently. (7a) Recruitment of the training set cohort in London, UK; (7b) recruitment of the independent test set cohort in London, UK; (7c) recruitment of the independent validation set cohort in Cape Town, South Africa.
Fig. 8a to 8d. Hierarchical clustering of patient profiles. (8a) The 1836-transcript expression profiles of the training set were subjected to unsupervised hierarchical clustering by Spearman correlation with average linkage, which produced a condition tree (along the upper edge of the heat map). These patient clusters may be compared to the clinical and demographic parameters displayed in the blocks below each profile along the lower edge of the heat map. A key is provided at the bottom of the figure. Clusters were divided uniformly according to distance. (8b) The 393-transcript expression profiles of the test set were clustered by Pearson correlation with average linkage. (8c) The 393-transcript expression profiles of the validation set were clustered by Pearson correlation with average linkage. (8d and 8e) The 393-transcript expression profiles of validation set patients aged 22 to 34 years only.
Fig. 9a to 9c. Comparison of the transcriptional signature of active TB and the radiographic extent of disease. (9a) The classification scheme used to grade chest radiographs according to the extent of disease. (9b) The 393-transcript expression profiles of all 13 active TB patients in the training set, together with the corresponding chest radiographs taken at diagnosis, grouped by X-ray grade according to the classification scheme. For a given patient, the expression profile and the radiograph are given the same numerical designation. (9c) The 393-transcript expression profiles and chest radiographs of all 21 active TB patients, pooled.
Fig. 10a to 10d. The whole blood transcriptional signature of active TB reflects unique changes in cellular composition as well as changes in absolute levels of gene expression. Gene expression in active TB was compared to healthy controls and mapped within a pre-defined modular framework. The density of dots represents the proportion of transcripts that were significantly differentially expressed for each module (red = increased, blue = decreased transcript abundance). Functional interpretations previously determined by unbiased literature analysis are indicated by the color-coded grid in Fig. 4. The percentage of genes in each module that is increased (red) or decreased (blue) is shown for (10a) the training set; (10b) the test set; (10c) the validation set (SA). (10d) The weighted molecular distance to health was calculated for each patient at pretreatment baseline (0 months) and at 2 and 12 months after the start of anti-mycobacterial treatment. The numbers of the individual patients correspond to those shown in Fig. 3a to 3d.
Fig. 11a to 11c. Analysis of lymphocytes in blood of active TB patients and controls. (11a) The flow cytometry gating strategy used to analyze T cells and B cells from whole blood of test set healthy controls and active TB patients is shown. The top row of panels shows the back-gating strategy used to determine the lymphocyte FSC/SSC gate applied in subsequent gating. A large FSC/SSC gate (left panel) was initially set and CD45 vs CD3 was then analyzed. CD45+CD3+ cells were gated (middle panel) and their FSC/SSC profile was determined (right panel). This profile was then used to set the appropriate lymphocyte FSC/SSC gate (see second row, left panel). The same procedure was also carried out for CD45+CD19+ (B cell) gates to ensure that these cells were included in the lymphocyte gate (not shown). The second row of panels shows the gating strategy used to identify T cell populations. The lymphocyte FSC/SSC gate was set and these cells were evaluated for CD45 vs CD3 (second panel from the left). CD45+ cells were then gated, and CD3 vs CD8 was evaluated. CD3+ T cells were gated, and the expression of CD4 and CD8 was assessed. The CD4+ and CD8+ subsets were then gated. Rows 3-6 show the gating strategy used to define T cell memory subsets. The CD4 and CD8 T cells gated in row 2 were evaluated for expression of CD45RA vs CCR7, and quadrants were set based on isotype controls (rows 5 and 6) to define naive (CD45RA+CCR7+), central memory (CD45RA-CCR7+), effector memory (CD45RA-CCR7-) and, in the case of CD8+ T cells, terminally differentiated effector (CD45RA+CCR7-) T cells. The expression of CD62L was also evaluated for these subsets. The bottom row of panels shows the gating strategy for B cells. The lymphocyte FSC/SSC gate was set and cells were evaluated for CD45 vs CD19. CD45+ cells were gated, and CD19 and CD20 were evaluated. B cells are defined as CD19+CD20+. (11b) Whole blood from 11 test set healthy controls (control) and 9 test set active TB patients (active) was analyzed for T cell memory populations by multiparameter flow cytometry. The full flow cytometry gating strategy is shown in Fig. 11a. The graphs show the percentage of naive, central memory (TCM), effector memory (TEM) and terminally differentiated effector (TD, CD8+ T cells only) cell subsets for all individuals (top row, for each group), and the number of cells in each cell subset (x 10^6/ml) (bottom row, for each group). Each symbol represents an individual patient. The horizontal line represents the median. (11c) T cell gene (i) transcript abundance in whole blood samples from active TB patients (training, test and validation sets); (ii) expression in isolated blood lymphocyte populations from test set blood. Gene abundance/expression is shown relative to the median of healthy controls (labeled as in Fig. 1). The numbers shown for the test set and for the separated cell populations correspond to individual patients.
Fig. 12a to 12c. Analysis of myeloid cells in blood of active TB patients and controls. (12a) The flow cytometry gating strategy used to analyze monocytes and neutrophils from whole blood of test set healthy controls and active TB patients is shown. A large FSC/SSC gate was set (top row, left panel), followed by analysis of CD45 vs CD14. CD45+ cells were gated (middle panel) and CD14 vs CD16 was evaluated. Monocytes are defined as CD14+, inflammatory monocytes as CD14+CD16+, and neutrophils as CD16+. Also shown is the gating strategy used to assess possible overlap between CD16+ neutrophils and NK cells expressing CD16. A large FSC/SSC gate was set to include both neutrophils and NK cells. (12b) CD45+ cells were then evaluated for CD16 vs CD56 (an NK cell marker). CD16+ neutrophils expressed high levels of CD16 but not CD56 (as shown by the isotype control plot, bottom panel). CD56+ NK cells expressed intermediate levels of CD16 and did not overlap with CD16hi cells. CD56+CD16int cells and CD16hi cells have different FSC/SSC characteristics. (12c) Myeloid gene (i) transcript abundance in whole blood samples from active TB patients (training, test and validation sets); and (ii) expression in isolated blood leukocyte populations from test set blood. Gene abundance/expression is shown relative to the median of healthy controls (labeled as in Fig. 1). The numbers shown for the test set and for the separated cell populations correspond to individual patients.
Fig. 13a and 13b. Ingenuity Pathways Analysis of the 393-transcript signature. (13a) The probability of significant over-representation of each canonical biological pathway (expressed as the logarithm of the p-value calculated by Fisher's exact test, with Benjamini-Hochberg multiple testing correction) is indicated by the orange squares. The solid colored bars represent the percentage of the total number of genes comprising the pathway (given in bold at the right edge of each bar) that is present in the gene list analyzed. The color of the bars indicates the abundance of those transcripts in whole blood of active TB patients in the training set compared with healthy controls. (13b) Serum levels of interferon alpha-2a (IFN-α2a) and interferon gamma (IFN-γ) are shown for the 12 healthy controls and 13 patients with active TB who were used for the training set microarray analysis. No significant difference between groups was observed for either cytokine using the two-sided Mann-Whitney test. Horizontal lines indicate the mean of each group and side lines indicate the 95% confidence interval.
Fig. 14a and 14b. Expression of PDL1 (CD274) in whole blood and cell subsets from individual healthy controls and patients with active TB. (14a) Expression of PDL1 was analyzed by flow cytometry in whole blood from 11 test set healthy controls (control) and 11 test set active TB patients (active). A large FSC/SSC gate was set to include all leukocytes, and the geometric mean fluorescence intensity (MFI) of PDL1 (red) was compared to that of the isotype control (green). Each active TB patient was analyzed on a different day; healthy controls were analyzed in groups (starting from the left, samples 1 and 2, 3 and 4, 6-8 and 9-11 were run together, and 5 was run alone), and one isotype control was shared by the samples in each group. (14b) As in (14a), cell subsets from blood of the same 11 test set healthy controls (control) and 11 test set active TB patients (active) were also analyzed by flow cytometry for expression of PDL1. The cell subpopulations were as defined in Fig. 6b, and the MFI of PDL1 (red) was compared to that of the matched isotype control (green).
Fig. 15a to 15f. The 393-transcript profiles of the training set, ordered by study group, are shown enlarged, with the gene symbols listed on the right of each figure. Key transcripts are highlighted in larger type. The left part of each figure shows the entire gene tree and heat map, with the enlarged area marked by a black rectangle. The relative abundance of transcripts is indicated by the color scale at the bottom of the figure (as in Fig. 1).
FIGS. 16a to 16 are heat maps comparing various genes across the control, latent and active groups, with the genes listed on the right-hand side of the heat maps.
Fig. 17a to 17c are tables of demographic data for the training, test and validation sets listed in the tables, i.e., gender, country of origin and ethnicity, under several breakdowns.
FIGS. 18a to 18c are tables of results for the training, test and validation sets listed in the tables, i.e., TST results, BCG vaccination status and smear status.
Fig. 19 is a table summarizing the specificity and sensitivity results for the training set, test set, and validation set across multiple sources of samples.
Description of the invention
While the making and using of various embodiments of the present invention are discussed in detail below, it should be appreciated that the present invention provides many applicable inventive concepts which can be embodied in a wide variety of specific contexts. The specific embodiments discussed herein are merely illustrative of specific ways to make and use the invention, and do not delimit the scope of the present invention.
To facilitate an understanding of the present invention, a number of terms are defined below. The terms defined herein have meanings as commonly understood by one of ordinary skill in the art to which the invention pertains. Terms such as "a", "an" and "the" are not intended to refer to only a singular entity, but include the general class of which a specific example may be used for illustration. The terminology herein is used to describe specific embodiments of the invention, but its usage does not limit the invention, except as outlined in the claims. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The following references provide the skilled person with a general definition of many of the terms used in the present invention: Singleton et al., Dictionary of Microbiology and Molecular Biology (2d ed. 1994); The Cambridge Dictionary of Science and Technology (Walker ed., 1988); The Glossary of Genetics, 5th Ed., R. Rieger et al. (eds.), Springer Verlag (1991); and Hale & Marham, The Harper Collins Dictionary of Biology (1991).
Various biochemical and molecular biology methods are well known in the art. For example, methods for isolating and purifying nucleic acids are described in detail in WO 97/10365; WO 97/27317; Chapter 3 of Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization With Nucleic Acid Probes, Part I. Theory and Nucleic Acid Preparation, (P. Tijssen, ed.) Elsevier, N.Y. (1993); Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, N.Y. (1989); and Current Protocols in Molecular Biology, (Ausubel, F. M. et al., eds.) John Wiley & Sons, Inc., New York (1987-1999), including supplements.
Bioinformatics definition
As used herein, an "object" refers to any item or information of interest (typically textual, including nouns, verbs, adjectives, adverbs, phrases, sentences, symbols, numeric characters, and the like). Thus, an object is anything that can form a relationship, and anything that can be obtained, identified, and/or searched from a source. "Objects" include, but are not limited to, entities of interest such as genes, proteins, diseases, phenotypes, mechanisms, drugs, and the like. In some aspects, the object may be data, as described further below.
As used herein, a "relationship" refers to the co-occurrence of objects in the same unit (e.g., a phrase, a sentence, two or more lines of text, a paragraph, a portion of a web page, a magazine, a paper, a book, etc.). A relationship may be text, symbols, numbers, and combinations thereof.
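By way of illustration only, the following Python sketch treats a sentence as the unit and counts how often pairs of objects co-occur within it. The object list, the example text and the function name `cooccurrences` are hypothetical and are chosen only for this sketch; the sentence splitting and substring matching are deliberately naive.

```python
import itertools
import re
from collections import Counter


def cooccurrences(text, objects):
    """Count pairs of objects that appear together in the same sentence."""
    counts = Counter()
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        lowered = sentence.lower()
        # Objects found in this unit, sorted so each pair has a stable order.
        present = sorted(o for o in objects if o.lower() in lowered)
        for pair in itertools.combinations(present, 2):
            counts[pair] += 1
    return counts


# Hypothetical objects and text, not taken from any real data source.
objects = ["TNF", "IFN-gamma", "tuberculosis"]
text = (
    "TNF is induced during tuberculosis. "
    "IFN-gamma and TNF act together. "
    "Latency is poorly understood."
)
pairs = cooccurrences(text, objects)
```

A larger unit (a paragraph or a whole paper) could be used by changing only the splitting step; the pair counting is unchanged.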
As used herein, "metadata content" refers to information about the organization of text in a data source. The metadata may include standard metadata (e.g., Dublin Core metadata) or may be collection-specific. Examples of metadata formats include, but are not limited to, Machine Readable Catalog (MARC) records used for library catalogs, Resource Description Framework (RDF), and Extensible Markup Language (XML). Meta-objects may be generated manually or by automated information extraction algorithms.
As used herein, an "engine" refers to a program that performs a core or essential function for other programs. For example, an engine may be a central program in an operating system or application program that coordinates the overall operation of other programs. The term "engine" may also refer to a program containing an algorithm that can be changed. For example, a knowledge discovery engine may be designed so that its approach to identifying relationships can be changed to reflect new rules for identifying and ranking relationships.
As used herein, "semantic analysis" refers to the identification of relationships between words that represent similar concepts, e.g., by suffix removal or stemming or by employing a thesaurus. "Statistical analysis" refers to a technique based on counting the number of occurrences of each term (word, word root, word stem, n-gram, phrase, etc.). In a collection that is not restricted as to subject, the same phrase used in different contexts may represent different concepts. Statistical analysis of phrase co-occurrence can help to resolve word sense ambiguity. "Syntactic analysis" can be used to further decrease ambiguity by part-of-speech analysis. As used herein, one or more of these analyses are referred to more generally as "lexical analysis". "Artificial intelligence (AI)" refers to methods by which a non-human device, such as a computer, performs tasks that humans would deem noteworthy or "intelligent". Examples include identifying pictures, understanding spoken words or written text, and solving problems.
Terms such as "data", "dataset" and "information" are often used interchangeably, as are "information" and "knowledge". As used herein, "data" is the most fundamental unit, which is a measurement or set of measurements from an experiment. Data is compiled to contribute to information, but it is fundamentally independent of it and can be combined into datasets, i.e., collections of data. In contrast, information is derived from interests; for example, data (the unit) may be collected on race, sex, height, weight and diet for the purpose of finding variables correlated with cardiovascular disease risk. However, the same data may be used to develop recipes or to generate "information" about dietary preferences, i.e., the likelihood that a particular product in a supermarket will sell.
As used herein, the term "database" refers to a repository of raw or compiled data, even if various facets of information can be found in the data fields. A database may include one or more datasets. Databases are generally organized so that their contents can be accessed, managed, and updated (e.g., the database is dynamic). The terms "database" and "source" are also used interchangeably in the present invention, since the primary sources of data and information are databases. However, a "source database" or "source data" generally refers to data, such as unstructured text and/or structured data, that is input into the system for identifying objects and determining relationships. The source database may or may not be a relational database. However, the system database typically comprises a relational database or some equivalent type of database that stores values relating to the relationships between objects.
As used herein, a "system database" and a "relational database" are used interchangeably and refer to one or more collections of data organized as a set of tables containing data fitted into predefined categories. For example, a database table may include one or more categories defined by columns (e.g., attributes), while the rows of the database may contain a unique object for the categories defined by the columns. Thus, an object such as the identity of a gene may have columns for its presence, absence, and/or level of expression. A row of a relational database may also be referred to as a "set" and is typically defined by the values of its columns. A "domain" in the context of a relational database is the range of valid values that a field, such as a column, may include.
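A minimal sketch of such a table, using Python's standard `sqlite3` module, is given below by way of example only. The table name `gene_expression`, the column names and the gene identifiers are hypothetical placeholders, not part of the system described herein. Each row holds one gene, the columns hold its presence and expression level, and a `CHECK` constraint restricts the presence column to its valid values.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Columns define the categories (attributes); each row is one unique gene.
conn.execute(
    "CREATE TABLE gene_expression ("
    " gene TEXT PRIMARY KEY,"
    " present INTEGER NOT NULL CHECK (present IN (0, 1)),"
    " expression_level REAL)"
)

# Hypothetical rows: (gene, present, expression level or NULL when absent).
rows = [("GENE_A", 1, 7.2), ("GENE_B", 0, None), ("GENE_C", 1, 3.4)]
conn.executemany("INSERT INTO gene_expression VALUES (?, ?, ?)", rows)

# Query the genes that are present in the sample.
expressed = [
    r[0]
    for r in conn.execute(
        "SELECT gene FROM gene_expression WHERE present = 1 ORDER BY gene"
    )
]

# A value outside the valid range of the 'present' column is rejected.
try:
    conn.execute("INSERT INTO gene_expression VALUES ('GENE_D', 2, 1.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```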
As used herein, a "domain of knowledge" refers to the field of research in which the system is operable, e.g., all biomedical data. It should be noted that it is beneficial to merge data from several domains, e.g., biomedical data and engineering data, since such diverse data can sometimes link things that would not be put together by a person who is familiar with only one area of research or study (one domain). A "distributed database" refers to a database that may be distributed or replicated among different points on a network.
As used herein, "information" refers to a set of data that may include numbers, letters, a collection of numbers, a collection of letters, or conclusions resulting or derived from a collection of data. "data" is then a measurement or statistic, and is the basic unit of information. "information" may also include other types of data, such as words, symbols, text, such as unstructured free text, code, and so forth. "knowledge" is loosely defined as a collection of information that provides a sufficient understanding of the system to model causes and effects. To extend the foregoing example, information on demographics, gender, and previous purchases may be used to develop a regional marketing strategy for food sales, while information on nationality may be used by purchasers as a guideline for product importation. It is important to note that there is no strict boundary between data, information, and knowledge; these three terms are sometimes considered equivalent. Generally, data comes from testing, information comes from correlation, and knowledge comes from modeling.
As used herein, a "program" or "computer program" refers generally to a syntactic unit that conforms to the rules of a particular programming language and is composed of declarations and statements or instructions, separable into "code segments", that are needed to solve or perform particular functions, tasks, or problems. Programming languages are generally artificial languages used to express programs.
As used herein, a "system" or "computer system" refers generally to one or more computers, peripherals, and software that perform data processing. A "user" or "system operator" generally includes a person using a computer network for data processing and information exchange via a "user device" (e.g., a computer, wireless device, etc.). A "computer" is generally a functional unit capable of performing substantial computations, including numerous arithmetic and logical operations, without human intervention.
As used herein, "application software" or "application program" generally refers to software or a program specific to the resolution of an application problem. The "application problem" is typically submitted by the end user, whose resolution requires information processing.
As used herein, "natural language" refers to a language whose rules are based on current usage rather than being specifically prescribed, such as English, Spanish, or Chinese. As used herein, "artificial language" refers to a language whose rules are explicitly established prior to its use, such as a computer programming language, e.g., C, C++, Java, BASIC, FORTRAN, or COBOL.
As used herein, "statistical relevance" refers to the use of one or more ranking schemes (O/E ratio, strength, etc.) in which relationships are determined to be statistically relevant if they occur significantly more frequently than would be expected by random chance.
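One possible ranking scheme can be sketched as follows, by way of example only. The sketch assumes that "O/E ratio" denotes the observed number of co-occurrences divided by the number expected if the two objects occurred independently, and it uses a one-sided hypergeometric tail probability as the test of significance. Both choices, the function names and the counts are assumptions made for this illustration and are not specified herein.

```python
from math import comb


def oe_ratio(n_both, n_a, n_b, n_units):
    """Observed co-occurrences divided by the count expected under independence."""
    expected = n_a * n_b / n_units
    return n_both / expected


def hypergeom_tail(n_both, n_a, n_b, n_units):
    """P(X >= n_both) if the n_a units mentioning A were drawn at random
    from n_units units, of which n_b mention B."""
    upper = min(n_a, n_b)
    total = comb(n_units, n_a)
    favorable = sum(
        comb(n_b, k) * comb(n_units - n_b, n_a - k)
        for k in range(n_both, upper + 1)
    )
    return favorable / total


# Hypothetical counts: 1000 abstracts, 50 mention A, 40 mention B, 10 mention both.
ratio = oe_ratio(10, 50, 40, 1000)           # expected count is 2, so the ratio is 5
p_value = hypergeom_tail(10, 50, 40, 1000)   # small when co-occurrence is unlikely by chance
```

A relationship would then be retained if the ratio exceeds 1 and the tail probability falls below a chosen threshold; the threshold itself is a design choice left to the user.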
As used herein, the terms "co-regulated genes" and "transcription modules" are used interchangeably to refer to grouped gene expression profiles (e.g., signal values associated with a specific gene sequence) of specific genes. Each transcriptional module links two key pieces of data, namely a literature search portion and actual empirical gene expression values obtained from a gene microarray. The set of genes selected for a transcription module is based on an analysis of gene expression data (module selection algorithm described above). Further steps are taught by Chaussabel, D. & Sher, A. Mining microarray expression data by literature profiling. Genome Biol 3, RESEARCH0055 (2002), (http://genomebiology.com/2002/3/10/RESEARCH/0055), relevant portions incorporated herein by reference, together with expression data obtained from a disease or condition of interest (e.g., systemic lupus erythematosus, arthritis, lymphoma, cancer, melanoma, acute infection, autoimmune disease, autoinflammatory disease, etc.).
Examples of keywords used to develop the literature search portion of, or contribution to, the transcription modules are listed in the following table. The skilled person will be aware that other terms, such as specific cancers, specific infectious diseases, transplantation, etc., can readily be selected for other conditions. For example, the genes and signals of those genes associated with T cell activation are described below under module ID "M2.8", where specific keywords (e.g., lymphoma, T-cell, CD4, CD8, TCR, thymus, lymph, IL2) are used to identify key T-cell associated genes, such as T-cell surface markers (CD5, CD6, CD7, CD26, CD28, CD96); molecules expressed by cells of the lymphoid lineage (lymphotoxin beta, IL2-inducible T cell kinase, TCF7); and T cell differentiation proteins (mal, GATA3, STAT5B). The complete module is then developed by correlating the data for these genes from a patient population (regardless of platform, presence/absence, and/or up- or down-regulation) to generate the transcriptional module. In some cases, the gene expression profile does not (at this time) match any specific gene cluster or data for these disease conditions; however, specific physiological pathways (e.g., cAMP signaling, zinc finger proteins, cell surface markers, etc.) are found in the "undefined" modules. In practice, gene expression datasets may be used to extract genes with coordinated expression prior to matching with keyword searches, i.e., either dataset may be correlated prior to cross-referencing with the second dataset.
TABLE 1 transcription Module
Biological definition
As used herein, the term "array" refers to a solid support or matrix having one or more peptide or nucleic acid probes attached to the support. Arrays typically have one or more different nucleic acid or peptide probes coupled to the surface of the substrate at different known locations. These arrays, also known as "microarrays" or "gene chips", may have 10,000; 20,000; 30,000; or 40,000 different identifiable genes based on a known genome (e.g., the human genome). These arrays are used to detect the entire "transcriptome" or transcription pool of genes expressed or found in a sample, e.g., nucleic acids expressed as RNA, mRNA, etc., which may be subjected to RT and/or RT-PCR to generate a complementary set of DNA replicons. Arrays can be prepared using non-lithographic and/or lithographic methods in combination with solid phase synthesis methods, such as mechanical synthesis methods, light-directed synthesis methods, and the like.
Various techniques for synthesizing these nucleic acid arrays have been described, such as fabrication on a surface of virtually any shape or even on multiple surfaces. The array may be peptides or nucleic acids on beads, gels, polymer surfaces, fibers such as optical fibers, glass, or any other suitable matrix. The array may be packaged in such a way that diagnostics or other manipulation of an all-inclusive device may be performed, see, for example, U.S. patent No. 6,955,788, relevant portions of which are incorporated herein by reference.
As used herein, the term "disease" refers to a physiological state in which an organism has any abnormal biological state of a cell. Diseases include, but are not limited to, disruption, cessation or dysregulation of cells, tissues, bodily functions, systems or organs, which may be caused by intrinsic, genetic, infection-induced, abnormal cellular function, abnormal cell division, and the like. Diseases that result in a "disease state" are generally harmful to biological systems, i.e., the host of the disease. In connection with the present invention, any biological condition, such as infection (e.g., viral, bacterial, fungal, parasitic, etc.), inflammation, autoinflammation, autoimmunity, anaphylaxis, allergy, pre-cancerous lesions, malignancy, surgery, transplantation, physiological, etc., associated with a disease or disorder is considered a disease state. Pathological conditions are generally equivalent to disease states.
The disease state may also be classified into different levels of disease state. As used herein, the grade of a disease or disease state is a subjective measure that reflects the progression of the disease or disease state, as well as the physiological response before, during, and after treatment. Generally, a disease or disease state progresses along a grade or stage, wherein the impact of the disease becomes more and more severe. The level of the disease state may be influenced by the physiological state of the cells in the sample.
As used herein, the term "therapy" or "treatment regimen" refers to a medical step taken to alleviate or alter a disease state, such as a course of treatment using pharmacological, surgical, dietary, and/or other techniques intended to reduce or eliminate the effects or symptoms of a disease. The treatment regimen may include a prescribed dosage of one or more drugs or surgery. Therapies are beneficial in most cases and reduce the disease state, but in many instances the therapy also has undesirable effects or side effects. The effectiveness of therapy is also influenced by the physiological state of the host, such as age, sex, genetics, weight, symptoms of other diseases, and the like.
As used herein, the term "pharmacological state" or "pharmacological status" refers to those samples that will be and/or have been treated with one or more drugs, surgery, etc., that may affect the pharmacological state of one or more nucleic acids in the sample, such as new transcription, stabilization, and/or destabilization as a result of the pharmacological intervention. The pharmacological state of a sample relates to a change in the biological state before, during and/or after drug treatment and may serve a diagnostic or prognostic function as taught herein. Some changes following drug treatment or surgery may be relevant to the disease state and/or may be unrelated side effects of the therapy. Changes in pharmacological state may be the result of: duration of therapy, type and dosage of medication prescribed, compliance with a given course of treatment, and/or ingestion of non-prescribed medication.
As used herein, the term "biological state" refers to the state of the transcriptome (which is the entire collection of RNA transcripts) of a cell sample that is isolated and purified for analysis of expression status. The biological state reflects the physiological state of the cells in the sample, as determined by measuring the abundance and/or activity of cellular components, by characterization according to morphological phenotype, or by a combination of these with methods for detecting transcripts.
As used herein, the term "expression profile" refers to the relative abundance of RNA, DNA or protein, or to their activity levels. The expression profile can be, for example, a measure of the transcriptional or translational state, determined and/or analyzed by a variety of methods and using any of a variety of gene chips, gene arrays, beads, multiplex PCR, quantitative PCR, run-on assays, northern blot analysis, western blot analysis, protein expression, fluorescence activated cell sorting (FACS), enzyme-linked immunosorbent assay (ELISA), chemiluminescence studies, enzyme assays, proliferation studies, or any other method, device, and system for the determination and/or analysis of gene expression that is readily commercially available.
As used herein, the term "transcriptional state" of a sample includes the identities and relative abundance of the RNA species, particularly mRNA, present in the sample. The overall transcriptional state of the sample, i.e., the combination of identity and abundance of the RNA, is also referred to herein as the transcriptome. Generally, a substantial fraction of all the relative constituents of the entire set of RNA species in a sample is measured.
As used herein, the term "modular transcription vector" refers to transcriptional expression data that reflects the "proportion of differentially expressed genes", for example, for each module, the proportion of transcripts that are differentially expressed between at least two groups (e.g., healthy subjects versus patients). The vector is obtained from a comparison of samples from the two groups. The first analysis step is used to select a disease-specific set of transcripts within each module. Next, there is the "expression level". A group comparison for a given disease provides a list of the transcripts that are differentially expressed for each module. Different diseases were found to yield different sets of modular transcripts. With this expression level, vectors for an individual module (or multiple modules) of a single sample may then be calculated by averaging the expression values of the subset of disease-specific genes identified as being differentially expressed. This method enables the generation of modular expression vector maps for a single sample, such as those described in the module maps disclosed herein. These vector module maps represent the average expression level of each module (rather than the proportion of differentially expressed genes), and they can be obtained for each sample.
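The two quantities described above can be sketched in Python as follows, by way of example only. The module names (`M_x`, `M_y`), gene identifiers, expression values and function names are hypothetical placeholders; the set of differentially expressed transcripts is taken as given, since the group comparison that produces it is a separate step.

```python
from statistics import mean


def module_proportions(modules, differential):
    """Per module, the proportion of transcripts that are differentially expressed."""
    return {
        name: sum(1 for g in genes if g in differential) / len(genes)
        for name, genes in modules.items()
    }


def sample_module_vector(modules, differential, sample):
    """Per module, the mean expression in one sample over the
    differentially expressed (disease-specific) subset only."""
    vector = {}
    for name, genes in modules.items():
        subset = [g for g in genes if g in differential]
        if subset:  # modules without disease-specific transcripts are skipped
            vector[name] = mean(sample[g] for g in subset)
    return vector


# Hypothetical modules, differential transcript set and single-sample values.
modules = {"M_x": ["g1", "g2", "g3", "g4"], "M_y": ["g5", "g6"]}
differential = {"g1", "g2", "g5"}
sample = {"g1": 2.0, "g2": 4.0, "g3": 1.0, "g4": 1.0, "g5": 0.5, "g6": 1.0}

proportions = module_proportions(modules, differential)
vector = sample_module_vector(modules, differential, sample)
```

In this toy input both modules have the same proportion (0.5), yet their per-sample expression levels differ, which is the distinction between the two kinds of vector drawn in the paragraph above.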
Using the present invention, diseases can be identified and distinguished not only at the module level but also at the gene level; that is, two diseases may have the same vector (the same proportion of differentially expressed transcripts, the same "polarity"), while the gene composition of the vector may still be disease-specific. Gene-level expression provides the distinct advantage of greatly increasing the resolution of the analysis. Further, the present invention makes use of composite transcriptional markers. As used herein, the term "composite transcriptional marker" refers to the average of the expression values of multiple genes (a subset of a module), as compared to using a single gene as a marker (and the composition of these markers may be disease-specific). The composite transcriptional marker approach is unique in that the user is able to develop multivariate microarray scores to assess disease severity in patients with, for example, SLE, or to derive the expression vectors disclosed herein. Most importantly, it has been found that, using the composite modular transcriptional markers of the present invention, the results obtained herein are reproducible across microarray platforms, thereby providing greater reliability for regulatory approval.
Gene expression monitoring systems for use in the present invention may include customized gene arrays having a limited and/or basic number of genes that are specific and/or customized for one or more target diseases. Unlike the general, pan-genome arrays in customary use, the present invention not only provides for the use of these general arrays in retrospective gene and genome analysis (without the need to use a specific platform), but more importantly, it provides for the development of customized arrays that provide an optimal set of genes for analysis, without the need for thousands of other, irrelevant genes. One significant advantage of the optimized arrays and modules of the present invention over the prior art is the reduction of financial costs (e.g., cost per analysis, materials, equipment, time, labor, training, etc.), and more importantly, of the environmental costs of producing pan-genome arrays (where the vast majority of the data is irrelevant). The modules of the present invention enable for the first time the design of simple, custom arrays that provide optimized data using a minimum number of probes while maximizing the signal-to-noise ratio. By reducing the total number of genes for analysis, it is possible, for example, to eliminate the need to prepare thousands of expensive platinum masks for use in photolithography in the preparation of pan-genome chips, which produce large amounts of irrelevant data.
Using the present invention, it is possible to avoid the need for microarrays completely if the limited probe set (or sets) of the present invention is used with the following methods or any other methods, devices and systems (readily commercially available) for determining and/or analyzing gene expression: for example, digital optical chemistry arrays, bead arrays, beads (e.g., Luminex), multiplex PCR, quantitative PCR, run-on assays, northern blot analysis, and even, for protein analysis, western blot analysis, 2D and 3D gel protein expression, MALDI-TOF, fluorescence activated cell sorting (FACS) (cell surface or intracellular), enzyme-linked immunosorbent assay (ELISA), chemiluminescence studies, enzyme assays, and proliferation studies.
The "molecular fingerprinting system" of the present invention may be used to facilitate and perform comparative analysis of expression in different cells or tissues, in different subpopulations of the same cells or tissues, in different physiological states of the same cells or tissues, in different developmental stages of the same cells or tissues, or in different populations of cells of the same tissues, against other disease and/or normal cellular controls. In some cases, normal or wild-type Expression data may be from samples analyzed at or near the same time, or it may be Expression data obtained from or selected from existing Gene array Expression databases, e.g., public databases such as the NCBI Gene Expression Omnibus database.
As used herein, the term "differentially expressed" refers to measurements of cellular components (e.g., nucleic acids, proteins, enzyme activity, etc.) that vary between two or more samples (e.g., a disease sample and a normal sample). The cellular component may be on or off (present or absent), up-regulated as compared to a control, or down-regulated as compared to a control. For applications using gene chips or gene arrays, differential expression of nucleic acids, such as mRNA or other RNAs (miRNA, siRNA, hnRNA, rRNA, tRNA, etc.), may be used to differentiate between cell types or nucleic acids. Most commonly, measurement of the transcriptional state of a cell is accomplished by quantitative reverse transcriptase (RT) and/or quantitative reverse transcriptase-polymerase chain reaction (RT-PCR), genomic expression analysis, post-translational analysis, modification of genomic DNA, translocation, in situ hybridization, and the like.
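The categories named in this definition (on, off, up-regulated, down-regulated) can be illustrated with the following Python sketch, given by way of example only. The two-fold threshold, the detection limit, the function name `classify` and the expression values are assumptions made for this illustration; no statistical test is applied, whereas a real analysis would normally add one.

```python
from statistics import mean


def classify(disease, control, fold=2.0, detection=0.0):
    """Label one transcript by comparing mean signal in disease versus control."""
    d, c = mean(disease), mean(control)
    if d <= detection and c <= detection:
        return "absent"          # not detected in either group
    if c <= detection:
        return "on"              # detected only in disease samples
    if d <= detection:
        return "off"             # detected only in control samples
    ratio = d / c
    if ratio >= fold:
        return "up"
    if ratio <= 1.0 / fold:
        return "down"
    return "unchanged"


# Hypothetical signal values: (disease samples, control samples) per transcript.
data = {
    "g1": ([8, 10], [2, 4]),
    "g2": ([1, 1], [4, 6]),
    "g3": ([5, 5], [4, 6]),
    "g4": ([3, 5], [0, 0]),
}
calls = {gene: classify(d, c) for gene, (d, c) in data.items()}
```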
For some disease states, it is possible to identify cellular or morphological differences, particularly at the early levels of the disease state. The present invention avoids the need to identify specific mutations or one or more specific genes by examining modules of genes of the cells themselves or, more importantly, the cellular RNA expression of genes from immune effector cells acting within their normal physiological context, i.e., during immune activation, immune tolerance or even immune anergy. While a genetic mutation may result in a significant change in the expression levels of a set of genes, biological systems often compensate for changes by altering the expression of other genes. As a result of these intrinsic compensatory responses, many perturbations may have little effect on the observable phenotype of the system, yet have profound effects on the composition of cellular components. Similarly, the actual copy number of a gene transcript may not increase or decrease; however, the persistence or half-life of the transcript may be affected, which results in a substantial increase in protein production. In one embodiment, the present invention eliminates the need to detect the actual message by observing effector cells (e.g., leukocytes, lymphocytes, and/or subpopulations thereof) rather than individual messages and/or mutations.
The skilled artisan will readily recognize that samples may be obtained from a variety of sources, including, for example, individual cells, collections of cells, tissues, cell cultures, and the like. In certain cases, it may even be possible to isolate sufficient RNA from cells found in, for example, urine, blood, saliva, tissue or biopsy samples, and the like. In certain instances, sufficient cells and/or RNA may be obtained from mucosal secretions, feces, tears, plasma, ascites fluid, interstitial fluid, intradural fluid, cerebrospinal fluid, sweat, or other bodily fluids. The source of the nucleic acid, such as a tissue or cell source, may include a tissue biopsy sample, one or more sorted cell populations, a cell culture, a cell clone, a transformed cell, a biopsy sample, or an individual cell. The source of the tissue may include, for example, brain, liver, heart, kidney, lung, spleen, retina, bone, lymph node, endocrine gland, reproductive organ, blood, nerve, vascular tissue, and olfactory epithelium.
The present invention includes the following basic components, which may be used alone or in combination, i.e., one or more data mining algorithms; one or more module level analysis processes; characterization of the blood leukocyte transcription module; use of the aggregated module data in multivariate analysis for molecular diagnosis/prediction of human disease; and/or visualization of data and results at the module level. Using the present invention, it is also possible to develop and analyze combined transcriptional markers, which may be further aggregated in a single multivariate score.
The explosion in data acquisition rates has stimulated the development of mining tools and algorithms for exploiting microarray data and biomedical knowledge. Methods aimed at uncovering the modular organization and function of transcription systems constitute promising approaches for identifying robust molecular signatures of disease. Indeed, such analyses can transform the understanding of large-scale transcription studies by conceptualizing microarray data at a level beyond individual genes or lists of genes.
The present inventors have recognized that current microarray-based research faces significant challenges: the data are notoriously "noisy", i.e., difficult to interpret, and do not compare well between laboratories and platforms. A widely accepted approach to the analysis of microarray data begins with identifying subsets of genes that are differentially expressed between study groups. The user then attempts to "make sense of" the resulting gene lists using pattern recognition algorithms and existing scientific knowledge.
Rather than contend with the great variability between platforms, the present inventors developed a strategy that emphasizes the selection of biologically relevant genes early in the analysis. Briefly, the method involves the identification of the transcriptional components that characterize a given biological system, for which improved data mining algorithms were developed to analyze and extract sets of co-expressed genes, or transcriptional modules, from large datasets.
Pulmonary tuberculosis (PTB), caused by Mycobacterium tuberculosis (M. tuberculosis), is a leading and growing cause of morbidity and mortality worldwide. However, most individuals infected with Mycobacterium tuberculosis remain asymptomatic, keeping the infection in a latent form, and this latent state is thought to be maintained by an active immune response. Blood is the pipeline of the immune system and is therefore an ideal biological material from which the health and immune status of an individual can be established. Thus, using microarray technology to assess whole-genome activity in blood cells, we identified distinct and reciprocal blood transcriptional biomarker signatures in patients with active and latent tuberculosis. These signatures are also distinct from those of control individuals. The signature of latent tuberculosis, which shows over-representation of immune cytotoxic gene expression in whole blood, may help to determine protective immune factors against Mycobacterium tuberculosis infection, since these patients are infected but most do not progress to overt disease. The distinct transcriptional biomarker signatures of active and latent TB patients can also be used to diagnose infection and to monitor the response to treatment with anti-mycobacterial drugs. In addition, the signature in active tuberculosis patients can help to identify factors involved in immunopathogenesis and may lead to strategies for immunotherapeutic intervention. The present invention relates to a prior application claiming the use of blood transcriptional biomarkers in the diagnosis of infection. However, that prior application does not disclose the existence of biomarkers for active and latent tuberculosis, and focuses instead on children with other acute infections (Ramilo, Blood, 2007).
The transcriptional signatures identified herein in blood from latent and active TB patients can be used to test patients suspected of having a Mycobacterium tuberculosis infection, as well as for health screening and early detection of the disease. The invention also allows the response to treatment with anti-mycobacterial drugs to be assessed. In this context, the test may be particularly valuable in drug trials, especially in evaluating drug treatment of multidrug-resistant patients. Furthermore, the invention can be used to obtain immediate, intermediate and long-term data from the immune signature of latent tuberculosis, thereby better defining the protective immune response in vaccination trials. Likewise, the signature in active tuberculosis patients can help to identify factors involved in immunopathogenesis and may lead to strategies for immunotherapeutic intervention.
The immune response to Mycobacterium tuberculosis is complex and multifactorial. Although T cells and cytokines such as TNF, IFN-γ and IL-12 are known to be important for the immunological control of Mycobacterium tuberculosis [14-17], understanding of the host factors that determine protection or pathogenesis remains incomplete [16]. Blood transcript profiling has been successfully applied to inflammatory diseases to improve diagnosis and the understanding of disease pathogenesis [18,19]. However, the size and complexity of the data generated make interpretation difficult, often forcing scientists to focus on a small number of candidate genes for further study [20], which is insufficient for specific diagnostic biomarkers and provides little information on the pathogenesis of the disease. By using independent and complementary bioinformatics techniques, we defined a transcriptional signature of active TB patients that guided further immunological analysis. Our comprehensive, unbiased survey provides important insight into the immunopathogenesis of this complex disease, and a greater understanding of the disease will help progress in TB control.
Unique whole blood transcriptional signature of active tuberculosis
To obtain an unbiased overall measure of the host response to Mycobacterium tuberculosis infection, whole genome transcript profiles of blood from active TB patients, latent TB patients and healthy controls were generated using Illumina HT12 bead arrays. All patients were sampled prior to treatment. The diagnosis of active TB was confirmed by positive culture of Mycobacterium tuberculosis. Latent TB patients were asymptomatic household contacts of active TB patients, or new entrants from endemic countries, defined by a positive tuberculin skin test (TST) (London) and a positive IGRA (London and South Africa). Healthy controls were recruited in London and were negative for all of the above criteria. Three cohorts were recruited and sampled independently: a training set (recruited in London, January to September 2007; 13 active pulmonary TB patients, 17 latent TB patients and 12 healthy controls); a test set (recruited in London, October 2007 to February 2009; 21 active TB patients, 21 latent TB patients and 12 healthy controls); and a validation set (recruited in the Khayelitsha township near Cape Town, South Africa (SA), May 2008 to February 2009; 20 active TB patients and 31 latent TB patients) (fig. 16 and 17; fig. 7). Likewise, all processing and analysis of samples from these three cohorts was performed independently. The training set was used for knowledge discovery and to assess the adequacy of the sample size. RNA was extracted from whole blood samples and processed as described in the methods section. The resulting data were filtered to remove transcripts that were not detected (α = 0.01) and to retain only transcripts that deviated at least two-fold from the median of all samples in 10% or more of the samples in the data set. This unsupervised filtering produced a list of 1836 transcripts, which revealed a distinct signature in the active TB group (fig. 8a).
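The filtering step described above can be illustrated with a minimal Python sketch. It is not the inventors' implementation: the function name, the data layout (dictionaries keyed by transcript ID) and the rule that a transcript counts as detected when at least one sample has a detection p-value below α are assumptions made for illustration only.

```python
from statistics import median


def filter_transcripts(expr, detect_p, alpha=0.01, fold=2.0,
                       min_fraction=0.10, min_detected=1):
    """Return IDs of transcripts that pass the detection and variability filters.

    expr:     {transcript_id: [positive intensity per sample]} (linear scale)
    detect_p: {transcript_id: [detection p-value per sample]}
    min_detected is an assumed parameter: the number of samples in which a
    transcript must be detected (p < alpha) to be retained.
    """
    kept = []
    for tid, values in expr.items():
        # Detection filter: drop transcripts not detected in enough samples.
        n_detected = sum(p < alpha for p in detect_p[tid])
        if n_detected < min_detected:
            continue
        # Normalize each value to the median of all samples for this transcript.
        med = median(values)
        n_variable = sum(
            (v / med >= fold) or (v / med <= 1.0 / fold) for v in values
        )
        # Keep transcripts that deviate at least `fold` in >= 10% of samples.
        if n_variable >= min_fraction * len(values):
            kept.append(tid)
    return kept


# Toy data for 10 samples; transcript names are hypothetical.
expr = {
    "A": [100.0] * 9 + [400.0],   # one sample deviates four-fold
    "B": [100.0] * 10,            # flat expression
    "C": [50.0] * 5 + [500.0] * 5,  # variable but never detected
}
detect_p = {"A": [0.001] * 10, "B": [0.001] * 10, "C": [0.5] * 10}
print(filter_transcripts(expr, detect_p))
```

In this toy example only transcript "A" is both detected and variable enough to be retained.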
This list of 1836 transcripts was then used to identify signature genes that were significantly differentially expressed between groups (Kruskal-Wallis ANOVA, with Benjamini-Hochberg multiple testing correction at a false discovery rate of 0.01). This produced a list of 393 transcripts, which were hierarchically clustered by Pearson correlation, with average linkage as the measure of distance between two clusters, producing a gene tree that groups transcripts of similar relative abundance. The gene tree is shown as a dendrogram on the left side of the heat map, in which the data for each individual are organized into a transcriptional profile and displayed according to clinical diagnostic group (fig. 1a). This revealed a distinct signature of active TB that was absent in most samples from latent TB patients or healthy controls.
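The multiple testing correction named above can be sketched as follows. This is a generic illustration of the Benjamini-Hochberg step-up procedure in Python, assuming the per-transcript p-values (for example from a Kruskal-Wallis test) have already been computed; the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.01):
    """Return a list of booleans, True where a test is significant at the FDR."""
    m = len(p_values)
    # Indices of the tests ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # All tests ranked at or below the cutoff are declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


# Hypothetical p-values for five transcripts.
p = [0.0001, 0.004, 0.03, 0.002, 0.5]
print(benjamini_hochberg(p, fdr=0.01))
```

With these five p-values, the first, second and fourth transcripts pass at a false discovery rate of 0.01.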
Having identified a putative transcriptional signature for active TB, it was important to confirm these findings in an independent cohort of patients. Microarray analysis is susceptible to methodological, technical and statistical variation [21-23]. In addition, TB is likely to show a broad range of immune responses to Mycobacterium tuberculosis infection, which are highly likely to be influenced by ethnicity, geographical area, co-infection, age and socioeconomic status [11,13]. Thus, to ensure that our findings are widely applicable, we validated them in two additional independent cohorts that were recruited at a later time. Samples from these two independent cohorts, the test set (London) and the validation set (South Africa), were processed and the data normalized in the same way as for the training set. Since the purpose of these additional cohorts was to validate independently the signature defined in the training set, no filtering or selection of transcripts was performed. Instead, the pre-selected list of 393 transcripts and the gene tree defined by analysis of the training set data were applied to the data from the independent test set and validation set (SA). The 393-transcript profiles of the test set and validation set (SA) were subjected to a hierarchical clustering algorithm, using Spearman correlation with average linkage as the measure of distance between clusters, to group the individual gene expression profiles according to their similarity and generate a "condition tree", which is displayed along the upper edge of the heat map (fig. 1b and 1c). Unsupervised hierarchical clustering of the transcript profiles of the test set and validation set (SA) patients clearly showed that active TB patients clustered separately from latent TB patients and healthy controls (fig. 1b, London), or separately from latent TB patients (fig. 1c, South Africa), with a significant association between cluster and study group (Pearson chi-square test, p < 0.0005) (fig. 1b and 1c), but no significant association with ethnicity, age or gender (fig. 8b, 8c and 8d). However, the transcriptional profiles of a few latent TB patients (approximately 10%; 2/21 in the test set, London; 3/31 in the validation set (SA)) clustered with the transcript profiles of active TB patients (marked by individual symbols in fig. 1b for the test set and in fig. 1c for the validation set in South Africa).
Without knowledge of the clinical diagnosis, we then used a class prediction tool based on the K-nearest neighbor method to test the ability of the 393-transcript list to correctly classify the test and validation set samples as active TB or not active TB (healthy or latent). In the test set, the prediction model yielded 44 correct predictions and 9 incorrect predictions, and gave no prediction for 1 sample. This equates to a sensitivity of 61.67%, a specificity of 93.75% and an indeterminate rate of 1.9%. The incorrect predictions in the test set comprised 5 latent TB patients classified as active TB, who were identified in the cluster analysis above, and 4 active TB patients predicted not to have active TB. In the validation set in South Africa, there were 45 correct predictions, 2 incorrect predictions (1 active, 1 latent) and 4 samples with no prediction. This gave a sensitivity of 94.12% and a specificity of 96.67%, with an indeterminate rate of 7.8% (fig. 19).
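The validation set figures can be reproduced with a short Python calculation. The per-class counts used here (16 of 17 classified active TB patients and 29 of 30 classified latent TB patients predicted correctly) are not stated in the text; they are inferred from the reported totals (45 correct, 1 active and 1 latent incorrect, 4 with no prediction among 51 samples) and percentages, and should be read as an assumption.

```python
def classifier_metrics(tp, fn, tn, fp, indeterminate=0):
    """Sensitivity, specificity and indeterminate rate, all in percent.

    Sensitivity and specificity are computed over samples that received a
    prediction; the indeterminate rate is computed over all samples.
    """
    called = tp + fn + tn + fp
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "indeterminate_rate": 100.0 * indeterminate / (called + indeterminate),
    }


# Validation set (South Africa); per-class counts are inferred, see above.
metrics = classifier_metrics(tp=16, fn=1, tn=29, fp=1, indeterminate=4)
print({name: round(value, 2) for name, value in metrics.items()})
```

Under these assumed counts the sketch returns 94.12% sensitivity and 96.67% specificity, with 4 of 51 samples (7.8%) left unclassified.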
TABLE 2. 393 Gene List
A transcriptional signature was identified in blood from active TB patients from intermediate (London) and high (South Africa) burden regions that, as shown by hierarchical clustering and blinded class prediction, is distinct from the signatures of latent TB patients and healthy controls. The signature of latent TB shows molecular heterogeneity. A number of latent patients in both independent cohorts showed a transcriptional signature similar to that of active TB, consistent with the proportion of patients in this group expected to progress to active disease [10]. These latent TB outliers would then represent either patients with subclinical active disease, or patients with a higher burden of latent infection and therefore a higher risk of progression to active disease [11,24].
The transcriptional signature of active TB correlates with the extent of the radiographic disease.
From our results (fig. 1a to 1c), it is clear that there is molecular heterogeneity in the transcriptional signatures of active TB patients. Although most patients showed the same 393-gene expression profile, there were clearly a few outliers that showed a weaker, or no clear, transcriptional signature. For example, among the 21 patients in the active TB group of the test set, the profiles of 4 did not cluster with those of the other active TB patients and were closer to the profiles of healthy controls or latent TB patients (labeled ●, #, ■, ◆ in fig. 1b). These were the 4 active patients misclassified by the K-nearest neighbor algorithm, as discussed above.
Molecular outliers in the active TB group could have a number of causes. First, there may have been misdiagnosis, with false positive cultures resulting from laboratory cross-contamination, as previously reported [25]. Alternatively, the molecular/transcriptional heterogeneity may reflect heterogeneity in the extent of disease. To determine this, the chest radiograph taken at the time of diagnosis was obtained for each patient in the training set and test set and graded by two chest physicians and one radiologist to assess the radiographic extent of disease. This assessment was performed without knowledge of the clinical diagnosis or transcriptional profile, using a modified version of the US National Tuberculosis and Respiratory Disease Association scheme, which classifies radiographic disease as no disease, minimal disease, moderately advanced disease or far advanced disease (Falk A, 1969; and fig. 9a). The 393-transcript profiles of all 13 active TB patients in the training set and all 21 active TB patients in the test set were ordered in a heat map according to the grade of the radiographic extent of their disease (training set, fig. 9b; test set, fig. 9c). An example of the comparison between transcript profile and radiographic grade is shown in fig. 2a, which suggests that the transcript profile correlates with the extent of disease. To determine this formally, we calculated for each TB patient a quantitative score of the molecular perturbation reflected by the transcriptional signature, the "molecular distance from health". This score is a composite of both the number of transcripts in the profile that differ significantly from the baseline of healthy controls and the magnitude of that difference [26]. The score was calculated from the 393-transcript profile of each TB patient and then compared with the radiographic score of each latent (n = 38) and active (n = 30) TB patient in the training and test sets.
For this purpose, the scheme for assessing the radiographic extent of disease was adapted so that each radiographic grade was converted into a numerical radiographic score. Profiles grouped according to the radiographic extent of disease showed that the mean "molecular distance from health" increased with increasing radiographic extent of disease (Kruskal-Wallis ANOVA, p < 0.001, with Dunn's multiple comparison post test to compare between groups) (fig. 2b). These results show for the first time that a molecular signature in blood can provide a quantitative measure of the extent of disease in active TB patients, and confirm that blood transcript profiles can reflect changes at the site of disease. Thus, using a systems biology approach, we identified a robust blood transcriptional signature of active pulmonary TB in both intermediate and high burden settings, which correlates with the radiographic extent of disease. This method can be used to monitor the extent of disease and may help to guide the treatment regimen.
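A score of the kind described, combining how many transcripts differ from the healthy baseline with how far they differ, can be sketched in Python as follows. The exact formula is given in the cited reference and is not reproduced in this text; the rule used here (sum of absolute z-scores for transcripts lying more than two standard deviations from the healthy control mean) is an assumption for illustration, as are the function and transcript names.

```python
from statistics import mean, stdev


def molecular_distance_from_health(sample, controls, z_cutoff=2.0):
    """Sum of |z| over transcripts that differ markedly from healthy controls.

    sample:   {transcript_id: expression value} for one patient
    controls: list of {transcript_id: expression value}, one per healthy control
    z_cutoff: assumed threshold (in control standard deviations) for a
              transcript to count as different from the healthy baseline
    """
    distance = 0.0
    for tid, value in sample.items():
        baseline = [control[tid] for control in controls]
        mu, sd = mean(baseline), stdev(baseline)
        if sd == 0:
            continue  # no variation in controls, z-score undefined
        z = (value - mu) / sd
        if abs(z) > z_cutoff:
            # Both the count and the magnitude of deviations raise the score.
            distance += abs(z)
    return distance


# Hypothetical two-transcript example with three healthy controls.
controls = [{"g1": 9.0, "g2": 4.0}, {"g1": 10.0, "g2": 5.0}, {"g1": 11.0, "g2": 6.0}]
patient = {"g1": 15.0, "g2": 5.5}
print(molecular_distance_from_health(patient, controls))
```

A profile identical to the control mean scores zero, and the score grows as more transcripts deviate and as each deviation becomes larger.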
Successful treatment resulted in a reduction in the transcriptional signature of active TB.
Having established that the transcriptional signature of active TB correlates with the radiographic extent of disease, it was of interest to determine whether the transcriptional signature diminishes during TB treatment and whether it reflects the efficacy of the treatment. This would also confirm that the signature truly reflects TB disease. To test this, 7 patients with active TB were resampled at 2 and 12 months after starting anti-mycobacterial therapy, and their blood was subjected to microarray analysis as described above, together with their baseline pre-treatment samples and healthy control samples (n = 12) from the independent test set. Again, a 393-transcript signature was observed in active TB patients that was distinct from that of healthy controls (fig. 3a). This transcriptional signature was diminished in most active TB patients after 2 months of treatment and had completely disappeared after 12 months of treatment, so that the signature of treated active TB patients came to resemble that of healthy controls. The change in the transcript profile after 2 months of treatment was more pronounced for transcripts of increased abundance, which were reduced in approximately 50% of TB patients. This is in contrast to transcripts of reduced abundance, whose reduction was still present after 2 months of treatment but which had returned to baseline expression after 12 months of treatment. The disappearance of the blood transcriptional signature during treatment of active TB patients appeared to mirror the radiographic improvement (fig. 3b). We then analyzed the difference in the "molecular distance from health" score between the various time points during treatment. At 12 months after starting treatment, the "molecular distance from health" score of active TB patients was significantly lower than at baseline before treatment (p < 0.001, Friedman repeated measures test) (fig. 3c and 3d).
These data indicate that the transcriptional signature in the blood of active TB patients can be used to monitor the efficacy of treatment. They also provide evidence that the 393-transcript signature is a true reflection of the host response to Mycobacterium tuberculosis infection. Thus, the transcriptional signature of active TB diminishes during successful treatment, providing a method for quantitatively monitoring the response to anti-mycobacterial therapy, including in clinical trials of new therapeutic agents.
TB patients in South Africa and London show the same modular signature.
To facilitate the analysis of the transcriptional signature in aggregate and to characterize the host response during active TB disease, we employed a modular data mining strategy [18]. This strategy is based on the observation that clusters of genes are coordinately expressed across a range of different inflammatory and infectious diseases. These discrete clusters of genes can be defined as modules, which can often be shown by unbiased literature profiling to have a coherent functional relationship [18]. Modular analysis facilitates the assessment and identification of functionally relevant changes in transcript abundance in the blood of active TB patients compared to healthy controls (performed on the complete microarray dataset, filtering out only transcripts not detected in at least two individuals (α = 0.01)) (fig. 4a). The modular signatures observed in the blood of active TB patients compared to healthy controls appeared very similar in the training and test sets in London and in the independent validation set in South Africa (fig. 4a), confirming by an independent and unbiased analysis the reproducibility of the transcriptional signature observed using classical cluster analysis (fig. 1). The modular signature of active TB patients reflects a decrease in the abundance of both B cell (module M1.3) and T cell (module M2.8) associated transcripts, an increase in the abundance of myeloid-related transcripts (modules M1.5 and M2.6), and a lesser increase in neutrophil-related transcripts (module M2.2). The largest proportion of transcripts changed in the blood of active TB patients compared to controls was within the interferon (IFN)-inducible module (module M3.1; 75-82% of transcripts) (fig. 4a; and fig. 10a-10c).
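The module-level summary (for example, 75-82% of the transcripts in the IFN-inducible module being changed) can be sketched as a simple per-module tally. The Python below is a minimal illustration, assuming each transcript has already been called as over-expressed (+1), under-expressed (-1) or unchanged (0) relative to controls; the module memberships and transcript IDs are hypothetical.

```python
def module_scores(modules, calls):
    """Percentage of transcripts over- and under-expressed in each module.

    modules: {module_name: [transcript ids]}
    calls:   {transcript_id: +1 (over-expressed), -1 (under-expressed), 0}
    Returns {module_name: (percent_up, percent_down)}.
    """
    scores = {}
    for name, transcript_ids in modules.items():
        n = len(transcript_ids)
        # Transcripts without a call are treated as unchanged.
        up = sum(calls.get(t, 0) > 0 for t in transcript_ids)
        down = sum(calls.get(t, 0) < 0 for t in transcript_ids)
        scores[name] = (100.0 * up / n, 100.0 * down / n)
    return scores


# Hypothetical modules and calls for illustration only.
modules = {"M3.1": ["a", "b", "c", "d"], "M1.3": ["e", "f"]}
calls = {"a": 1, "b": 1, "c": 1, "d": 0, "e": -1, "f": 0}
print(module_scores(modules, calls))
```

Reporting the up and down percentages separately keeps the direction of change visible, which is what distinguishes, for instance, the decreased B cell module from the increased IFN-inducible module.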
Blood is a heterogeneous tissue, so the transcriptional signature we defined in active TB patients could represent changes in cellular composition through migration, apoptosis or cell proliferation, or changes in gene expression within discrete cell populations. There was no significant difference in total leukocyte (white blood cell) counts in the blood of active TB patients compared to healthy controls (Student's t-test, p = 0.085). To determine whether the significant reduction in B and T cell transcripts revealed by the modular analysis (fig. 4a) was caused by changes in cell numbers in the blood and/or by changes in gene expression within discrete cell populations, whole blood from test set active TB patients and healthy controls was analyzed by multiparameter flow cytometry (fig. 4b, fig. 11a and 11b). Compared to healthy controls, the percentage and number of CD4+ T cells, and the percentage of CD8+ T cells and B cells, were significantly decreased in active TB patients (fig. 4b). The reduction in CD4+ T cell numbers was largely due to a reduction in the number of central memory cells, with a smaller, non-significant effect on effector memory and naive CD4+ T cells (fig. 11b). The decrease in CD8+ T cell numbers, however, was mainly observed in the naive T cell compartment. To confirm that the decrease in abundance of T cell-associated transcripts was caused by a decrease in cell number and not by decreased expression of these genes, we evaluated the expression of a number of representative T cell-associated genes in purified CD4+ and CD8+ T cells and compared it with that in whole blood (fig. 11c). In whole blood, these T cell transcripts were less abundant in active TB patients than in healthy controls (fig. 11c(i)). However, there was no difference in the expression of these T cell-specific genes between CD4+ and CD8+ T cells purified from the blood of active TB patients and those from healthy controls (fig. 11c(ii)).
Taken together, these data indicate that the lower abundance of T cell gene transcripts in the blood of active TB patients is solely due to a reduction in cell number. In agreement with our findings, several studies have reported a decrease in the percentage and/or number of CD4+ T cells in the blood of active TB patients, although the effects on CD8+ T cells and B cells are more variable [27,28]. However, the extent of the difference between TB patients and controls in our study suggests that this phenomenon goes beyond the migration of Mycobacterium tuberculosis antigen-specific T cells alone and affects a large proportion of the entire circulating T cell population.
In active TB patients compared to healthy controls, a substantial increase in myeloid cell-associated transcripts was observed at the module level (modules M1.5 and M2.6). To determine whether this was caused by changes in cell number and/or changes in gene expression, whole blood was first analyzed by flow cytometry for changes in myeloid cell populations (fig. 12a). There was no change in the percentage or number of monocytes (CD14+, CD16-) or neutrophils (CD16+, CD14-) in the blood of test set active TB patients compared to healthy controls (fig. 4c). Interestingly, a small but significant increase in the percentage and number of inflammatory monocytes (CD14+, CD16+) was observed in the blood of active TB patients compared to healthy controls. Representative myeloid cell-related transcripts were shown to be over-abundant in the blood of active TB patients compared to healthy controls (fig. 12b(i)). This increase was less pronounced in purified monocytes (CD14+) (fig. 12b(ii)), although an increase in expression confined to a small monocyte population, such as the CD14+, CD16+ inflammatory subset, would be diluted in the total monocyte preparation. Inflammatory monocytes have previously been shown to be elevated in inflammatory and infectious diseases [29]. Thus, the changes in the myeloid modules are explained to some extent by changes in gene expression, but may also be caused by changes in the number of inflammatory monocytes in the blood of active TB patients relative to controls.
Interferon-inducible gene expression in neutrophils dominates the TB signature
To confirm the over-expression of IFN-inducible genes in active TB patients shown by the modular analysis (fig. 4a), the transcripts making up the 393-transcript signature were analyzed using Ingenuity Pathways Analysis software. Fisher's exact test with Benjamini-Hochberg multiple testing correction (p < 0.0000001) showed that IFN signaling was the most over-represented functional pathway among the 393 transcripts, compared with the other manually curated biological pathways available from the literature (fig. 13). Interestingly, genes downstream of both IFN-γ and type I IFN-α/β receptor signaling were significantly over-represented in active TB patients (labeled red in figure 4). Notably, although neither IFN-α2a nor IFN-γ was detectable in the serum of active TB patients (fig. 13b and 13c), elevated levels of the IFN-inducible chemokine CXCL10 were detected in the blood of active TB patients relative to controls (fig. 4e).
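Pathway over-representation of this kind is commonly assessed with a one-sided Fisher's exact test, which for a gene list reduces to a hypergeometric tail probability. The Python sketch below illustrates that calculation only; it is not the Ingenuity software's implementation, the choice of the one-sided form is an assumption, and the function name and example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """One-sided Fisher's exact (hypergeometric) p-value for over-representation.

    N: number of genes in the background set
    K: number of background genes annotated to the pathway
    n: number of genes in the signature list
    k: number of signature genes annotated to the pathway
    Returns P(X >= k) when n genes are drawn at random from the background.
    """
    total = comb(N, n)
    upper = min(n, K)
    # math.comb returns 0 when more items are chosen than are available,
    # so impossible configurations contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts: all 3 listed genes fall in a 3-gene pathway,
# drawn from a background of 10 genes.
print(enrichment_p_value(k=3, n=3, K=3, N=10))
```

When many pathways are tested, the resulting p-values would then be adjusted for multiple testing, for example with the Benjamini-Hochberg procedure named in the text.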
Although IFN-γ has been shown to be protective during the immune response to intracellular pathogens, including mycobacteria [14-16,30], the role of type I IFNs is less clear. Signaling through the type I IFN receptor (IFN-αβR) is crucial for defense against viral infections [31], whereas IFN-αβ appears to be detrimental in intracellular bacterial infections [32-34]. The role of IFN-αβ in TB infection, however, is unclear; a number of papers have shown detrimental effects [35-37], while others have not [38,39]. There are a few case reports of a synergistic effect between IFN-γ treatment of hepatitis C virus infection and Mycobacterium tuberculosis infection [40,41].
By significance analysis comparing patients with other bacterial and inflammatory diseases [52], the inventors identified a TB-specific 86-gene whole blood signature. The 86-gene signature was then tested by class prediction (k-nearest neighbors) on patients from seven independent datasets, each normalized to its own controls (fig. 4f). The sensitivity in the TB training and validation sets was 92% and 90%, respectively, for distinguishing active TB from other diseases, with a cumulative specificity of 83%. Like the 393-gene signature, the 86-gene signature diminished in response to treatment (fig. 4g) and reflected the same heterogeneity in the same patient samples.
To identify functional components of the transcriptional host response during active TB, the present inventors used a modular data mining strategy, which takes sets of genes that are co-expressed across different diseases and defines them as specific modules, which can often be shown by unbiased literature profiling to have a coherent functional relationship [18]. The blood modular signatures of patients with active TB compared to healthy controls (filtering out only transcripts not detected in at least two individuals, α = 0.01) were similar in all three TB datasets (fig. 4h), confirming the reproducibility of the transcriptional signature.
This modular TB signature reflects a decrease in the abundance of B cell (module M1.3) and T cell (M2.8) transcripts, and an increase in the abundance of myeloid-related transcripts (M1.5 and M2.6). The largest proportion of transcripts changed within any given module in TB was in the IFN-inducible module (M3.1; 75-82% of IFN module transcripts) (fig. 4h). Since a type I IFN-inducible signature in peripheral blood mononuclear cells from patients with SLE has been linked to the pathogenesis of that disease [53,54], the inventors compared whole blood modular signatures from patients with other diseases. Patients with SLE showed over-expression of the IFN-inducible signature (M3.1 (fig. 4h)), together with a plasma cell-associated module (M1.1 (fig. 4h)) that was absent in TB. Blood modular signatures from patients with group A streptococcal or staphylococcal infection, or with Still's disease, showed little or no change in the IFN-inducible module (M3.1) but over-expression of the neutrophil-associated module (M2.2), distinguishing these diseases from TB (fig. 4h). Thus, the IFN-inducible signature is not common to all inflammatory responses, but is preferentially induced in some diseases, potentially reflecting protection or pathogenesis. Although SLE and TB share common inflammatory components, such as the IFN-inducible response, the overall pattern of transcriptional changes (fig. 4h) and their amplitude distinguish one disease from the other.
To determine whether the high abundance of IFN-inducible gene transcripts in the blood of active TB patients could be attributed to a particular cell type, we evaluated the expression of genes in the IFN-γ and type I IFN-α/β receptor signaling pathways in purified neutrophils, monocytes, and CD4+ and CD8+ T cells, and compared it with that in whole blood (fig. 5). A representative set of IFN-inducible transcripts was more abundant in whole blood of active TB patients than in that of healthy controls (fig. 5a). Strikingly, compared with equivalent cells from healthy controls, the IFN-inducible transcripts were substantially over-expressed in neutrophils purified from the blood of active TB patients, and to a lesser extent in monocytes (fig. 5b). In contrast, CD4+ and CD8+ T cells purified from the blood of active TB patients showed no difference in IFN-inducible gene expression compared with CD4+ and CD8+ T cells purified from healthy control individuals (fig. 5b).
Neutrophils are professional phagocytes that have been shown to be the predominant cell type infected by rapidly replicating Mycobacterium tuberculosis in TB patients [42]. The prevalence of neutrophil responses in genetically susceptible mice, compared with resistant mice, has given rise to the theory that neutrophils contribute to pathogenesis in TB inflammation rather than protecting the host [43]. Our studies support a role for neutrophils in the pathogenesis of TB. This may be caused by their excessive activation by IFN-γ and type I IFN, the response to which we now show dominates the transcriptional signature in the blood of active TB patients and is expressed predominantly in neutrophils (fig. 5).
PDL-1 is overexpressed by neutrophils from active TB patients.
One gene of increased abundance in the blood of active TB patients that clustered with the IFN-inducible transcripts is programmed death ligand 1 (PDL-1, also known as CD274 and B7-H1), an immunoregulatory ligand expressed on a variety of cells (fig. 6). PDL-1 has been reported to inhibit T cell proliferation and effector function in chronic viral infections by binding to the programmed death-1 receptor (PD-1) [44,45]. To determine which cells over-express PDL-1, whole blood cell populations from TB patients and healthy controls were analyzed by flow cytometry; PDL-1 appeared to be up-regulated on total leukocytes in active TB patients compared with controls (or with latent TB patients in the validation (SA) set) (fig. 6a and fig. 14). Expression of PDL-1 was most pronounced on neutrophils from active TB patients, present to a lesser extent on their monocytes, and absent on their lymphocytes (fig. 6b and fig. 14). In keeping with these flow cytometry findings, purified neutrophils from active TB patients expressed higher levels of PDL-1 than neutrophils from healthy controls. In contrast, PDL-1 was expressed in monocytes from 2 of 7 active TB patients, and there was no detectable expression in their T cells (fig. 6c). The increased abundance of PDL-1 transcripts in the blood of active TB patients disappeared after successful therapy, although in most patients it was still present after 2 months of treatment (fig. 6).
These findings indicate that PDL-1 in the blood of active TB patients may contribute to pathogenesis and to failure to control disease, consistent with reports in chronic viral infections [44,45]. In addition, it has been reported that expression of PD-1 is elevated on human T cells from TB patients, that PD-1 expression is induced by sonicated Mycobacterium tuberculosis H37Rv, and that blocking antibodies against PDL-1/PD-1 can enhance antigen-specific IFN-γ and cytotoxic CD8+ T cell responses [46]. Of relevance to our findings, PDL-1 expression induced by HIV in monocytes and CCR5+ T cells has been shown to be IFN-α dependent and IFN-γ independent [47]. Thus, increased PDL-1 expression in neutrophils in response to type I interferon, as we suggest herein, may be one way in which over-expression of interferon is detrimental to the host response. Whether blocking PDL-1/PD-1 signaling results in an enhanced protective response may depend on the type and stage of infection/vaccination [48,49], and the blockade may need to be targeted to specific cells and sites to achieve enhanced protection rather than immunopathology. The effect of PDL-1 on the immune response during bacterial infection is therefore more complex than first expected, as supported by our finding that it is highly expressed in neutrophils in the blood of active TB patients, but not in T cells or monocytes.
Further understanding of the host response in TB is critical for improving diagnosis, vaccination and therapy (Young et al, 2008, JCI). Understanding of this complex disease has been hindered for a number of reasons, including the fact that clinically defined latent TB actually represents a spectrum ranging from the absence of viable mycobacteria to subclinical disease (Young et al, 2009, Trends Micro). Here we define a 393-gene transcriptional signature of active TB in the blood of patients from London and South Africa (figures 1, 14 and 15), which is absent in most latent TB patients and healthy controls. Furthermore, by using this approach and analyzing the number of TB patients and healthy controls required to achieve significance, we were able to demonstrate the heterogeneity of the disease. For example, a signature of active TB is observed in the blood of 10% of latent TB patients, which may reflect those individuals who will go on to develop active disease. This is the first molecular evidence of the heterogeneity of TB, suggesting that this molecular approach may be useful in determining which patients with latent TB should receive anti-mycobacterial chemotherapy. Longitudinal studies are nevertheless required to verify that this signature does indeed predict which latent patients will develop TB disease in the future.
The size and complexity of the microarray data generated make interpretation difficult, often forcing scientists to focus on a large number of candidate genes for further study50,51. This is not sufficient to yield specific biomarkers for diagnosis and provides little information on the pathogenesis of the disease. To improve our understanding of the host factors underlying the pathogenesis of TB, we used three distinct, complementary analytical approaches, at the module, pathway and gene levels, allowing an in-depth focus on the biological pathways reflected by the gene signature. All of these approaches identified common biological pathways involved in the host transcriptional response to Mycobacterium tuberculosis, with IFN-inducible genes forming a key part of the immune signature of active pulmonary TB. We first used modular analysis, which is the most unsupervised approach and therefore the least prone to bias. The modules are derived from multiple independent data sets and annotated by literature profiling, which robustly integrates experimental data with knowledge from the accumulated literature18. This modular analysis revealed a significant IFN-inducible signature in active TB disease. This was verified independently using Ingenuity Pathways Analysis, which is derived entirely from the published literature; it confirmed the significance of the IFN-inducible signature and further showed that the signature is composed of both IFN-γ and type I IFN-inducible genes. Since the two approaches analyzed different transcript lists, their identification of common biological processes corroborates the robustness of our findings. As a further level of validation, analysis at the level of individual genes confirmed, and also extended, the findings from the other analytical methods.
Using these approaches and further immunological analysis, we revealed that a key component of the host blood transcriptional response to mycobacteria is a neutrophil-driven IFN-inducible signature that is extinguished by successful treatment. This study improves our understanding of the underlying biology of TB and may provide guidance for diagnosis and treatment in the future.
Blood is the reservoir and migration compartment for cells of the innate and adaptive immune systems, including neutrophils, dendritic cells and monocytes, or B and T cells, respectively, which during infection will have been exposed to infectious agents in the tissue. For this reason, whole blood from infected individuals provides an accessible source of clinically relevant material in which an unbiased molecular phenotype can be obtained using gene expression microarrays, as previously described for the study of cancer in tissue (Alizadeh, AA., 2000; Golub, TR., 1999; Bittner, 2000), of autoimmunity in blood or tissue (Bleharski, JR et al, 2003) (Bennett, 2003; Baechler, EC, 2003; Burczynski, ME, 2005; Chaussabel, D., 2005; Cobb, JP., 2005; Kaizer, EC, 2007; Allantaz, 2005; Allantaz, 2007), of inflammation (Thach, DC., 2005) and of infectious disease (Ramilo, Blood, 2007). Microarray analysis of gene expression in blood leukocytes has identified diagnostic and prognostic gene expression signatures, leading to a better understanding of the mechanisms of disease onset and of the response to treatment (Bennett, L., 2003; Rubins, KH., 2004; Baechler, EC, 2003; Pascual, V., 2005; Allantaz, F., 2007). These microarray approaches have been attempted for the study of active and latent TB, but yielded only a small number of differentially expressed genes in relatively small numbers of patients (Mistry, R., 2007) (Jacobsen, M., Kaufmann, SH., 2006; Mistry, R., Lukey, PT., 2007), which were not robust enough to distinguish TB from other inflammatory and infectious diseases.
Supplementary methods
Recruitment of participants and description of patients. This study was approved by the local NHS Research Ethics Committee (LREC) at St Mary's Hospital, London, UK (REC 06/Q0403/128) and by the University of Cape Town, Republic of South Africa (REC 012/2007). All participants were over 18 years of age and gave written informed consent. Participants were recruited from St Mary's Hospital and Hammersmith Hospital, Imperial College Healthcare NHS Trust, London, UK; Hillingdon Hospital, The Hillingdon Hospitals NHS Trust, Uxbridge, UK; and the Ubuntu TB/HIV clinic, Khayelitsha, Cape Town, South Africa. Patients were voluntarily enrolled and sampled before the start of any anti-mycobacterial treatment, and were included in the final analysis only if they met all clinical criteria of the relevant study group. A subset of the active TB patients enrolled in London into the first cohort were also sampled at 2 and 12 months after the start of therapy. Patients who were pregnant, immunosuppressed or had diabetes or autoimmune disease were not eligible and were excluded from the study. In South Africa, all participants underwent routine HIV testing using the Abbott HIV1/2 rapid antibody detection kit (Abbott Laboratories, Abbott Park, Illinois, USA). Active TB was confirmed by laboratory isolation of mycobacteria on mycobacterial culture of respiratory specimens (sputum or bronchoalveolar lavage fluid), with sensitivity testing performed by the Royal Brompton Hospital Mycobacterium Reference Laboratory, London, UK, or the Reference Laboratory of the National Health Laboratory Service, Groote Schuur Hospital, Cape Town. In the UK, latent TB patients were those referred to the TB clinic with a positive TST together with a positive IGRA result.
Latent TB participants in South Africa were recruited from self-referred individuals attending the testing clinic of the Ubuntu TB/HIV clinic, and IGRA positivity alone was used to confirm the diagnosis, regardless of the TST result (although the TST was still performed). Healthy control participants were recruited from volunteers at the National Institute for Medical Research (NIMR), Mill Hill, London, UK. To meet the final study inclusion criteria, healthy volunteers had to be negative by both TST and IGRA.
Tuberculin skin test. This was performed according to the UK guidelines1 using 0.1ml (2TU) of tuberculin PPD (RT23, Statens Serum Institut, Copenhagen, Denmark). According to the UK national guidelines2, the TST was scored as positive at > 6mm in participants who had not been BCG vaccinated, and at > 15mm in those who had been BCG vaccinated.
Interferon gamma release assay. The Gold In-Tube assay (Cellestis, Carnegie, Australia) was performed according to the manufacturer's instructions.
Total and differential leukocyte counts. 2ml of whole blood was collected in Terumo Venosa 5ml K2-EDTA tubes (Terumo Europe, Leuven, Belgium). The samples were then analyzed within 4 hours using a Nihon Kohden MEK-6400 Automated Hematology Analyzer (Nihon Kohden Corporation, Tokyo, Japan).
Evaluation of radiographic extent of disease. Digital images of the plain chest radiographs of all patients recruited in London were graded by three independent physicians, blinded to the transcriptional profiles and clinical data, using a modified version of the U.S. National Tuberculosis and Respiratory Disease Association classification system3. This system grades the radiographic extent of disease as "minimal", "moderately advanced" or "far advanced", based on the density and extent of the lesions and the presence or absence of cavitation. We modified the system for our study so that it also included a classification of "no disease" and also indicated the presence of pleural disease or lymphadenopathy. This system was then translated into a decision tree to aid classification (fig. 9 a).
RNA sampling, extraction and processing for microarray analysis. 3ml of whole blood was collected into Tempus tubes (Applied Biosystems, Foster City, Calif., USA), mixed vigorously immediately after collection, and stored between -20 ℃ and -80 ℃ prior to RNA extraction. For the training set samples, RNA was isolated from 1.5ml of whole blood using the PerfectPure RNA Blood kit (5PRIME Inc, Gaithersburg, Md., USA). For the test and validation (SA) set samples, RNA was isolated from 1ml of whole blood using the MagMAX-96 Blood RNA Isolation Kit (Applied Biosystems/Ambion, Austin, TX, USA) according to the manufacturer's instructions. Globin mRNA was then depleted using the GLOBINclear 96-well format kit (Applied Biosystems/Ambion, Austin, TX, USA) according to the manufacturer's instructions. The integrity of total and globin-depleted RNA was assessed using an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, Calif., USA), showing RIN values of 7-9.5. RNA yield was assessed using a NanoDrop 1000 spectrophotometer (NanoDrop Products, Thermo Fisher Scientific Inc, Wilmington, DE, USA). Biotinylated, amplified antisense complementary RNA (cRNA) targets were then prepared from 200-250ng of globin-depleted RNA using the Illumina CustomPrep RNA amplification kit (Applied Biosystems/Ambion, Austin, TX, USA). 750ng of the labeled cRNA was hybridized overnight to an Illumina Human HT-12 BeadChip array (Illumina Inc, San Diego, Calif., USA) containing more than 48000 probes. The arrays were then washed, blocked, stained and scanned on an Illumina BeadStation 500 according to the manufacturer's protocol. Illumina BeadStudio v2 software (Illumina Inc, San Diego, CA, USA) was used to generate signal intensity values from the scans.
Isolation of separated cell populations and RNA extraction. Whole blood was collected into EDTA. Neutrophils (CD15+), monocytes (CD14+), CD4+ T cells and CD8+ T cells were isolated sequentially using Dynabeads according to the manufacturer's instructions. RNA was extracted from whole blood (5 Prime PerfectPure kit) or from the separated cell populations (Qiagen RNeasy Mini kit) and stored at -80 ℃ until use.
Analysis of microarray data.
Normalization. Illumina BeadStudio v2 software was used to subtract background and to scale the average signal intensity of each sample to the global average signal intensity of all samples. Further normalization was performed using the gene expression analysis software GeneSpring GX, version 7.1.3 (Agilent Technologies, Santa Clara, Calif., USA, hereafter GeneSpring). All signal intensity values below 10 were set equal to 10. Per-gene normalization was then applied by dividing the signal intensity of each probe in each sample by the median intensity of that probe across all samples. These normalized data were used for all downstream analyses except the assessment of molecular distance to health, detailed below.
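The flooring and per-gene median steps described above can be illustrated with a minimal Python sketch. It is not the GeneSpring implementation: the function name, the dictionary layout and the probe identifiers are hypothetical, and computing the median after flooring is an assumption based on the order in which the steps are listed.

```python
from statistics import median

FLOOR = 10.0  # intensities below 10 are set to 10, as stated in the text


def normalize_per_gene(intensities):
    """Floor each signal, then divide it by the probe's median across samples.

    intensities: dict mapping a probe ID to a list of signal values,
    one value per sample (hypothetical layout).
    """
    normalized = {}
    for probe, values in intensities.items():
        floored = [max(v, FLOOR) for v in values]
        # Assumption: the per-probe median is taken over the floored values.
        probe_median = median(floored)
        normalized[probe] = [v / probe_median for v in floored]
    return normalized


# Hypothetical probes measured in three samples.
example = {
    "probe_A": [5.0, 20.0, 40.0],      # 5.0 is floored to 10.0
    "probe_B": [100.0, 200.0, 400.0],
}
result = normalize_per_gene(example)
print(result["probe_A"])  # [0.5, 1.0, 2.0]
```

After this step a value of 1.0 means "at the median for that probe", so probes with very different absolute intensities become directly comparable.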
Class prediction. We used one of the class prediction tools available in GeneSpring. The prediction model employed the K-nearest neighbor algorithm, using 10 neighbors and a p-value ratio cutoff of 0.5. All transcripts from the 393-transcript list were used for the prediction. The prediction model was refined by cross-validation on the training set, leaving out a single active outlier. This model was then used to predict the class of the samples in the independent test set and validation set. Samples for which no prediction was made were recorded as indeterminate. Sensitivity, specificity and 95% confidence intervals (95% CI) were determined using GraphPad Prism version 5.02 for Windows. The p-value was determined using the two-sided Fisher's Exact test.
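As a rough illustration of this kind of class prediction, the following Python sketch implements a plain majority-vote k-nearest-neighbor classifier with a "no call" outcome, followed by a sensitivity and specificity calculation. It is a simplification, not GeneSpring's algorithm: the vote-fraction threshold is a hypothetical stand-in for the p-value ratio cutoff, Euclidean distance is assumed, and excluding no-call samples from the denominators is also an assumption.

```python
import math
from collections import Counter


def knn_predict(train, sample, k=10, min_vote_fraction=0.5):
    """Return the majority label among the k nearest training samples.

    train: list of (expression_vector, label) pairs.
    Returns None (no call) unless one label holds strictly more than
    min_vote_fraction of the votes (hypothetical stand-in for the
    p-value ratio cutoff used by GeneSpring).
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], sample))[:k]
    votes = Counter(label for _, label in nearest)
    label, count = votes.most_common(1)[0]
    if count / len(nearest) <= min_vote_fraction:
        return None
    return label


def sensitivity_specificity(truth, predicted, positive="active"):
    """Sensitivity and specificity, skipping no-call (None) predictions."""
    tp = fn = tn = fp = 0
    for actual, call in zip(truth, predicted):
        if call is None:
            continue  # assumption: indeterminate calls are left out
        if actual == positive:
            tp, fn = (tp + 1, fn) if call == positive else (tp, fn + 1)
        else:
            fp, tn = (fp + 1, tn) if call == positive else (fp, tn + 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Hypothetical two-transcript training data.
train = [
    ((2.0, 2.1), "active"), ((2.2, 1.9), "active"), ((1.8, 2.0), "active"),
    ((1.0, 1.0), "control"), ((0.9, 1.1), "control"), ((1.1, 0.9), "control"),
]
print(knn_predict(train, (2.0, 2.0), k=3))
print(sensitivity_specificity(
    ["active", "active", "control", "control", "active"],
    ["active", "control", "control", "control", None],
))
```

With real data each vector would hold the normalized values of all transcripts in the signature, and `k=10` would match the number of neighbors given in the text.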
Supervised analysis: (i) transcriptional perturbation, or "molecular distance to health". This technique was performed as described previously4. The goal is to convert transcript abundance values into a representative score that indicates the degree of transcriptional perturbation of a given sample relative to a healthy baseline. This is done by determining whether the expression value of each transcript in a given sample lies within or outside two standard deviations of the mean of the healthy controls.
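A minimal Python sketch of the two-standard-deviation rule described above follows. The function and variable names are hypothetical, the sample standard deviation is assumed, and the weighted variant (summing the number of standard deviations for the outlying transcripts) is an assumption about how a weighted score could be formed; the cited method may apply additional filters.

```python
from statistics import mean, stdev


def molecular_distance_to_health(sample, healthy, n_sd=2.0):
    """Score how far one sample lies from the healthy baseline.

    sample:  dict transcript -> expression value for one sample.
    healthy: dict transcript -> list of values from healthy controls.
    Returns (count, weighted): the number of transcripts lying more than
    n_sd standard deviations from the healthy mean, and the sum of their
    absolute z-scores (the weighted form is an assumption).
    """
    count = 0
    weighted = 0.0
    for transcript, value in sample.items():
        controls = healthy[transcript]
        baseline = mean(controls)
        spread = stdev(controls)
        if spread == 0:
            continue  # no variation in controls; distance is undefined
        z = abs(value - baseline) / spread
        if z > n_sd:
            count += 1
            weighted += z
    return count, weighted


# Hypothetical data: two transcripts, four healthy controls.
healthy = {"g1": [1.0, 1.2, 0.8, 1.0], "g2": [2.0, 2.0, 2.2, 1.8]}
patient = {"g1": 2.0, "g2": 2.1}
count, weighted = molecular_distance_to_health(patient, healthy)
print(count, round(weighted, 2))
```

In this toy case only `g1` falls outside the healthy range, so the count is 1, while `g2` stays within two standard deviations and contributes nothing.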
Supervised analysis: (ii) pathway analysis. Additional functional analysis of the differentially expressed genes was performed using Ingenuity Pathways Analysis (Ingenuity Systems, Inc., Redwood City, CA, USA, www.ingenuity.com). Canonical pathway analysis identified the pathways from the Ingenuity Pathways Analysis library that were most significantly represented in the dataset. The significance of the association between the dataset and a canonical pathway was measured using Fisher's Exact test to calculate a p-value representing the probability that the association between the transcripts in the dataset and the canonical pathway is explained by chance alone, with Benjamini-Hochberg correction for multiple testing. The program can also be used to draw canonical networks and overlay them with expression data from the dataset.
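The statistical test named above can be sketched in plain Python. This is an illustration only, not the Ingenuity software: it assumes a one-sided (over-representation) Fisher's exact test computed from the hypergeometric distribution, uses hypothetical function names, and applies the standard Benjamini-Hochberg step-up adjustment.

```python
from math import comb


def enrichment_p(overlap, list_size, pathway_size, background_size):
    """One-sided Fisher's exact p-value for over-representation.

    Probability of seeing at least `overlap` pathway genes in a list of
    `list_size` genes drawn from `background_size` genes, of which
    `pathway_size` belong to the pathway.
    """
    upper = min(list_size, pathway_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(pathway_size, i)
        * comb(background_size - pathway_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical numbers: 8 of 40 signature genes fall in a 100-gene pathway,
# against a background of 2000 genes.
p = enrichment_p(8, 40, 100, 2000)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(p < 0.05, [round(a, 3) for a in adjusted])
```

Each pathway yields one p-value, and the adjustment is applied across all pathways tested, which is why the correction operates on the whole list at once.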
Supervised analysis: (iii) transcriptional module analysis. The analysis was performed as described previously4,5. In the context of this study, since the module framework was derived using Affymetrix HG U133A & B GeneChips, it was necessary to translate the probes making up the modules into their equivalents on the Illumina platform. RefSeq IDs were used to match probes between the Affymetrix HG U133 and Illumina WG-6 V2 platforms. Of the 5348 Affymetrix probe sets, 2109 unambiguous matches were found and used in this module analysis. Matched probes were kept in their original modules. To represent global transcriptional changes graphically, the disease group as a whole was compared with the healthy control group as a whole, and the result was displayed as spots aligned on a grid, the position of each spot corresponding to a different module based on its original definition. Spot intensity indicates the percentage of transcripts differentially expressed in the displayed direction out of the total number of transcripts detected in the module, while spot color indicates the direction of the change (red = over-represented, blue = under-represented).
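The per-module percentages described above can be sketched as follows in Python. The data structures, transcript identifiers and the rule of reporting the predominant direction are assumptions made for illustration; the differential expression calls themselves are taken as given inputs.

```python
def module_scores(modules, calls):
    """Summarize differential expression per module.

    modules: dict module name -> list of transcript IDs in that module.
    calls:   dict transcript ID -> "up", "down" or None (detected but not
             significantly changed). Transcripts absent from `calls` are
             treated as not detected and are left out of the denominator.
    """
    scores = {}
    for name, transcripts in modules.items():
        detected = [t for t in transcripts if t in calls]
        if not detected:
            continue
        up = sum(1 for t in detected if calls[t] == "up")
        down = sum(1 for t in detected if calls[t] == "down")
        if up > down:
            direction = "up"      # would be drawn as a red spot
        elif down > up:
            direction = "down"    # would be drawn as a blue spot
        else:
            direction = None      # assumption: ties are left uncolored
        scores[name] = {
            "pct_up": 100.0 * up / len(detected),
            "pct_down": 100.0 * down / len(detected),
            "direction": direction,
        }
    return scores


# Hypothetical transcripts assigned to two of the modules named in the text.
modules = {"M3.1": ["t1", "t2", "t3", "t4"], "M2.8": ["t5", "t6"]}
calls = {"t1": "up", "t2": "up", "t3": "up", "t4": None,
         "t5": "down", "t6": None}
scores = module_scores(modules, calls)
print(scores["M3.1"], scores["M2.8"])
```

The percentage in the predominant direction corresponds to the spot intensity on the module grid, and the direction corresponds to its color.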
Multiplex serum protein measurements. 1-4ml of blood was collected into serum clot activator tubes (Greiner BioOne 1ml Vacuette tube, ref 454098, Greiner BioOne, Kremsmünster, Austria, or BD 4ml Vacutainer tube, ref 368975; Becton Dickinson). The tubes were centrifuged at 2000g for 5 minutes at room temperature, and the serum fraction was extracted and frozen at -80 ℃ until analysis. Multiplex bead-based cytokine immunoassays were performed by Millipore UK (Millipore UK Ltd, Dundee, UK) using Multi-Analyte Profiling systems (Millipore, Billerica, MA, USA). In this manner, 63 cytokines, chemokines, soluble receptors, growth factors, adhesion molecules and acute phase proteins were measured in each sample. The levels of the following were tested: MMP-9, C-reactive protein, serum amyloid A, EGF, eotaxin, FGF-2, Flt-3 ligand, fractalkine, G-CSF, GM-CSF, GRO, IFN-α2, IFN-γ, IL-10, IL-12p40, IL-12p70, IL-13, IL-15, IL-17, IL-1α, IL-1β, IL-1Ra, IL-2, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, CXCL10 (IP10), MCP-1, MCP-3, MIP-1α, MIP-1β, PDGF-AA, PDGF-AB/BB, RANTES, soluble CD40 ligand, soluble IL-2RA, TGF-α, TGF-3, TGF-1, IL-1α, TNF-α, VEGF, MIF, soluble Fas ligand, tPAI-1, soluble ICAM-1, soluble VCAM-1, soluble CD30, soluble gp130, soluble IL-1RII, soluble IL-6R, soluble RAGE, soluble TNF-RI, soluble TNF-RII, IL-16, TGF-β1, TGF-β2 and TGF-β3.
Flow cytometry. Whole blood (in µl aliquots) was incubated with the appropriate antibodies for each staining panel for 20 min at room temperature in the dark, followed by lysis of red blood cells using BD FACS lysing solution (BD Biosciences) for 10 min at room temperature in the dark. Cells were pelleted and washed in 2ml FACS buffer (PBS/BSA/azide), followed by fixation in 1% paraformaldehyde. The samples were then acquired on a Beckman Coulter flow cytometer using Summit Software version 3.02. The analysis was performed using FlowJo version 8.7.3 (Tree Star, Inc.) for Macintosh. The gating strategies used are shown in figures 11 and 12. Where appropriate, the flow cytometry data were tested for significance using the Mann-Whitney rank sum U-test. All antibodies were purchased from BD Pharmingen or Caltag Laboratories (Invitrogen), except CD45RA, which was purchased from Beckman Coulter.
Statistical analysis. Molecular distance to health and modular framework calculations were performed using Microsoft Excel 2003 (Microsoft Corporation, Redmond, WA, USA). Statistical analysis and correlation analysis of continuous variables were performed using GraphPad Prism version 5.02 for Windows (GraphPad Software, San Diego, California, USA, www.graphpad.com). Categorical variables were analyzed using SPSS version 14 for Windows (Chicago, Illinois, USA).
Fig. 10a to 10d. The whole blood transcriptional signature of active TB reflects distinct changes in cellular composition as well as changes in absolute levels of gene expression. Gene expression in active TB was compared with that in healthy controls and mapped onto a pre-defined modular framework. Spot intensity represents the proportion of transcripts in each module that were significantly differentially expressed (red = increased, blue = decreased transcript abundance). Functional interpretations previously determined by unbiased literature profiling are indicated by the color-coded grid in main figure 4. The percentage of genes in each module that were increased (red) or decreased (blue) is shown for (10a) the training set; (10b) the test set; (10c) the validation set (SA). (10d) Weighted molecular distance to health was calculated for each patient at baseline before treatment (0 months) and at 2 and 12 months after the start of anti-mycobacterial treatment. The numbers of the individual patients correspond to those shown in figures 3a to 3d.
Fig. 11a to 11c. Analysis of lymphocytes in blood of active TB patients and controls. (11a) Flow cytometric gating strategy used to analyze T cells and B cells in whole blood from healthy controls and active TB patients of the test set. The top row of panels shows the backgating strategy used to determine the lymphocyte FSC/SSC gate applied in subsequent gating. A large FSC/SSC gate was initially set (left panel) and CD45 vs CD3 was then analyzed. CD45+CD3+ cells were gated (middle panel) and their FSC/SSC profile was determined (right panel). This profile was then used to set the appropriate lymphocyte FSC/SSC gate (see second row, left panel). The same backgating was also performed on CD45+CD19+ (B) cells to ensure that these cells were included in the lymphocyte gate (not shown). The second row of panels shows the gating strategy used to identify T cell populations. The lymphocyte FSC/SSC gate was set and these cells were evaluated for CD45 vs CD3 (second panel from the left). CD45+ cells were then gated and CD3 vs CD8 was evaluated. CD3+ T cells were gated and the expression of CD4 and CD8 was assessed. The CD4+ and CD8+ subsets were then gated. Rows 3-6 show the gating strategy used to define T cell memory subsets. Expression of CD45RA vs CCR7 was evaluated on the CD4+ and CD8+ T cells gated in row 2, and quadrants set on the basis of isotype controls (rows 5 and 6) were used to define naive (CD45RA+CCR7+), central memory (CD45RA-CCR7+), effector memory (CD45RA-CCR7-) and, in the case of CD8+ T cells, terminally differentiated effector (CD45RA+CCR7-) T cells. Expression of CD62L was also evaluated on these subsets. The bottom row of panels shows the gating strategy for B cells. The lymphocyte FSC/SSC gate was set and the cells were evaluated for CD45 vs CD19. CD45+ cells were gated and CD19 and CD20 were evaluated. B cells were defined as CD19+CD20+. (11b) To examine T cell memory populations, whole blood from 11 test set healthy controls (control) and 9 test set active TB patients (active) was analyzed by multiparameter flow cytometry. The full flow cytometric gating strategy is shown in Fig. 11a. The graphs show, for all individuals, the percentage of naive, central memory (TCM), effector memory (TEM) and terminally differentiated effector (TD, CD8+ T cells only) cell subsets (top row, by group), and the number of cells in each subset (x10^6/ml) (bottom row, by group). Each symbol represents an individual patient. The horizontal line represents the median. (11c) T cell gene (i) transcript abundance in whole blood samples from active TB patients (training, test and validation sets); (ii) expression in separated blood leukocyte populations from test set blood. Gene abundance/expression is shown relative to the median of the healthy controls (labeled as in figure 1). For the separated populations from the test set, the numbers shown correspond to individual patients.
Fig. 12a to 12c. Analysis of myeloid cells in blood of active TB patients and controls. (12a) Flow cytometric gating strategy used to analyze monocytes and neutrophils in whole blood from healthy controls and active TB patients of the test set. A large FSC/SSC gate was set (top row, left panel), followed by analysis of CD45 vs CD14. CD45+ cells were gated (middle panel) and CD14 vs CD16 was evaluated. Monocytes were defined as CD14+, inflammatory monocytes as CD14+CD16+, and neutrophils as CD16+. (12b) Gating strategy used to assess possible overlap between CD16+ neutrophils and NK cells expressing CD16. A large FSC/SSC gate was set to include both neutrophils and NK cells, and CD45+ cells were then evaluated for CD16 vs CD56 (an NK cell marker). CD16+ neutrophils expressed high levels of CD16 but not CD56 (as shown by the isotype control plot, bottom panel). CD56+ NK cells expressed intermediate levels of CD16 and did not overlap with the CD16hi cells. CD56+CD16int cells and CD16hi cells had different FSC/SSC characteristics. (12c) Myeloid gene (i) transcript abundance in whole blood samples from active TB patients (training, test and validation sets); and (ii) expression in separated blood cell populations from test set blood. Gene abundance/expression is shown relative to the median of the healthy controls (labeled as in figure 1). For the separated populations from the test set, the numbers shown correspond to individual patients.
Fig. 13a and 13b. Ingenuity Pathways Analysis of the 393-transcript signature. (13a) The significance of over-representation of each canonical biological pathway (according to the logarithm of the p-value calculated by Fisher's exact test, with Benjamini-Hochberg correction for multiple testing) is indicated by the orange squares. The solid colored bars represent the percentage of the total number of genes making up the pathway (given in bold at the right edge of each bar) that are present in the gene list analyzed. The color of the bars indicates the abundance of those transcripts in whole blood of the active TB patients in the training set relative to the healthy controls. (13b) Serum levels of interferon alpha-2a (IFN-α2a) and interferon gamma (IFN-γ) are shown for the 12 healthy controls and 13 active TB patients used for the training set microarray analysis. No significant difference between the groups was observed for either cytokine using the two-sided Mann-Whitney test. Horizontal lines indicate the mean of each group and whiskers indicate 95% confidence intervals.
Fig. 14a and 14b. Expression of PDL1 (CD274) in whole blood and cell subsets from individual healthy controls and patients with active TB. (14a) Expression of PDL1 was analyzed by flow cytometry in whole blood from 11 test set healthy controls (control) and 11 test set active TB patients (active). A large FSC/SSC gate was set to include all leukocytes, and the geometric mean fluorescence intensity (MFI) of PDL1 (red) was compared with that of the isotype control (green). Each active TB patient was analyzed on a different day; healthy controls were analyzed in groups (from the left, samples 1 and 2, 3 and 4, 6-8 and 9-11 were run together, and 5 was run alone), with one isotype control shared by the samples in each group. (14b) Cell subsets from the blood of the same 11 test set healthy controls (control) and 11 test set active TB patients (active) as in (14a) were also analyzed by flow cytometry for expression of PDL1. The cell subsets were as defined in fig. 6b, and the MFI of PDL1 (red) was compared with that of the matched isotype control (green).
FIGS. 15a-f. The 393-transcript profile of the training set, ordered by study group, shown enlarged with the gene symbols listed on the right of each figure. Key transcripts are highlighted in a larger font. The left part of each figure shows the entire gene tree and heat map, with the enlarged area marked by a black rectangle. The relative abundance of transcripts is indicated by the color scale at the bottom of the figure (as shown in figure 1).
FIGS. 16a to 16 are heat maps comparing the expression of various genes in the control, latent and active groups, with the genes listed on the right-hand side of the heat maps.
Fig. 17a to 17c are tables of demographic statistics for the training, test and validation sets, namely gender, country of origin and ethnicity, broken down into multiple categories.
FIGS. 18a to 18c are tables of statistics for the training, test and validation sets, namely TST results, BCG vaccination and smear status.
Fig. 19 is a table summarizing the specificity and sensitivity results for the training set, test set, and validation set across multiple sources of samples.
Methods references:
1. Salisbury, D., Ramsay, M. Immunization against infectious diseases - the Green Book. D.O. Health, London The Stationery Office, 391-408 (2006).
2. National Institute for Health and Clinical Excellence. (Royal College of Physicians, UK, 2006).
3. Falk, A., O'Connor, J.B. Classification of pulmonary tuberculosis: Diagnosis standards and classification of tuberculosis. National Tuberculosis and Respiratory Disease Association 12, 68-76 (1969).
4. Pankla, R. et al. Genomic Transcriptional Profiling Identifies a Candidate Blood Biomarker Signature for the Diagnosis of Septicemic Melioidosis. Genome Biol In press (2009).
5. Chaussabel, D. et al. A modular analysis framework for blood genomics studies: application to systemic lupus erythematosus. Immunity 29, 150-64 (2008).
Genes in Module M1.3
Genes in Module M2.8
Genes in Module M1.5
Genes in Module M2.6
Genes in Module M2.2
Genes in Module M3.1
It is contemplated that any of the embodiments discussed in this specification can be practiced with respect to any of the methods, kits, reagents or compositions of the invention, and vice versa. Furthermore, the compositions of the present invention may be used to carry out the methods of the present invention.
It is to be understood that the specific embodiments described herein are shown by way of illustration and not as limitations of the invention. The principal features of this invention can be employed in various embodiments without departing from the scope of the invention. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific procedures described herein. Such equivalents are considered to be within the scope of the invention and are covered by the claims.
All publications and patent applications mentioned in this specification are indicative of the level of skill of those skilled in the art to which this invention pertains. All publications and patent applications are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.
The word "a" or "an", when used in conjunction with the term "comprising" in the claims and/or the specification, may mean "one", but it is also consistent with the meaning of "one or more", "at least one", and "one or more than one". The term "or" as used in the claims means "and/or" unless explicitly indicated to refer to alternatives only or unless the alternatives are mutually exclusive, although the disclosure supports a definition that refers to only alternatives and "and/or". Throughout this application, the term "about" is used to indicate that a value includes the inherent variation of error for the device or method employed to determine the value, or the variation that exists among the study subjects.
As used in this specification and claim(s), the words "comprising" (and any form of comprising, such as "comprise" and "comprises"), "having" (and any form of having, such as "have" and "has"), "including" (and any form of including, such as "includes" and "include") or "containing" (and any form of containing, such as "contains" and "contain") are inclusive or open-ended and do not exclude additional, unrecited elements or method steps.
As used herein, the term "or combinations thereof" refers to all permutations and combinations of the listed items preceding the term. For example, "A, B, C or combinations thereof" is intended to include at least one of: A, B, C, AB, AC, BC, or ABC, and, if order is important in a particular context, also BA, CA, CB, CBA, BCA, ACB, BAC, or CAB. Continuing with this example, expressly included are combinations that contain repeats of one or more items or terms, such as BB, AAA, AAB, BBC, AAABCCCC, CBBAAA, CABABB, and so forth. Those of skill in the art will understand that there is typically no limit on the number of items or terms in any combination, unless otherwise apparent from the context.
All of the compositions and/or methods disclosed and claimed herein can be made and executed without undue experimentation in light of the present disclosure. While the compositions and methods of this invention have been described in terms of preferred embodiments, it will be apparent to those of skill in the art that variations may be applied to the compositions and/or methods and in the steps or in the sequence of steps of the method described herein without departing from the concept, spirit and scope of the invention. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the invention as defined by the appended claims.
References
1.WHO.(WorldHealth Organization,Geneva,2008).
2.Anderson,S.R.,Maguire,H.& Carless,J.Tubereulosis in London:a dccade and a half of nodeelinc [corrcetcd].Thorax 62,162-7(2007).
3.Trunz,B.B.,Finc,P.& Dye,C.Effeet of BCG vaeeination on childhood tuberculous meningitisand miliary tuberculosis worldwide:a meta-analysis and assessment of cost-effeetiveness.Lancet 367,1173-80(2006).
4.Young,D.B.,Perkins,M.D.,Dunean,K.& Barry,C.E.,3rd.Confronting the scientificobstaeles to global control of tuberculosis.J Clin Invest 118,1255-65(2008).
5.Center for Communieable Discase Control and Prevention.(ed.U.S.Department of Health andHuman Serviees,C.)XX (Atlanta,GA,2007).
6.Pfyffcr,G.E.,Cieslak,C.,Welseher,H.M.,Kissling,P.& Ruseh-Gerdcs,S.Rapid detcction ofmycobactcria in clinieal speeimens by using the automated BACTEC 9000MB systcm and eomparisonwith radiometric and solid-culture systems.JClin Mierobiol 35,2229-34(1997).
7.Schoeh,O.D.et al.Diagnostie yield of sputum,induced sputum,and bronchoscopy afterradiologic tuberculosis serccning.A m J Rcspir Crit Care Med 175,80-6(2007).
8.Storla,D.G.,Yimer,S.& Bjunc,G.A.A systematic review of delay in the diagnosis andtreatment of tubereulosis.BMC Publie Hcalth8,15(2008).
9.Comstock,G.W.,Livesay,V.T.&Woolpert,S.F.The prognosis of a positive tuberculinreaction in childhood and adolescence.Am J Epidemiol 99,131-8(1974).
10.Vynnycky,E.& Fine,P.E.Lifctime risks,incubation period,and serial interval of tuberculosis.Am J Epidemiol 152,247-63(2000).
11. Young, D.B., Gideon, H.P. & Wilkinson, R.J. Eliminating latent tuberculosis. Trends Microbiol 17, 183-8 (2009).
12. National Institute for Health and Clinical Excellence. (Royal College of Physicians, UK, 2006).
13. Ottenhoff, T.H. Overcoming the global crisis: "yes, we can", but also for TB...? Eur J Immunol 39, 2014-20 (2009).
14. Casanova, J.L. & Abel, L. Genetic dissection of immunity to mycobacteria: the human model. Annu Rev Immunol 20, 581-620 (2002).
15. Cooper, A.M. Cell-mediated immune responses in tuberculosis. Annu Rev Immunol 27, 393-422 (2009).
16. Flynn, J.L. & Chan, J. Immunology of tuberculosis. Annu Rev Immunol 19, 93-129 (2001).
17. Keane, J. et al. Tuberculosis associated with infliximab, a tumor necrosis factor alpha-neutralizing agent. N Engl J Med 345, 1098-104 (2001).
18. Chaussabel, D. et al. A modular analysis framework for blood genomics studies: application to systemic lupus erythematosus. Immunity 29, 150-64 (2008).
19. Pascual, V. et al. How the study of children with rheumatic diseases identified interferon-alpha and interleukin-1 as novel therapeutic targets. Immunol Rev 223, 39-59 (2008).
20. Benoist, C., Germain, R.N. & Mathis, D. A plaidoyer for 'systems immunology'. Immunol Rev 210, 229-34 (2006).
21. Allmark, P. Should research samples reflect the diversity of the population? J Med Ethics 30, 185-9 (2004).
22. Cottin, V. et al. Small-cell lung cancer: patients included in clinical trials are not representative of the patient population as a whole. Ann Oncol 10, 809-15 (1999).
23. Simon, R., Radmacher, M.D., Dobbin, K. & McShane, L.M. Pitfalls in the use of DNA microarray data for diagnostic and prognostic classification. J Natl Cancer Inst 95, 14-8 (2003).
24. Barry, C.E., 3rd et al. The spectrum of latent tuberculosis: rethinking the biology and intervention strategies. Nat Rev Microbiol 7, 845-55 (2009).
25. Center for Communicable Disease Control and Prevention. Misdiagnosis of tuberculosis resulting from laboratory cross-contamination of Mycobacterium tuberculosis cultures. MMWR, New Jersey 49, 413-16 (2000).
26. Pankla, R. et al. Genomic Transcriptional Profiling Identifies a Candidate Blood Biomarker Signature for the Diagnosis of Septicemic Melioidosis. Genome Biol Re-submitted (2009).
27. Beck, J.S., Potts, R.C., Kardjito, T. & Grange, J.M. T4 lymphopenia in patients with active pulmonary tuberculosis. Clin Exp Immunol 60, 49-54 (1985).
28. Rodrigues, D.S. et al. Immunophenotypic characterization of peripheral T lymphocytes in Mycobacterium tuberculosis infection and disease. Clin Exp Immunol 128, 149-54 (2002).
29. Auffray, C., Sieweke, M.H. & Geissmann, F. Blood monocytes: development, heterogeneity, and relationship with dendritic cells. Annu Rev Immunol 27, 669-92 (2009).
30. Sher, A. & Coffman, R.L. Regulation of immunity to parasites by T cells and T cell-derived cytokines. Annu Rev Immunol 10, 385-409 (1992).
31. Theofilopoulos, A.N., Baccala, R., Beutler, B. & Kono, D.H. Type I interferons (alpha/beta) in immunity and autoimmunity. Annu Rev Immunol 23, 307-36 (2005).
32. Auerbuch, V., Brockstedt, D.G., Meyer-Morse, N., O'Riordan, M. & Portnoy, D.A. Mice lacking the type I interferon receptor are resistant to Listeria monocytogenes. J Exp Med 200, 527-33 (2004).
33. Carrero, J.A., Calderon, B. & Unanue, E.R. Type I interferon sensitizes lymphocytes to apoptosis and reduces resistance to Listeria infection. J Exp Med 200, 535-40 (2004).
34. O'Connell, R.M. et al. Type I interferon production enhances susceptibility to Listeria monocytogenes infection. J Exp Med 200, 437-45 (2004).
35. Bouchonnet, F., Boechat, N., Bonay, M. & Hance, A.J. Alpha/beta interferon impairs the ability of human macrophages to control growth of Mycobacterium bovis BCG. Infect Immun 70, 3020-5 (2002).
36. Manca, C. et al. Hypervirulent M. tuberculosis W/Beijing strains upregulate type I IFNs and increase expression of negative regulators of the Jak-Stat pathway. J Interferon Cytokine Res 25, 694-701 (2005).
37. Stanley, S.A., Johndrow, J.E., Manzanillo, P. & Cox, J.S. The Type I IFN response to infection with Mycobacterium tuberculosis requires ESX-1-mediated secretion and contributes to pathogenesis. J Immunol 178, 3143-52 (2007).
38. Cooper, A.M., Pearl, J.E., Brooks, J.V., Ehlers, S. & Orme, I.M. Expression of the nitric oxide synthase 2 gene is not essential for early control of Mycobacterium tuberculosis in the murine lung. Infect Immun 68, 6879-82 (2000).
39. Shi, S. et al. Expression of many immunologically important genes in Mycobacterium tuberculosis-infected macrophages is independent of both TLR2 and TLR4 but dependent on IFN-alphabeta receptor and STAT1. J Immunol 175, 3318-28 (2005).
40. Farah, R. & Awad, J. The association of interferon with the development of pulmonary tuberculosis. Int J Clin Pharmacol Ther 45, 598-600 (2007).
41. Telesca, C. et al. Interferon-alpha treatment of hepatitis D induces tuberculosis exacerbation in an immigrant. J Infect 54, e223-6 (2007).
42. Eum, S.Y. et al. Neutrophils are the predominant infected phagocytic cells in the airways of patients with active pulmonary tuberculosis. Chest (2009).
43. Eruslanov, E.B. et al. Neutrophil responses to Mycobacterium tuberculosis infection in genetically susceptible and resistant mice. Infect Immun 73, 1744-53 (2005).
44. Barber, D.L. et al. Restoring function in exhausted CD8 T cells during chronic viral infection. Nature 439, 682-7 (2006).
45. Day, C.L. et al. PD-1 expression on HIV-specific T cells is associated with T-cell exhaustion and disease progression. Nature 443, 350-4 (2006).
46. Jurado, J.O. et al. Programmed death (PD)-1:PD-ligand 1/PD-ligand 2 pathway inhibits T cell effector functions during human tuberculosis. J Immunol 181, 116-25 (2008).
47. Boasso, A. et al. PDL-1 upregulation on monocytes and T cells by HIV via type I interferon: restricted expression of type I interferon receptor by CCR5-expressing leukocytes. Clin Immunol 129, 132-44 (2008).
48. Einarsdottir, T., Lockhart, E. & Flynn, J.L. Cytotoxicity and secretion of gamma interferon are carried out by distinct CD8 T cells during Mycobacterium tuberculosis infection. Infect Immun 77, 4621-30 (2009).
49. Ha, S.J., West, E.E., Araki, K., Smith, K.A. & Ahmed, R. Manipulating both the inhibitory and stimulatory immune system towards the success of therapeutic vaccination against chronic viral infections. Immunol Rev 223, 317-33 (2008).
50. Jacobsen, M. et al. Candidate biomarkers for discrimination between infection and disease caused by Mycobacterium tuberculosis. J Mol Med 85, 613-21 (2007).
51. Mistry, R. et al. Gene-expression patterns in whole blood identify subjects at risk for recurrent tuberculosis. J Infect Dis 195, 357-65 (2007).
52. Allantaz, F. et al. Blood leukocyte microarrays to diagnose systemic onset juvenile idiopathic arthritis and follow the response to IL-1 blockade. J Exp Med 204, 2131-2144 (2007).
53. Baechler, E.C. et al. Interferon-inducible gene expression signature in peripheral blood cells of patients with severe lupus. Proc Natl Acad Sci USA 100, 2610-2615 (2003).
54. Bennett, L. et al. Interferon and granulopoiesis signatures in systemic lupus erythematosus blood. J Exp Med 197, 711-723 (2003).
Claims (25)
1. A method for detecting active Mycobacterium tuberculosis infection that appears latent/asymptomatic, the method comprising:
obtaining a patient gene expression dataset from a patient suspected of being infected with latent/asymptomatic Mycobacterium tuberculosis;
dividing the patient gene expression dataset into one or more gene modules associated with Mycobacterium tuberculosis infection; and
comparing the patient gene expression dataset for each of the one or more gene modules to a gene expression dataset from a non-patient that is also sorted into the same gene modules; wherein an overall increase or decrease in gene expression in the patient gene expression dataset for the one or more gene modules is indicative of an active Mycobacterium tuberculosis infection rather than a latent/asymptomatic Mycobacterium tuberculosis infection.
2. The method of claim 1, further comprising the step of using the determined comparative gene product information to formulate at least one diagnostic, prognostic, or therapeutic regimen.
3. The method of claim 1, further comprising the step of distinguishing patients with latent TB from active TB patients.
4. The method of claim 1, wherein the patient's gene expression dataset is obtained from cells obtained from at least one of whole blood, peripheral blood mononuclear cells, or saliva.
5. The method of claim 1, wherein the patient gene expression dataset is compared to at least 10, 20, 40, 50, 70, 80, 90, 100, 125, 150, 200, 250, 300, 350 or 393 genes selected from the genes in Table 2.
6. The method of claim 1, wherein the patient gene expression dataset is compared to at least 10, 20, 40, 50, 70, 80, 90, 100, 125, 150 or 200 genes selected from modules M1.3, M2.8, M1.5, M2.6, M2.2 and M3.1.
7. The method of claim 1, wherein the gene module associated with Mycobacterium tuberculosis infection is selected from the group consisting of: module M1.3, module M2.8, module M1.5, module M2.6, module M2.2 and module M3.1.
8. The method of claim 1, wherein the gene modules associated with Mycobacterium tuberculosis infection are selected according to the following changes: a decrease in B-cell associated genes, a decrease in T-cell associated genes, an increase in myeloid-associated genes, and an increase in neutrophil-associated transcripts and interferon-inducible (IFN) genes.
9. The method of claim 1, wherein the disease state of the patient is further determined by radiologic analysis of the patient's lungs.
10. The method of claim 1, further comprising the step of determining the treated patient gene expression dataset after treating the patient, and determining whether the treated patient gene expression dataset has returned to a normal gene expression dataset, thereby determining whether the patient has been treated.
11. A method for predicting whether a Mycobacterium tuberculosis infection that appears latent/asymptomatic will become an active Mycobacterium tuberculosis infection, the method comprising:
obtaining a first gene expression dataset from a first clinical group of patients with active Mycobacterium tuberculosis infection, a second gene expression dataset from a second clinical group of patients with latent Mycobacterium tuberculosis infection, and a third gene expression dataset from a third clinical group of uninfected individuals;
generating a gene cluster dataset comprising the differential expression of genes between any two of the first, second and third datasets; and
determining a unique pattern of expression/representation that is indicative of latent infection, active infection or health, wherein the patient gene expression dataset comprises at least 6, 10, 20, 40, 50, 70, 80, 90, 100, 125, 150 or 200 genes obtained from the genes in at least one of modules M1.3, M2.8, M1.5, M2.6, M2.2 and M3.1, and wherein an overall increase or decrease in gene expression in the patient gene expression dataset for one or more gene modules is indicative of an active Mycobacterium tuberculosis infection rather than a latent/asymptomatic Mycobacterium tuberculosis infection.
12. A kit for diagnosing infection in a patient suspected of being infected with Mycobacterium tuberculosis, the kit comprising:
a gene expression detector for obtaining a patient gene expression dataset from a patient, wherein the expressed genes are obtained from the patient's whole blood; and
a processor capable of comparing the gene expression dataset to a pre-defined gene module dataset associated with Mycobacterium tuberculosis infection and differentiating between infected and non-infected patients, wherein whole blood demonstrates a global change in polynucleotide levels in one or more transcriptional gene expression modules as compared to a matched non-infected patient, thereby differentiating between active and latent/asymptomatic Mycobacterium tuberculosis infection.
13. The kit of claim 12, wherein the patient's gene expression dataset is obtained from peripheral blood mononuclear cells.
14. The kit of claim 12, wherein the patient's gene expression dataset is compared to at least 10, 20, 40, 50, 70, 80, 90, 100, 125, 150, 200, 250, 300, 350 or 393 genes selected from the genes in Table 2.
15. The kit of claim 12, wherein the patient gene expression dataset is compared to at least 10, 20, 40, 50, 70, 80, 90, 100, 125, 150 or 200 genes selected from modules M1.3, M2.8, M1.5, M2.6, M2.2 and M3.1.
16. The kit of claim 12, wherein the gene module associated with Mycobacterium tuberculosis infection is selected from the group consisting of: module M1.3, module M2.8, module M1.5, module M2.6, module M2.2 and module M3.1.
17. The kit of claim 12, wherein the gene modules associated with Mycobacterium tuberculosis infection are selected according to the following changes: a decrease in B-cell associated genes, a decrease in T-cell associated genes, an increase in myeloid-associated genes, and an increase in neutrophil-associated transcripts and interferon-inducible (IFN) genes.
18. The kit of claim 12, wherein the gene is selected from the group consisting of PDL-1, CASP5, CR1, TLR5, MAPK14, STX11, BCL6, and C5.
19. A system for detecting active Mycobacterium tuberculosis infection that appears latent/asymptomatic, the system comprising:
a gene expression detector for obtaining a patient gene expression dataset from a patient, wherein the expressed genes are obtained from the patient's whole blood; and
a processor capable of comparing a gene expression dataset comprising at least one of modules M1.3, M2.8, M1.5, M2.6, M2.2 and M3.1 to a pre-defined gene module dataset associated with Mycobacterium tuberculosis infection and distinguishing patients having a latent Mycobacterium tuberculosis infection who are at risk of progressing to disease, wherein whole blood demonstrates a global change in polynucleotide levels in one or more transcriptional gene expression modules as compared to matched uninfected patients, thereby distinguishing patients having a latent Mycobacterium tuberculosis infection who are at risk of progressing to disease.
20. The system of claim 19, wherein the patient's gene expression dataset is compared to at least 10, 20, 40, 50, 70, 80, 90, 100, 125, 150, 200, 250, 300, 350, or 393 genes selected from the genes in Table 2.
21. The system of claim 19, wherein the patient gene expression dataset is compared to at least 10, 20, 40, 50, 70, 80, 90, 100, 125, 150 or 200 genes selected from modules M1.3, M2.8, M1.5, M2.6, M2.2 and M3.1.
22. The system of claim 19, wherein the gene module associated with Mycobacterium tuberculosis infection is selected from the group consisting of: module M1.3, module M2.8, module M1.5, module M2.6, module M2.2 and module M3.1.
23. The system of claim 19, wherein the gene modules associated with Mycobacterium tuberculosis infection are selected according to the following changes: a decrease in B-cell associated genes, a decrease in T-cell associated genes, an increase in myeloid-associated genes, and an increase in neutrophil-associated transcripts and interferon-inducible (IFN) genes.
24. The system of claim 19, wherein the gene is selected from the group consisting of PDL-1, CASP5, CR1, TLR5, MAPK14, STX11, BCL6, and C5.
25. A method for monitoring efficacy in a therapeutic agent assay, the method comprising:
obtaining a patient gene expression dataset from a patient suspected of being infected with Mycobacterium tuberculosis;
dividing the patient gene expression dataset into one or more gene modules associated with Mycobacterium tuberculosis infection;
comparing the patient gene expression dataset for each of the one or more gene modules to a gene expression dataset from a non-patient;
treating the patient with the therapeutic agent; and
determining whether the therapeutic agent shifts the patient gene expression dataset toward the non-patient gene expression dataset; wherein an overall increase or decrease in gene expression in the patient gene expression dataset for one or more gene modules is indicative of an active Mycobacterium tuberculosis infection rather than a latent/asymptomatic Mycobacterium tuberculosis infection.
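As an illustration only, the module-level comparison recited in the claims (dividing a dataset into gene modules, then looking for an overall increase or decrease relative to non-patients) could be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure of the disclosure: the function names `module_changes` and `changed_modules`, the gene identifiers `G1` to `G6`, the 1.5-fold cutoff and the 50% fraction threshold are all hypothetical, and real module membership would come from Table 2.

```python
from statistics import mean


def module_changes(patient, controls, modules, fold=1.5):
    """Return {module: (fraction_up, fraction_down)} versus the control mean.

    patient  -- dict mapping gene id to expression value
    controls -- list of such dicts from non-patients
    modules  -- dict mapping module name to a list of gene ids
    fold     -- hypothetical fold-change cutoff (an assumption)
    """
    result = {}
    for name, genes in modules.items():
        up = down = measured = 0
        for gene in genes:
            values = [c[gene] for c in controls if gene in c]
            if gene not in patient or not values:
                continue  # gene not measured on both sides
            baseline = mean(values)
            if baseline <= 0:
                continue  # a ratio is undefined for a non-positive baseline
            ratio = patient[gene] / baseline
            measured += 1
            if ratio >= fold:
                up += 1
            elif ratio <= 1 / fold:
                down += 1
        if measured:
            result[name] = (up / measured, down / measured)
    return result


def changed_modules(changes, min_fraction=0.5):
    """Label modules in which at least min_fraction of genes move one way."""
    return {
        name: ("increase" if up >= down else "decrease")
        for name, (up, down) in changes.items()
        if max(up, down) >= min_fraction
    }


# Hypothetical example data: placeholder gene ids, not genes from Table 2.
modules = {"M3.1": ["G1", "G2", "G3", "G4"], "M2.8": ["G5", "G6"]}
controls = [
    {"G1": 100, "G2": 100, "G3": 100, "G4": 100, "G5": 100, "G6": 100},
    {"G1": 120, "G2": 80, "G3": 100, "G4": 100, "G5": 100, "G6": 100},
]
patient = {"G1": 330, "G2": 270, "G3": 200, "G4": 105, "G5": 40, "G6": 95}

print(changed_modules(module_changes(patient, controls, modules)))
```

Reporting the fraction of genes moving in each direction per module, rather than a single averaged value, keeps an increase in some genes from cancelling a decrease in others within the same module.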
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US12/628148 | 2009-11-30 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| HK1179660A (en) | 2013-10-04 |