WO2016049927A1 - Biomarkers for obesity related diseases - Google Patents
Biomarkers for obesity related diseases
- Publication number
- WO2016049927A1 (PCT/CN2014/088056)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- gene
- sample
- obesity
- subject
- markers
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6888—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
- C12Q1/689—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/118—Prognosis of disease development
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/158—Expression markers
Definitions
- the present invention relates to biomarkers and methods for predicting the risk of a disease related to microbes, in particular obesity or related diseases.
- Obesity, which is prevalent in developed countries, has increased considerably worldwide (de Carvalho Pereira et al., 2013). It is reported that the worldwide prevalence of overweight and obesity combined rose by 27.5% for adults and 47.1% for children between 1980 and 2013. The number of overweight individuals increased from 857 million in 1980 to 2.1 billion in 2013, of whom 671 million are affected by obesity. More than 50% of these live in ten countries; the USA has the largest number of obese individuals, followed by China (Ng et al., 2014).
- BMI body mass index
- Embodiments of the present disclosure seek to solve at least one of the problems existing in the prior art to at least some extent.
- the present invention is based on the following findings by the inventors:
- MGWAS Metagenome-Wide Association Study
- the inventors developed a disease classifier system based on the 54 gene markers that are defined as an optimal gene set by a minimum redundancy-maximum relevance (mRMR) feature selection method. For intuitive evaluation of the risk of obesity based on these 54 gut microbial gene markers, the inventors calculated a healthy index.
- the inventors' data provide insight into the characteristics of the gut metagenome related to obesity risk, a paradigm for future studies of the pathophysiological role of the gut metagenome in other relevant disorders, and the potential usefulness of a gut-microbiota-based approach for assessment of individuals at risk of such disorders.
- the markers of the present invention are more specific and sensitive as compared with conventional markers.
- analysis of stool promises accuracy, safety, affordability, and patient compliance. And samples of stool are transportable.
- the present invention relates to an in vitro method, which is comfortable and noninvasive, so people will participate in a given screening program more easily.
- the markers of the present invention may also serve as tools for therapy monitoring in cancer patients to detect the response to therapy.
- a biomarker set for predicting a disease related to microbiota in a subject consisting of:
- the disease is obesity or related disease.
- a disease related to microbiota in a subject may be analyzed; for example, obesity or a related disease may be determined based on a sample from the subject, such as a fecal sample.
- kits for determining the gene marker set described above comprising primers used for PCR amplification and designed according to the DNA sequence as set forth in at least a partial sequence of SEQ ID NO: 1 to 54.
- kits for determining the gene marker set described above comprising one or more probes designed according to the genes as set forth in SEQ ID NO: 1 to 54.
- the risk of obesity or related disorder in a subject may be predicted by the following steps:
- Aij is the relative abundance of marker i in sample j, wherein i refers to each of the gene markers in said gene marker set;
- N is a first subset of all patient-enriched markers in selected biomarkers related to the abnormal condition
- M is a second subset of all control-enriched markers in selected biomarkers related to the abnormal condition
- an index greater than a cutoff indicates that the subject has or is at risk of developing the abnormal condition.
- the cutoff is at least 0.5834.
- a method of diagnosing whether a subject has an abnormal condition related to microbiota or is at the risk of developing an abnormal condition related to microbiota comprising:
- the abnormal condition related to microbiota is obesity or related disorder.
- Fig. 1 The distribution of P-values from the obesity association analysis shows a disproportionate over-representation of strongly associated markers at lower P-values.
- Example 1 Identifying biomarkers for evaluating obesity risk
- Fecal samples from 158 Chinese subjects, including 78 obesity patients and 80 control subjects (training set), were collected by Rui Jin Hospital, Shanghai Jiao Tong University School of Medicine in 2012. Obesity patients were aged 18 to 30 with a BMI over 25. Subjects were asked to collect fresh feces samples at the hospital. Collected samples were put in sterile tubes and immediately stored at -80°C until further analysis.
- DNA library construction was performed following the manufacturer's instructions (Illumina, insert size 350 bp, read length 100 bp).
- the inventors used the same workflow as described previously to perform cluster generation, template hybridization, isothermal amplification, linearization, blocking and denaturation, and hybridization of the sequencing primers.
- the inventors constructed one paired-end (PE) library with an insert size of 350 bp for each sample, followed by high-throughput sequencing to obtain around 30 million PE reads of length 2x100 bp. High-quality reads were obtained by filtering out reads with ambiguous 'N' bases, adapter contamination and human DNA contamination from the Illumina raw reads, and by simultaneously trimming low-quality terminal bases of reads.
- PE paired-end
- the inventors obtained about 5.9 Gb of high-quality clean fecal microbiota sequencing data per sample (Table 1) from 158 samples (78 cases and 80 controls) on the Illumina HiSeq 2000 platform.
- Table 1 Summary of metagenomic data. Fourth column reports results from Wilcoxon rank-sum tests.
- the average read mapping rate is shown in Table 1. This mapping rate was close to that of the samples in Li, J. et al. 2014, supra, which indicated that it was sufficient for further study.
- the inventors derived the gene profile (9.9 M genes) from the mapping result using the same method as Li, J. et al. 2014, supra.
- Taxonomic assignment of genes was performed using an in-house pipeline described previously (Li, J. et al. 2014, supra).
- PERMANOVA permutational multivariate analysis of variance
- the inventors performed the analysis using the method implemented in the "vegan" package in R, and the permuted p-value was obtained from 10,000 permutations.
- the inventors also corrected for multiple testing using "p.adjust" in R with the Benjamini-Hochberg method to obtain the q-value for each test.
- PERMANOVA identified three factors significantly associated with the gut microbes (based on gene profiles) (q < 0.05, Table 2).
- FDR false discovery rate
- Receiver Operator Characteristic (ROC) analysis: the inventors applied ROC analysis to assess the performance of the obesity classification based on metagenomic markers, and used the “pROC” package in R to draw the ROC curve.
- ROC Receiver Operator Characteristic
- 237 MLG species based on the profile of the 396,100 obesity-associated marker genes.
- the inventors used the 396,100 gene markers to build metagenomic linkage groups (MLGs) using the same method described in the published T2D paper (Qin et al. 2012, supra). All 396,100 genes were annotated by aligning them to the 4,653 reference genomes in IMG v400.
- An MLG was assigned to a genome if more than 50% of its constituent genes were annotated to that genome; otherwise it was termed unclassified.
- In total, 237 MLG genomes with gene number > 100 were selected (P-value < 0.01).
- the inventors estimated the average abundance of the genes of the MLG species, after removing the 5% lowest and 5% highest abundant genes (Qin et al. 2012, supra).
- a random forest model (R 2.14, randomForest 4.6-7 package) (Liaw, Andy & Wiener, Matthew. Classification and Regression by randomForest, R News (2002), Vol. 2/3 p. 18, incorporated herein by reference) was trained using the MLG abundance profile of the training cohort (158 samples) to select the optimal set of MLG markers. The model was tested on one or more testing sets and the prediction error was calculated.
- randomForest 4.6-7 package in R version 2.14
- input is a training dataset (namely relative abundance profiles of selected MLGs in training samples)
- sample disease status: the disease status of the training samples is a vector, 1 for obesity, 0 for control
- test set: the relative abundance profiles of selected MLGs in the test set
- the inventors used the randomForest function from the randomForest package in R to build the classifier, and the predict function was used to predict the test set.
- Output is the prediction results (probability of illness; cutoff is 0.5 and if the probability of illness ≥ 0.5, the subject is at risk of obesity)
- MLG species marker identification: to identify MLG species markers, the inventors used the randomForest 4.6-7 package in R version 2.14 based on the 237 obesity-associated MLG species. First, the inventors sorted all 237 MLG species by the importance given by the randomForest method (Liaw, Andy & Wiener, Matthew. Classification and Regression by randomForest, R News (2002), Vol. 2/3 p. 18, incorporated herein by reference). MLG marker sets were constructed by creating incremental subsets of the top-ranked MLG species, starting from 1 MLG species and ending at all 237 MLG species. For each MLG marker set, the inventors calculated the false prediction ratio in the 158 samples.
- the set of 54 MLG species with the lowest false prediction ratio was selected as the MLG species markers (Table 3-1).
- the inventors drew the ROC curve using the OOB (out of bag) prediction probability of illness from randomForest model based on the selected MLG species markers (Table 3-2) and the area under the ROC curve (AUC) was 0.9651 in the 158 samples (Fig. 2) .
- AUC area under the ROC curve
- TPR true positive rate
- FPR false positive rate
- Table 3-1 54 most discriminant MLGs (species markers) associated with obesity
- mRMR minimum redundancy -maximum relevance
- the inventors developed a disease classifier system based on the 54 gene markers that the inventors defined. For intuitive evaluation of the risk of disease based on these gut microbial gene markers, the inventors calculated a gut healthy index (obesity index).
- the inventors defined and calculated the gut healthy index for each individual on the basis of the selected 54 gene markers as described above. For each individual sample, the gut healthy index of sample j, denoted by Ij, was calculated by the formula below:
- Aij is the relative abundance of marker i in sample j.
- N is a subset of all patient-enriched markers in selected biomarkers related to the abnormal condition (namely, a subset of all obesity-enriched markers in these 54 selected gene markers),
- M is a subset of all control-enriched markers in selected biomarkers related to the abnormal condition (namely, a subset of all control-enriched markers in these 54 selected gene markers),
- an index greater than a cutoff indicates that the subject has or is at the risk of developing obesity.
- the inventors computed an obesity index based on the relative abundance of these 54 gene markers, which clearly separated the obesity patient microbiomes from the control microbiomes (Table 6).
- Classification of the 78 obesity patient microbiomes against the 80 control microbiomes using the obesity index exhibited an area under the receiver operating characteristic (ROC) curve of 0.9784 (Fig. 3).
- ROC receiver operating characteristic
- Table 7 shows the calculated index of each sample and Table 8 shows the relevant gene relative abundance of a representative sample DB68A.
- Case: samples collected before the operation
- Control: samples collected 1 month and 3 months after the operation.
- Table 10 shows the calculated index of each sample and Table 11 shows the relevant gene relative abundance of a representative sample DB62.
- the error rate was 18.18% (4/22) , validating that the 54 gene markers can classify obesity individuals.
- most of obesity patients (7/9) were diagnosed as obesity correctly.
- the inventors have identified and validated a set of 54 markers by a minimum redundancy-maximum relevance (mRMR) feature selection method based on 54 obesity-associated gut microbes. The inventors have also built a gut healthy index to evaluate the risk of obesity based on these 54 gut microbial gene markers.
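One of the steps listed above estimates the abundance of an MLG species as the average abundance of its genes after discarding the 5% lowest and 5% highest values. A minimal Python sketch of such a trimmed mean follows. The function name `mlg_abundance`, the rounding-down of the trim count, and the sample values are assumptions made for illustration; they are not taken from the patent.

```python
import math

def mlg_abundance(gene_abundances, trim_fraction=0.05):
    """Trimmed mean of gene abundances for one MLG (illustrative only).

    Drops the lowest and highest `trim_fraction` of genes before averaging.
    Rounding the number of dropped genes down is an assumption.
    """
    values = sorted(gene_abundances)
    k = math.floor(len(values) * trim_fraction)  # genes dropped at each end
    kept = values[k:len(values) - k]
    if not kept:
        raise ValueError("no genes left after trimming")
    return sum(kept) / len(kept)

# Hypothetical MLG with 20 genes: one extreme value is dropped at each end.
example = [0.0] + [1e-6] * 18 + [1e-3]
print(mlg_abundance(example))
```

Trimming both tails keeps a single very abundant or undetected gene from dominating the estimate for the whole group.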
Landscapes
- Chemical & Material Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Organic Chemistry (AREA)
- Analytical Chemistry (AREA)
- Health & Medical Sciences (AREA)
- Zoology (AREA)
- Engineering & Computer Science (AREA)
- Wood Science & Technology (AREA)
- Genetics & Genomics (AREA)
- Microbiology (AREA)
- Molecular Biology (AREA)
- Immunology (AREA)
- Biotechnology (AREA)
- Biophysics (AREA)
- Biochemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- Pathology (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
Biomarkers and methods for predicting the risk of a disease related to microbes, in particular obesity or related diseases, are provided.
Description
CROSS-REFERENCE TO RELATED APPLICATION
None
The present invention relates to biomarkers and methods for predicting the risk of a disease related to microbes, in particular obesity or related diseases.
Obesity, which is prevalent in developed countries, has increased considerably worldwide (de Carvalho Pereira et al., 2013). It is reported that the worldwide prevalence of overweight and obesity combined rose by 27.5% for adults and 47.1% for children between 1980 and 2013. The number of overweight individuals increased from 857 million in 1980 to 2.1 billion in 2013, of whom 671 million are affected by obesity. More than 50% of these live in ten countries; the USA has the largest number of obese individuals, followed by China (Ng et al., 2014).
There is a growing body of evidence suggesting that patients who are told by their physician that they are overweight are more likely to lose weight than those who are not. However, the low rates of physician diagnosis and of advice on behavioral health risk factors related to obesity are concerning (Bleich et al., 2011).
In children, the diagnosis of obesity is based on age- and gender-specific body mass index (BMI) cut-points. This is in contrast to adults, in whom an obesity diagnosis is based on BMI regardless of age or gender. Because the diagnostic criteria are more complicated and the terminology for pediatric obesity has changed, fewer obese children are accurately diagnosed (Walsh et al., 2013). Moreover, the limitations of BMI in identifying different populations should be considered (Nevill et al., 2006). Therefore, waist circumference (WC) can be considered a reliable and useful tool for epidemiological studies to assess abdominal adiposity, but this measurement seems to be harder to perform (Miguel‐Etayo et al., 2014). Furthermore, regional studies of the diagnosis of pediatric obesity using the International Classification of Diseases, Ninth Revision (ICD-9), the National Ambulatory Medical Care Survey (NAMCS), and the National Hospital Ambulatory Medical Care Survey (NHAMCS) have shown relatively low sensitivity of a clinical diagnosis (Walsh et al., 2013).
Recent insight suggests that the human gut microbiota could play an important role in obesity. An early report, based on sequencing of amplified 16S rRNA genes, indicated a much higher ratio of Firmicutes to Bacteroidetes in faecal samples from 12 obese humans than in two lean controls (Ley et al., 2006). Recent observational studies using metagenomic sequencing in human obesity have demonstrated reduced bacterial diversity, a relative depletion of Bacteroidetes, and enrichment in genes involved in carbohydrate and lipid metabolism (Allin and Pedersen, 2014). These correlative findings suggest that an altered gut microbiota may be a causal factor in the pathogenesis of obesity, and that characteristics of the gut microbiota might serve as criteria for diagnosing obesity.
In summary, there are considerable missed opportunities and low sensitivity in the diagnosis of obesity. A more valid (less biased) assessment of overweight and/or obesity needs to be developed.
SUMMARY
Embodiments of the present disclosure seek to solve at least one of the problems existing in the prior art to at least some extent.
The present invention is based on the following findings by the inventors:
Assessment and characterization of gut microbiota has become a major research area in human disease, including obesity. To analyze the gut microbial content of obesity patients, the inventors carried out a protocol for a Metagenome-Wide Association Study (MGWAS) (Qin, J. et al. A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature 490, 55–60 (2012), incorporated herein by reference) based on deep shotgun sequencing of the gut microbial DNA from 158 individuals. The inventors identified and validated 396,100 obesity-associated gene markers. To exploit the potential of gut microbiota for obesity classification, the inventors developed a disease classifier system based on the 54 gene markers that are defined as an optimal gene set by a minimum redundancy-maximum relevance (mRMR) feature selection method. For intuitive evaluation of the risk of obesity based on these 54 gut microbial gene markers, the inventors calculated a healthy index. The inventors' data provide insight into the characteristics of the gut metagenome related to obesity risk, a paradigm for future studies of the pathophysiological role of the gut metagenome in other relevant disorders, and the potential usefulness of a gut-microbiota-based approach for assessment of individuals at risk of such disorders.
It is believed that gene markers of intestinal microbiota are valuable for increasing obesity detection at earlier stages due to the following. First, the markers of the present invention are more specific and sensitive as compared with conventional markers. Second, analysis of stool promises accuracy, safety, affordability, and patient compliance. And samples of stool are transportable. Thus, the present invention relates to an in vitro method, which is comfortable and noninvasive, so people will participate in a given screening program more easily. Third, the markers of the present invention may also serve as tools for therapy monitoring in cancer patients to detect the response to therapy.
In one aspect of the present disclosure, there is provided a biomarker set for predicting a disease related to microbiota in a subject, consisting of:
at least a partial sequence of SEQ ID NO: 1 to 54.
According to embodiments of present disclosure, the disease is obesity or related disease.
Using these biomarkers, a disease related to microbiota in a subject may be analyzed; for example, obesity or a related disease may be determined based on a sample from the subject, such as a fecal sample.
In another aspect of the present disclosure, there is provided a kit for determining the gene marker set described above, comprising primers used for PCR amplification and designed according to the DNA sequence as set forth in at least a partial sequence of SEQ ID NO: 1 to 54.
In another aspect of the present disclosure, there is provided a kit for determining the gene marker set described above, comprising one or more probes designed according to the genes as set forth in SEQ ID NO: 1 to 54.
In another aspect of the present disclosure, there is provided use of the gene marker set described above for predicting the risk of obesity or related disorder in a subject. According to embodiments of the present disclosure, the risk of obesity or related disorder in a subject may be predicted by the following steps:
(1) collecting a sample j from a subject;
(2) determining the relative abundance information of each of SEQ ID NO: 1 to 54 in the DNA of the sample; and
(3) calculating an index of sample j, denoted by Ij, by the formula below:
Aij is the relative abundance of marker i in sample j, wherein i refers to each of the gene markers in said gene marker set,
N is a first subset of all patient-enriched markers in selected biomarkers related to the abnormal condition,
M is a second subset of all control-enriched markers in selected biomarkers related to the abnormal condition,
|N| and |M| are numbers of the biomarkers respectively in the first and second subsets,
wherein an index greater than a cutoff indicates that the subject has or is at risk of developing the abnormal condition.
According to some embodiments of present disclosure, |N| is 24, and |M| is 30.
According to some embodiments of present disclosure, the cutoff is at least 0.5834.
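Steps (1) to (3) can be illustrated with a short Python sketch. The formula for Ij is not reproduced in this text, so the sketch assumes one plausible form: the mean log10 relative abundance over the patient-enriched markers (N) minus the mean log10 relative abundance over the control-enriched markers (M). That form, the pseudocount, the function names, and the toy abundances are all assumptions made for illustration and may differ from the formula in the original filing. Only the cutoff value 0.5834 and the "greater than the cutoff" rule come from the text.

```python
import math

CUTOFF = 0.5834  # cutoff value quoted in the text

def sample_index(abundance, patient_enriched, control_enriched, pseudocount=1e-20):
    """Index of one sample under an ASSUMED log-ratio form (not confirmed by the text).

    abundance: dict mapping marker id -> relative abundance (Aij) in this sample
    patient_enriched: marker ids in subset N
    control_enriched: marker ids in subset M
    pseudocount: assumed small constant so that log10(0) is never taken
    """
    def mean_log(markers):
        return sum(math.log10(abundance[m] + pseudocount) for m in markers) / len(markers)
    return mean_log(patient_enriched) - mean_log(control_enriched)

def at_risk(index, cutoff=CUTOFF):
    """An index greater than the cutoff flags the subject as having or at risk of the condition."""
    return index > cutoff

# Toy sample with 2 + 2 hypothetical markers (the text uses |N| = 24 and |M| = 30).
abundance = {"g1": 1e-5, "g2": 1e-5, "g3": 1e-7, "g4": 1e-7}
N = ["g1", "g2"]
M = ["g3", "g4"]
idx = sample_index(abundance, N, M)
print(idx, at_risk(idx))
```

Dividing each sum by the size of its subset keeps the two terms comparable even though N and M contain different numbers of markers.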
In another aspect of the present disclosure, there is provided use of the gene marker set described above for preparation of a kit for predicting the risk of obesity or related disorder in a subject. According to embodiments of the present disclosure, the risk of obesity or related disorder in a subject may be predicted by the following steps:
(1) collecting a sample j from a subject;
(2) determining the relative abundance information of each of SEQ ID NO: 1 to 54 in the DNA of the sample; and
(3) calculating an index of sample j, denoted by Ij, by the formula below:
Aij is the relative abundance of marker i in sample j, wherein i refers to each of the gene markers in said gene marker set,
N is a first subset of all patient-enriched markers in selected biomarkers related to the abnormal condition,
M is a second subset of all control-enriched markers in selected biomarkers related to the abnormal condition,
|N| and |M| are numbers of the biomarkers respectively in the first and second subsets,
wherein an index greater than a cutoff indicates that the subject has or is at risk of developing the abnormal condition.
According to some embodiments of present disclosure, |N| is 24, and |M| is 30.
According to some embodiments of present disclosure, the cutoff is at least 0.5834.
In another aspect of the present disclosure, there is provided a method of diagnosing whether a subject has an abnormal condition related to microbiota or is at risk of developing an abnormal condition related to microbiota, comprising:
determining the relative abundance of the biomarkers described above in a sample from the subject, and
determining whether a subject has an abnormal condition related to microbiota or is at the risk of developing an abnormal condition related to microbiota based on the relative abundance.
According to embodiments of the present disclosure, the risk of obesity or related disorder in a subject may be predicted by the following steps:
(1) collecting a sample j from a subject;
(2) determining the relative abundance information of each of SEQ ID NO: 1 to 54 in the DNA of the sample; and
(3) calculating an index of sample j, denoted by Ij, by the formula below:
Aij is the relative abundance of marker i in sample j, wherein i refers to each of the gene markers in said gene marker set,
N is a first subset of all patient-enriched markers in selected biomarkers related to the abnormal condition,
M is a second subset of all control-enriched markers in selected biomarkers related to the abnormal condition,
|N| and |M| are numbers of the biomarkers respectively in the first and second subsets,
wherein an index greater than a cutoff indicates that the subject has or is at risk of developing the abnormal condition.
According to some embodiments of present disclosure, |N| is 24, and |M| is 30.
According to some embodiments of present disclosure, the cutoff is at least 0.5834.
According to embodiments of present disclosure, the abnormal condition related to microbiota is obesity or related disorder.
BRIEF DESCRIPTION OF DRAWINGS
These and other aspects and advantages of the present disclosure will become apparent and more readily appreciated from the following descriptions taken in conjunction with the drawings, in which:
Fig. 1 The distribution of P-values from the obesity association analysis shows a disproportionate over-representation of strongly associated markers at lower P-values.
Fig. 2 The ROC curve drawn from the probability of illness in the training set; AUC = 0.9651.
Fig. 3 The ROC curve drawn from the obesity index in the training set; AUC = 0.9784.
Fig. 4 The ROC curve drawn from the obesity index in the test set (42 samples); AUC = 0.8729.
Fig. 5 The ROC curve drawn from the obesity index in the test set (22 samples); AUC = 0.9487.
Examples
Terms used herein have meanings as commonly understood by a person of ordinary skill in the areas relevant to the present invention. Terms such as “a” , “an” and “the” are not intended to refer to only a singular entity, but include the general class of which a specific example may be used for illustration. The terminology herein is used to describe specific embodiments of the invention, but their usage does not delimit the invention, except as outlined in the claims.
The present invention is further exemplified in the following non-limiting Examples. Unless otherwise stated, parts and percentages are by weight and degrees are Celsius. As apparent to one of ordinary skill in the art, these Examples, while indicating preferred embodiments of the invention, are given by way of illustration only, and the agents were all commercially available.
Example 1. Identifying biomarkers for evaluating obesity risk
1.1 Sample collection
Fecal samples from 158 Chinese subjects (the training set), including 78 obesity patients and 80 control subjects, were collected by Rui Jin Hospital, Shanghai Jiao Tong University School of Medicine, in 2012. The obesity patients were aged from 18 to 30, with a BMI over 25. Subjects were asked to collect fresh fecal samples at the hospital. Collected samples were placed in sterile tubes and immediately stored at -80℃ until further analysis.
Full ethical approval was obtained, and all patients gave written informed consent. The study was approved by the Institutional Review Board of Rui Jin Hospital, Shanghai Jiao Tong University School of Medicine.
1.2 DNA extraction
Fecal samples were thawed on ice, and DNA was extracted using the Qiagen QIAamp DNA Stool Mini Kit (Qiagen) according to the manufacturer's instructions. Extracts were treated with DNase-free RNase to eliminate RNA contamination. DNA quantity was determined using a NanoDrop spectrophotometer, a Qubit Fluorometer (with the Quant-iT™ dsDNA BR Assay Kit) and gel electrophoresis.
1.3 DNA library construction and sequencing of fecal samples
DNA library construction was performed following the manufacturer's instructions (Illumina, insert size 350 bp, read length 100 bp). The inventors used the same workflow as described previously to perform cluster generation, template hybridization, isothermal amplification, linearization, blocking and denaturation, and hybridization of the sequencing primers. The inventors constructed one paired-end (PE) library with an insert size of 350 bp for each sample, followed by high-throughput sequencing to obtain around 30 million PE reads of length 2x100 bp. High-quality reads were obtained by filtering out, from the Illumina raw reads, low-quality reads with ambiguous 'N' bases, adapter contamination and human DNA contamination, and by simultaneously trimming low-quality terminal bases of reads.
The inventors obtained about 5.9 Gb of high-quality clean fecal microbiota sequencing data per sample (Table 1) from the 158 samples (78 cases and 80 controls) on the Illumina HiSeq 2000 platform.
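The read-cleaning step described above can be sketched as follows. This is a minimal illustration only: the function name `clean_read` and all thresholds (`min_qual`, `max_n`, `min_len`) are assumptions, since the text does not state the values used, and adapter and human-DNA removal are not shown. Trimming is applied to the 3' end only, which is also an assumption.

```python
def clean_read(seq, quals, min_qual=20, max_n=0, min_len=30):
    """Trim low-quality 3'-terminal bases, then reject unusable reads.

    seq   : base string of one read
    quals : per-base Phred quality scores (same length as seq)
    All thresholds are illustrative assumptions, not values from the text.
    Returns the trimmed (seq, quals) pair, or None if the read is rejected.
    """
    end = len(seq)
    # Walk back from the 3' end while the terminal base is low quality.
    while end > 0 and quals[end - 1] < min_qual:
        end -= 1
    seq, quals = seq[:end], quals[:end]
    # Reject reads that are too short or contain ambiguous 'N' bases.
    if len(seq) < min_len or seq.upper().count("N") > max_n:
        return None
    return seq, quals
```

For a paired-end library, one would typically keep a pair only if both mates pass; that policy is likewise an assumption here.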
Table 1 Summary of metagenomic data. Fourth column reports results from Wilcoxon rank-sum tests.
1.4 Metagenomic data processing and analysis
1.4.1 Reads mapping
The inventors used the updated human gut gene catalogue established in Li, J. et al. An integrated catalog of reference genes in the human gut microbiome. Nat. Biotechnol. (2014) (incorporated herein by reference) and mapped the high-quality reads to it with an alignment criterion of identity >= 90%. The average read mapping rate is shown in Table 1. This mapping rate was close to that of the samples in Li, J. et al. 2014, supra, which indicated that it was sufficient for further study. After read mapping, the inventors derived the gene profile (about 9.9 million genes) from the mapping result using the same method as Li, J. et al. 2014, supra.
Taxonomic assignment of genes. Taxonomic assignment of the predicted genes was performed using an in-house pipeline that has been described in the published paper (Li, J. et al. 2014, supra).
1.4.2 Data profile construction
Gene profile. Based on the read mapping results, the inventors used the same method described in the published T2D paper (Qin et al. 2012, supra) to compute the relative gene abundance.
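As a rough sketch of what such a relative abundance computation can look like, the snippet below assumes the commonly used length-normalized scheme: the number of reads mapped to a gene is divided by the gene length, and the result is divided by the sum over all genes. The function name and the input layout are hypothetical, and details such as the handling of multi-mapped reads are not covered.

```python
def relative_gene_abundance(read_counts, gene_lengths):
    """Length-normalized relative abundance per gene for one sample.

    read_counts  : dict gene id -> number of mapped reads (missing = 0)
    gene_lengths : dict gene id -> gene length in bp
    Assumed scheme: copies_i = reads_i / length_i, then divide by the total.
    """
    copies = {g: read_counts.get(g, 0) / gene_lengths[g] for g in gene_lengths}
    total = sum(copies.values())
    if total == 0:
        # No mapped reads at all: report zero abundance everywhere.
        return {g: 0.0 for g in copies}
    return {g: c / total for g, c in copies.items()}
```

Under this scheme the abundances of one sample sum to 1, so values are comparable across samples with different sequencing depth.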
1.4.3 Analysis of factors influencing gut microbiota gene profile. The inventors used permutational multivariate analysis of variance (PERMANOVA) to assess the effect of 6 clinical parameters, namely age, sex, height, weight, BMI and obesity status, based on the gene profile. The inventors performed the analysis using the method implemented in the "vegan" package in R, and the permuted p-value was obtained from 10,000 permutations. The inventors also corrected for multiple testing using "p.adjust" in R with the Benjamini-Hochberg method to obtain the q-value for each test. PERMANOVA identified three factors significantly associated with the gut microbiota (based on gene profiles) (q < 0.05, Table 2). The analysis indicated that weight, BMI and obesity status were strongly associated factors, supporting the view that disease (obesity) status was the major determinant influencing the composition of the gut microbiota.
Table 2 PERMANOVA based on euclidean distance analysis of gene profile. The analysis was conducted to test whether clinical parameters, and obese status have significant impact on the gut microbiota with q-value <0.05.
| phenotype | Df | Sums Of Sqs | Mean Sqs | F. Model | R2 | Pr (>F) |
| | 1 | 0.317034738 | 0.317034738 | 1.004112579 | 0.006395454 | 0.4094 |
| | 1 | 0.377329497 | 0.377329497 | 1.196542903 | 0.007611763 | 0.1727 |
| | 1 | 0.331409667 | 0.331409667 | 1.049947284 | 0.006685435 | 0.3291 |
| | 1 | 0.969536515 | 0.969536515 | 3.111941857 | 0.019558192 | 1.00 |
| BMI | 1 | 0.954186893 | 0.954186893 | 3.0617069 | 0.019248548 | 1.00 |
| Obese | 1 | 0.972185352 | 0.972185352 | 3.120613959 | 0.019611626 | 2.00E-04 |
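The Benjamini-Hochberg adjustment used in 1.4.3 to turn permuted p-values into q-values can be illustrated with a short sketch. This is a plain re-implementation of the standard step-up procedure, written for illustration rather than taken from the inventors' scripts; the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values).

    The i-th smallest p-value is scaled by m / i, and a running minimum
    taken from the largest p-value downwards keeps the result monotone.
    Output order matches the input order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q
```

A factor would then be called significant when its q-value is below 0.05, as in the text.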
1.4.4 Identification of obesity associated markers
Identification of obesity associated genes. To identify associations between the metagenomic profile and obesity, a two-tailed Wilcoxon rank-sum test was applied to the profiles of 9,879,897 high-occurrence genes (genes present in fewer than 10 of the 158 samples were removed). A total of 396,100 gene markers were obtained, which were enriched in either cases or controls with p-value < 0.01 and FDR = 3.8% (Fig. 1).
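The screening step above (occurrence filter followed by a two-tailed rank-sum test per gene) might be sketched as below. Several points are assumptions: a gene counts as "present" in a sample when its abundance is above zero, the p-value uses the normal approximation with tie correction and no continuity correction (statistical packages may use an exact test or a continuity correction, so numbers can differ slightly), and all function names are hypothetical.

```python
import math


def occurrence(values):
    """Number of samples in which the gene is present (abundance > 0)."""
    return sum(1 for v in values if v > 0)


def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, tie-corrected)."""
    pooled = sorted((v, k) for k, grp in enumerate((x, y)) for v in grp)
    n1, n2, n = len(x), len(y), len(x) + len(y)
    rank_sum_x, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0          # mean of ranks i+1 .. j
        t = j - i                             # size of this tie group
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return 1.0
    z = (u - n1 * n2 / 2.0) / math.sqrt(var_u)
    return math.erfc(abs(z) / math.sqrt(2))


def associated_genes(profile, is_case, min_occurrence=10, alpha=0.01):
    """profile: gene id -> abundances per sample; is_case: bool per sample."""
    hits = {}
    for gene, values in profile.items():
        if occurrence(values) < min_occurrence:
            continue                          # drop low-occurrence genes
        cases = [v for v, c in zip(values, is_case) if c]
        controls = [v for v, c in zip(values, is_case) if not c]
        p = rank_sum_p(cases, controls)
        if p < alpha:
            hits[gene] = p
    return hits
```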
Estimating the false discovery rate (FDR) . Instead of a sequential p-value rejection method, the inventors applied the “q-value” method proposed in a previous study to estimate the FDR (Storey, J. D. A direct approach to false discovery rates. Journal of the Royal Statistical Society 64, 479-498 (2002) , incorporated herein by reference) .
Receiver Operator Characteristic (ROC) analysis. The inventors applied the ROC analysis to assess the performance of the obesity classification based on metagenomic markers. The inventors then used the “pROC” package in R to draw the ROC curve.
1.4.5 Construction of MLG and identification of obesity associated MLG species markers
237 MLG species based on the profile of the 396,100 obesity associated marker genes. The inventors used the 396,100 gene markers to build metagenomic linkage groups (MLGs) using the same method described in the published T2D paper (Qin et al. 2012, supra). All 396,100 genes were annotated by aligning them to the 4,653 reference genomes in IMG v400. An MLG was assigned to a genome if more than 50% of its constituent genes were annotated to that genome; otherwise it was termed unclassified. A total of 237 MLG genomes with gene number > 100 were selected (P-value < 0.01). To estimate the relative abundance of an MLG species, the inventors averaged the abundance of the genes of the MLG species after removing the 5% lowest and 5% highest abundance genes (Qin et al. 2012, supra).
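The two rules in this paragraph, the more-than-50% genome assignment and the trimmed-mean abundance, can be written down compactly. The sketch below is illustrative only: the function names are hypothetical, and rounding the 5% trim count down to a whole number of genes is an assumption.

```python
from collections import Counter


def assign_mlg(gene_genomes):
    """Assign an MLG to a genome if more than 50% of its genes map to it.

    gene_genomes: one genome name (or None if unannotated) per gene in the MLG.
    """
    counts = Counter(g for g in gene_genomes if g is not None)
    if counts:
        genome, n = counts.most_common(1)[0]
        if n > 0.5 * len(gene_genomes):
            return genome
    return "unclassified"


def mlg_abundance(gene_abundances, trim=0.05):
    """Mean gene abundance after dropping the lowest and highest 5% of genes."""
    values = sorted(gene_abundances)
    k = int(len(values) * trim)               # genes trimmed at each end
    kept = values[k:len(values) - k] if k else values
    return sum(kept) / len(kept)
```

Trimming both tails makes the MLG abundance less sensitive to a few genes with unusually high or low coverage.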
1.5 MLG-based classifier
A random forest model (R 2.14, randomForest 4.6-7 package) (Liaw, Andy & Wiener, Matthew. Classification and Regression by randomForest, R News (2002), Vol. 2/3 p. 18, incorporated herein by reference) was trained using the MLG abundance profile of the training cohort (158 samples) to select the optimal set of MLG markers. The model was tested on one or more testing sets and the prediction error was calculated.
The randomForest model was built with the randomForest 4.6-7 package in R version 2.14. The input is a training dataset (namely, the relative abundance profiles of the selected MLGs in the training samples), the sample disease status (a vector for the training samples, 1 for obesity and 0 for control), and a test set (just the relative abundance profiles of the selected MLGs in the test set). The inventors then used the randomForest function from the randomForest package in R to build the classifier, and the predict function was used to predict the test set. The output is the prediction result (probability of illness; the cutoff is 0.5, and if the probability of illness is ≥0.5, the subject is at risk of obesity).
54 MLG species marker identification. To identify MLG species markers, the inventors used the randomForest 4.6-7 package in R version 2.14 on the 237 obesity associated MLG species. First, the inventors sorted all 237 MLG species by the importance given by the "randomForest" method (Liaw, Andy & Wiener, Matthew. Classification and Regression by randomForest, R News (2002), Vol. 2/3 p. 18, incorporated herein by reference). MLG marker sets were constructed by creating incremental subsets of the top-ranked MLG species, starting from 1 MLG species and ending at all 237 MLG species. For each MLG marker set, the inventors calculated the false prediction ratio in the 158 samples. Finally, the set of 54 MLG species with the lowest false prediction ratio was selected as the MLG species markers (Table 3-1). Furthermore, the inventors drew the ROC curve using the OOB (out-of-bag) predicted probability of illness from the randomForest model based on the selected MLG species markers (Table 3-2), and the area under the ROC curve (AUC) was 0.9651 in the 158 samples (Fig. 2). At the best cutoff of 0.5294, the true positive rate (TPR) was 0.8625 and the false positive rate (FPR) was 0.07692, indicating that the 54 MLG markers could be used to accurately classify obese individuals.
Table 3-1 54 most discriminant MLGs (species markers) associated with obesity
Table 3-2 Prediction results of 54 MLGs in 158 samples
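The incremental-subset search described in section 1.5 can be outlined independently of the classifier. In the sketch below, `error_of` is a hypothetical callable standing in for "train a random forest on this marker subset and return its false prediction ratio"; it is not part of any library, and breaking ties in favour of the smaller subset is an assumption.

```python
def select_marker_set(ranked_markers, error_of):
    """Pick the top-k prefix of importance-ranked markers with the lowest error.

    ranked_markers : markers sorted from most to least important
    error_of       : hypothetical callable, subset -> false prediction ratio
    Ties are resolved in favour of the smaller subset (an assumption).
    """
    best_k, best_err = 0, float("inf")
    for k in range(1, len(ranked_markers) + 1):
        err = error_of(ranked_markers[:k])
        if err < best_err:
            best_k, best_err = k, err
    return ranked_markers[:best_k], best_err
```

With 237 ranked MLG species, the loop would evaluate 237 nested subsets; in the text, the subset of size 54 gave the lowest false prediction ratio.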
1.6 Methods for selecting 54 best markers from biomarkers (Maximum Relevance Minimum Redundancy (mRMR) feature selection framework)
To identify an optimal gene set, a minimum redundancy - maximum relevance (mRMR) feature selection method (for detailed information, see Peng, H., Long, F. & Ding, C. Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans Pattern Anal Mach Intell 27, 1226-1238, doi: 10.1109/TPAMI.2005.159 (2005), which is incorporated herein by reference) was used to select genes from the 54 MLG markers. For each of the 54 MLG markers, the inventors selected one gene to represent the MLG: the genes in each MLG were ranked by the mRMR method, and the first-ranked gene was chosen as the representative. The inventors thereby obtained 54 gene markers, which are shown in Table 4 and Table 5. The gene ids are from the published reference gene catalogue of Li, J. et al. 2014, supra.
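For orientation, a greedy mRMR ranking on discrete features can be sketched as follows. This uses the mutual-information difference criterion (relevance minus mean redundancy) from Peng et al.; how abundances were discretized is not stated in the text, so the sketch simply assumes already-discretized values (for example presence/absence), and the function names are hypothetical. Note that the first gene picked is simply the one with the highest mutual information with the class label.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log((c * n) / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr(features, labels, k):
    """Greedy mRMR ranking; features maps name -> list of discrete values."""
    relevance = {f: mutual_information(v, labels) for f, v in features.items()}
    selected = []
    while len(selected) < min(k, len(features)):
        def score(f):
            if not selected:
                return relevance[f]           # first pick: maximum relevance
            redundancy = sum(
                mutual_information(features[f], features[s]) for s in selected
            ) / len(selected)
            return relevance[f] - redundancy
        best = max((f for f in features if f not in selected), key=score)
        selected.append(best)
    return selected
```

Applied per MLG with `k=1`, this would return the single representative gene of that MLG under the stated assumptions.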
Table 4. Enrichment information of the 54 optimal gene markers
| Gene id | Enrichment (1=obesity, 0=control) |
| 4388 | 0 |
| 34851 | 0 |
| 58832 | 0 |
| 67971 | 1 |
| 85995 | 0 |
| 95220 | 0 |
| 156675 | 0 |
| 197527 | 0 |
| 201254 | 0 |
| 223292 | 0 |
| 232979 | 0 |
| 266046 | 1 |
| 331594 | 0 |
| 447739 | 0 |
| 515920 | 1 |
| 563367 | 0 |
| 565364 | 0 |
| 622928 | 0 |
| 727087 | 0 |
| 859299 | 1 |
| 903584 | 1 |
| 963362 | 0 |
| 1015556 | 0 |
| 1571678 | 1 |
| 1583339 | 1 |
| 1801041 | 0 |
| 2150523 | 0 |
| 2273710 | 0 |
| 2285506 | 0 |
| 2291624 | 1 |
| 2390685 | 0 |
| 2397559 | 1 |
| 2414703 | 1 |
| 2687923 | 1 |
| 2940024 | 0 |
| 3111759 | 0 |
| 3179344 | 1 |
| 3239706 | 1 |
| 3381319 | 0 |
| 3449966 | 1 |
| 3984550 | 0 |
| 4202903 | 1 |
| 5243950 | 1 |
| 5459014 | 1 |
| 5486089 | 1 |
| 5659078 | 1 |
| 6616419 | 0 |
| 6692162 | 1 |
| 7136991 | 1 |
| 7209512 | 0 |
| 7775163 | 0 |
| 8202342 | 1 |
| 8846481 | 1 |
| 9454625 | 1 |
Table 5. SEQ ID of the 54 optimal gene markers
| gene id | SEQ ID NO: |
| gene_id:85995 | 1 |
| gene_id:5659078 | 2 |
| gene_id:8846481 | 3 |
| gene_id:95220 | 4 |
| gene_id:3239706 | 5 |
| gene_id:727087 | 6 |
| gene_id:5243950 | 7 |
| gene_id:1015556 | 8 |
| gene_id:4388 | 9 |
| gene_id:2397559 | 10 |
| gene_id:2414703 | 11 |
| gene_id:3381319 | 12 |
| gene_id:201254 | 13 |
| gene_id:7209512 | 14 |
| gene_id:1801041 | 15 |
| gene_id:7775163 | 16 |
| gene_id:2273710 | 17 |
| gene_id:447739 | 18 |
| gene_id:5459014 | 19 |
| gene_id:6616419 | 20 |
| gene_id:3111759 | 21 |
| gene_id:4202903 | 22 |
| gene_id:859299 | 23 |
| gene_id:3449966 | 24 |
| gene_id:963362 | 25 |
| gene_id:565364 | 26 |
| gene_id:34851 | 27 |
| gene_id:1583339 | 28 |
| gene_id:8202342 | 29 |
| gene_id:622928 | 30 |
| gene_id:515920 | 31 |
| gene_id:2390685 | 32 |
| gene_id:7136991 | 33 |
| gene_id:2291624 | 34 |
| gene_id:331594 | 35 |
| gene_id:2687923 | 36 |
| gene_id:5486089 | 37 |
| gene_id:156675 | 38 |
| gene_id:1571678 | 39 |
| gene_id:3984550 | 40 |
| gene_id:232979 | 41 |
| gene_id:266046 | 42 |
| gene_id:223292 | 43 |
| gene_id:67971 | 44 |
| gene_id:6692162 | 45 |
| gene_id:9454625 | 46 |
| gene_id:3179344 | 47 |
| gene_id:2940024 | 48 |
| gene_id:197527 | 49 |
| gene_id:58832 | 50 |
| gene_id:903584 | 51 |
| gene_id:2285506 | 52 |
| gene_id:2150523 | 53 |
| gene_id:563367 | 54 |
1.7 Gut healthy index (obesity index)
To exploit the potential ability of disease classification by gut microbiota, the inventors developed a disease classifier system based on the 54 gene markers that the inventors defined. For intuitive evaluation of the risk of disease based on these gut microbial gene markers, the inventors calculated a gut healthy index (obesity index).
To evaluate the effect of the gut metagenome on obesity, the inventors defined and calculated the gut healthy index for each individual on the basis of the 54 selected gene markers described above. For each individual sample j, the gut healthy index, denoted by Ij, was calculated by the formula below:
Aij is the relative abundance of marker i in sample j.
N is a subset of all patient-enriched markers in selected biomarkers related to the abnormal condition (namely, a subset of all obesity-enriched markers in these 54 selected gene markers),
M is a subset of all control-enriched markers in selected biomarkers related to the abnormal condition (namely, a subset of all control-enriched markers in these 54 selected gene markers),
|N| and |M| are the numbers (sizes) of biomarkers in these two sets, respectively, wherein |N| is 24 and |M| is 30,
wherein an index greater than a cutoff indicates that the subject has or is at risk of developing obesity.
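The definitions above can be turned into a small calculation sketch. The marker sets N and M are taken directly from Table 4 (enrichment 1 and 0). The shape of the index, however, is an assumption: the sketch takes the mean (optionally transformed) abundance over N minus the mean over M, and leaves any transform of Aij as a parameter, because the index values reported in Table 10 suggest that some rescaling or transform is applied and the text here does not specify it. The function name `gut_index` is hypothetical.

```python
# Gene ids from Table 4: enrichment 1 (obesity) and enrichment 0 (control).
OBESITY_ENRICHED = [
    67971, 266046, 515920, 859299, 903584, 1571678, 1583339, 2291624,
    2397559, 2414703, 2687923, 3179344, 3239706, 3449966, 4202903, 5243950,
    5459014, 5486089, 5659078, 6692162, 7136991, 8202342, 8846481, 9454625,
]
CONTROL_ENRICHED = [
    4388, 34851, 58832, 85995, 95220, 156675, 197527, 201254, 223292, 232979,
    331594, 447739, 563367, 565364, 622928, 727087, 963362, 1015556, 1801041,
    2150523, 2273710, 2285506, 2390685, 2940024, 3111759, 3381319, 3984550,
    6616419, 7209512, 7775163,
]


def gut_index(abundance, transform=lambda a: a,
              n_markers=OBESITY_ENRICHED, m_markers=CONTROL_ENRICHED):
    """Assumed index form: mean over N minus mean over M of transform(A_ij).

    abundance : dict gene id -> relative abundance A_ij in one sample j
    transform : optional function applied to each A_ij (identity by default)
    """
    n_mean = sum(transform(abundance.get(i, 0.0)) for i in n_markers) / len(n_markers)
    m_mean = sum(transform(abundance.get(i, 0.0)) for i in m_markers) / len(m_markers)
    return n_mean - m_mean


def at_risk(index, cutoff=0.5834):
    """An index greater than the cutoff indicates obesity risk."""
    return index > cutoff
```

The list lengths reproduce |N| = 24 and |M| = 30 as stated in the text.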
1.8 Gut-microbiota-based obesity classification
The inventors computed an obesity index based on the relative abundance of these 54 gene markers, which clearly separated the obesity patient microbiomes from the control microbiomes (Table 6). Classification of the 78 obesity patient microbiomes against the 80 control microbiomes using the obesity index exhibited an area under the receiver operating characteristic (ROC) curve of 0.9784 (Fig. 3). At the best index cutoff of 0.5834, the true positive rate (TPR) was 0.9103, the false positive rate (FPR) was 0.075, and the error rate was 8.86% (14/158), indicating that the 54 gene markers could be used to accurately classify obese individuals.
Table 6. 158 samples’ calculated gut healthy index (obesity patients and non-obesity controls)
Example 2. Validating the 54 gene biomarkers in 42 samples (test set)
The inventors validated the discriminatory power of the obesity classifier using another independent study group, including 17 obesity patients and 25 non-obesity controls, whose samples were also collected at Rui Jin Hospital, Shanghai Jiao Tong University School of Medicine.
For each sample, DNA was extracted and a DNA library was constructed, followed by high-throughput sequencing as described in Example 1. The inventors calculated the gene abundance profile for these samples using the same method as described in Qin et al. 2012, supra. The relative abundance of each of the gene markers set forth in SEQ ID NOs: 1-54 was then determined, and the index of each sample was calculated by the formula below:
Aij is the relative abundance of marker i in sample j.
N is a subset of all patient-enriched markers in selected biomarkers related to the abnormal condition (namely, a subset of all obesity-enriched markers in these 54 selected gene markers),
M is a subset of all control-enriched markers in selected biomarkers related to the abnormal condition (namely, a subset of all control-enriched markers in these 54 selected gene markers),
|N| and |M| are the numbers (sizes) of biomarkers in these two sets, respectively, wherein |N| is 24 and |M| is 30,
wherein an index greater than a cutoff indicates that the subject has or is at risk of developing obesity.
Table 7 shows the calculated index of each sample, and Table 8 shows the relative abundance of the relevant genes in a representative sample, DB68A. In this assessment, at the cutoff of 0.5834 (the best index cutoff in the 158 training samples above), the error rate was 26.19% (11/42), validating that the 54 gene markers can classify obese individuals. Most of the obesity patients (13/17) were correctly classified as obese. The ROC curve for the test set was also drawn from the obesity index, with AUC=0.8729 (Fig. 4). At the best cutoff of 0.7769, the true positive rate (TPR) was 0.7647 and the false positive rate (FPR) was 0.04.
Table 7. Calculated gut healthy index of the 42 samples
Table 8. Gene relative abundance of Sample DB68A
Example 3. Validating the 54 gene biomarkers in 22 samples (test set)
The inventors validated the discriminatory power of the obesity classifier using another 22 samples (Table 9), including 9 case samples and 13 control samples (5 samples taken 1 month after the operation and 8 samples taken 3 months after the operation), which were also collected at Rui Jin Hospital, Shanghai Jiao Tong University School of Medicine. Case samples are those taken before the operation; control samples are those taken 1 month and 3 months after the operation.
Table 9. Information on the 22 samples
*Before: before the operation; 1-M: one month after the operation; 3-M: three months after the operation.
For each sample, DNA was extracted and a DNA library was constructed, followed by high-throughput sequencing as described in Example 1. The inventors calculated the gene abundance profile for these samples using the same method as described in Qin et al. 2012, supra. The relative abundance of each of the gene markers set forth in SEQ ID NOs: 1-54 was then determined, and the index of each sample was calculated by the formula below:
Aij is the relative abundance of marker i in sample j.
N is a subset of all patient-enriched markers in selected biomarkers related to the abnormal condition (namely, a subset of all obesity-enriched markers in these 54 selected gene markers),
M is a subset of all control-enriched markers in selected biomarkers related to the abnormal condition (namely, a subset of all control-enriched markers in these 54 selected gene markers),
|N| and |M| are the numbers (sizes) of biomarkers in these two sets, respectively, wherein |N| is 24 and |M| is 30,
wherein an index greater than a cutoff indicates that the subject has or is at risk of developing obesity.
Table 10 shows the calculated index of each sample, and Table 11 shows the relative abundance of the relevant genes in a representative sample, DB62. In this assessment, at the cutoff of 0.5834 (the best index cutoff in the 158 training samples above), the error rate was 18.18% (4/22), validating that the 54 gene markers can classify obese individuals. Most of the obesity patients (7/9) were correctly classified as obese. The ROC curve for the test set was also drawn from the obesity index, with AUC=0.9487 (Fig. 5). At the best cutoff of 0.02538, the true positive rate (TPR) was 1 and the false positive rate (FPR) was 0.1538.
Table 10. Calculated gut healthy index of the 22 samples
| Samples (DB:obesity) | obesity index |
| DB62 | 1.191905591 |
| DB67 | 0.025381992 |
| DB68 | 1.757974404 |
| DB78 | 1.344989391 |
| DB85 | 1.796053682 |
| DB124 | 0.072164965 |
| DB125 | 1.162137206 |
| DB126 | 0.979123077 |
| DB01 | 0.686585017 |
| DB.S1.62 | 0.879906331 |
| DB.S1.68 | -0.274438487 |
| DB.S1.85 | 0.0154326 |
| DB.S1.124 | -0.750440603 |
| DB.S1.125 | -0.893868407 |
| DB.S3.62 | 0.711881869 |
| DB.S3.67 | -0.007230148 |
| DB.S3.68 | -0.029903064 |
| DB.S3.78 | -0.761996663 |
| DB.S3.124 | -0.588485485 |
| DB.S3.125 | -0.575369569 |
| DB.S3.126 | -0.398672766 |
| DB.S3.01 | -0.420476048 |
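The figures reported for this test set can be re-derived from the index values in Table 10. The sketch below assumes that the nine samples without an ".S1"/".S3" infix are the pre-operation cases and that the "DB.S1.*" and "DB.S3.*" samples are the 1-month and 3-month post-operation controls; that mapping is inferred from the sample counts (9, 5 and 8) rather than stated explicitly. The AUC is computed as the fraction of case/control pairs in which the case has the higher index.

```python
# Index values copied from Table 10; the case/control split is inferred.
CASES = {
    "DB62": 1.191905591, "DB67": 0.025381992, "DB68": 1.757974404,
    "DB78": 1.344989391, "DB85": 1.796053682, "DB124": 0.072164965,
    "DB125": 1.162137206, "DB126": 0.979123077, "DB01": 0.686585017,
}
CONTROLS = {
    "DB.S1.62": 0.879906331, "DB.S1.68": -0.274438487, "DB.S1.85": 0.0154326,
    "DB.S1.124": -0.750440603, "DB.S1.125": -0.893868407,
    "DB.S3.62": 0.711881869, "DB.S3.67": -0.007230148, "DB.S3.68": -0.029903064,
    "DB.S3.78": -0.761996663, "DB.S3.124": -0.588485485,
    "DB.S3.125": -0.575369569, "DB.S3.126": -0.398672766,
    "DB.S3.01": -0.420476048,
}


def positives(cutoff):
    """Return (true positives, false positives) for 'index > cutoff'."""
    tp = sum(v > cutoff for v in CASES.values())
    fp = sum(v > cutoff for v in CONTROLS.values())
    return tp, fp


def auc():
    """Fraction of case/control pairs ranked correctly (ties count half)."""
    wins = sum((c > k) + 0.5 * (c == k)
               for c in CASES.values() for k in CONTROLS.values())
    return wins / (len(CASES) * len(CONTROLS))


tp, fp = positives(0.5834)
errors = (len(CASES) - tp) + fp           # missed cases + misclassified controls
print(tp, fp, errors, round(auc(), 4))    # prints: 7 2 4 0.9487
```

Under these assumptions the script reproduces the 7/9 correctly classified cases, the 4/22 errors and the AUC of 0.9487 quoted above; at the cutoff of 0.02538 it gives 9 of 9 cases and 2 of 13 controls above the cutoff, matching TPR = 1 and FPR = 0.1538.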
Table 11. Gene relative abundance of Sample DB62
Thus the inventors have identified and validated a set of 54 markers, selected by a minimum redundancy - maximum relevance (mRMR) feature selection method from 54 obesity-associated gut microbes. The inventors have also built a gut healthy index to evaluate the risk of obesity based on these 54 gut microbial gene markers.
Although explanatory embodiments have been shown and described, it would be appreciated by those skilled in the art that the above embodiments cannot be construed to limit the present disclosure, and changes, alternatives, and modifications can be made in the embodiments without departing from the spirit, principles and scope of the present disclosure.
Claims (15)
- A biomarker set for predicting a disease related to microbiota in a subject, consisting of: at least a partial sequence of SEQ ID NO: 1 to 54.
- The biomarker set for predicting a disease related to microbiota in a subject of claim 1, wherein the disease is obesity or related disease.
- A kit for determining the gene marker set of claim 1, comprising primers used for PCR amplification and designed according to the DNA sequence as set forth in claim 1.
- A kit for determining the gene marker set of claim 1, comprising one or more probes designed according to at least a partial sequence of SEQ ID NO: 1 to 54.
- Use of the gene marker set of claim 1 for predicting the risk of obesity or related disorder in a subject, comprising: (1) collecting a sample j from a subject; (2) determining the relative abundance information of each of SEQ ID NO: 1 to 54 in the DNA of the sample; and (3) calculating an index of sample j denoted by Ij by a formula below: Aij is the relative abundance of marker i in sample j, wherein i refers to each of the gene markers in said gene marker set; N is a first subset of all patient-enriched markers in selected biomarkers related to the abnormal condition, M is a second subset of all control-enriched markers in selected biomarkers related to the abnormal condition, |N| and |M| are numbers of the biomarkers respectively in the first and second subsets, wherein an index greater than a cutoff indicates that the subject has or is at the risk of developing an abnormal condition.
- The use according to claim 5, wherein |N| is 24, and |M| is 30.
- The use according to claim 5, wherein the cutoff is at least 0.5834.
- Use of the gene marker set of claim 1 for preparation of a kit for predicting the risk of obesity or related disorder in a subject, comprising: (1) collecting a sample j from a subject; (2) determining the relative abundance information of each of SEQ ID NO: 1 to 54 in the DNA of the sample; and (3) calculating an index of sample j denoted by Ij by a formula below: Aij is the relative abundance of marker i in sample j, wherein i refers to each of the gene markers in said gene marker set; N is a first subset of all patient-enriched markers in selected biomarkers related to the abnormal condition, M is a second subset of all control-enriched markers in selected biomarkers related to the abnormal condition, |N| and |M| are numbers of the biomarkers respectively in the first and second subsets, wherein an index greater than a cutoff indicates that the subject has or is at the risk of developing an abnormal condition.
- The use according to claim 8, wherein |N| is 24, and |M| is 30.
- The use according to claim 9, wherein the cutoff is at least 0.5834.
- A method of diagnosing whether a subject has an abnormal condition related to microbiota or is at the risk of developing an abnormal condition related to microbiota, comprising: determining the relative abundance of the biomarkers of claim 1 in a sample from the subject, and determining whether a subject has an abnormal condition related to microbiota or is at the risk of developing an abnormal condition related to microbiota based on the relative abundance.
- The method according to claim 11, comprising: (1) collecting a sample j from a subject; (2) determining the relative abundance information of each of SEQ ID NO: 1 to 54 in the DNA of the sample; and (3) calculating an index of sample j denoted by Ij by a formula below: Aij is the relative abundance of marker i in sample j, wherein i refers to each of the gene markers in said gene marker set; N is a first subset of all patient-enriched markers in selected biomarkers related to the abnormal condition, M is a second subset of all control-enriched markers in selected biomarkers related to the abnormal condition, |N| and |M| are numbers of the biomarkers respectively in the first and second subsets, wherein an index greater than a cutoff indicates that the subject has or is at the risk of developing an abnormal condition.
- The method according to claim 12, wherein |N| is 24, and |M| is 30.
- The method according to claim 13, wherein the cutoff is at least 0.5834.
- The method according to claim 11, wherein the abnormal condition related to microbiota is obesity or related disorder.
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201480082401.4A CN106795481B (en) | 2014-09-30 | 2014-09-30 | Biomarkers for Obesity-Related Diseases |
| PCT/CN2014/088056 WO2016049927A1 (en) | 2014-09-30 | 2014-09-30 | Biomarkers for obesity related diseases |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2014/088056 WO2016049927A1 (en) | 2014-09-30 | 2014-09-30 | Biomarkers for obesity related diseases |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2016049927A1 true WO2016049927A1 (en) | 2016-04-07 |
Family
ID=55629350
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2014/088056 Ceased WO2016049927A1 (en) | 2014-09-30 | 2014-09-30 | Biomarkers for obesity related diseases |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN106795481B (en) |
| WO (1) | WO2016049927A1 (en) |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP3677272B1 (en) * | 2017-08-29 | 2022-11-16 | BGI Shenzhen | Application of alistipes shahii in preparing a composition for preventing and/or treating lipid metabolism related diseases |
| CN112888448B (en) * | 2018-12-07 | 2023-07-25 | 深圳华大生命科学研究院 | Use of megamonas simplex for preventing and/or treating metabolic diseases |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101886132A (en) * | 2009-07-15 | 2010-11-17 | 北京百迈客生物科技有限公司 | Method for screening molecular markers correlative with properties based on sequencing technique and BSA (Bulked Segregant Analysis) technique |
| CN101921748A (en) * | 2010-06-30 | 2010-12-22 | 深圳华大基因科技有限公司 | DNA Molecular Tags for High-Throughput Detection of Human Papillomaviruses |
| WO2014019271A1 (en) * | 2012-08-01 | 2014-02-06 | Bgi Shenzhen | Biomarkers for diabetes and usages thereof |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20050239706A1 (en) * | 2003-10-31 | 2005-10-27 | Washington University In St. Louis | Modulation of fiaf and the gastrointestinal microbiota as a means to control energy storage in a subject |
| NZ589863A (en) * | 2008-05-16 | 2012-11-30 | Interleukin Genetics Inc | Genetic markers for weight management and methods of use thereof |
-
2014
- 2014-09-30 WO PCT/CN2014/088056 patent/WO2016049927A1/en not_active Ceased
- 2014-09-30 CN CN201480082401.4A patent/CN106795481B/en active Active
Non-Patent Citations (3)
| Title |
|---|
| KOETH, R.A. ET AL.: "Intestinal microbiota metabolism of L-carnitine, a nutrient in red meat, promotes atherosclerosis.", NATURE MEDICINE, 7 April 2013 (2013-04-07), pages 576 - 585 * |
| QIN, JUNJIE ET AL.: "A metagenome-wide association study of gut microbiota in type 2 diabetes.", NATURE, vol. 490, 4 October 2012 (2012-10-04), pages 55 - 60, XP055111695, DOI: doi:10.1038/nature11450 * |
| WANG, ZENENG ET AL.: "Gut flora metabolism of phosphatidylcholine promotes cardiovascular disease.", NATURE, vol. 472, 7 April 2011 (2011-04-07), pages 57 - 63, XP055120871, DOI: doi:10.1038/nature09922 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN106795481B (en) | 2021-05-04 |
| CN106795481A (en) | 2017-05-31 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 14903455 Country of ref document: EP Kind code of ref document: A1 |
|
| NENP | Non-entry into the national phase |
Ref country code: DE |
|
| 122 | Ep: pct application non-entry in european phase |
Ref document number: 14903455 Country of ref document: EP Kind code of ref document: A1 |