WO2016049927A1 - Biomarkers for obesity related diseases - Google Patents

Info

Publication number
WO2016049927A1
Authority
WO
WIPO (PCT)
Prior art keywords
gene
sample
obesity
subject
markers
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2014/088056
Other languages
French (fr)
Inventor
Qiang FENG
Dongya ZHANG
Longqing TANG
Jun Wang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BGI Shenzhen Co Ltd
Original Assignee
BGI Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BGI Shenzhen Co Ltd
Priority to CN201480082401.4A (patent CN106795481B)
Priority to PCT/CN2014/088056 (patent WO2016049927A1)
Publication of WO2016049927A1
Anticipated expiration
Legal status: Ceased (current)

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/689 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/118 Prognosis of disease development
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/158 Expression markers

Definitions

  • the present invention relates to biomarkers and methods for predicting the risk of a disease related to microbes, in particular obesity or related diseases.
  • Obesity, which is prevalent in developed countries, has increased considerably worldwide (de Carvalho Pereira et al., 2013). The combined prevalence of overweight and obesity reportedly rose by 27.5% for adults and 47.1% for children worldwide between 1980 and 2013. The number of overweight individuals increased from 857 million in 1980 to 2.1 billion in 2013, of whom 671 million are affected by obesity. More than 50% of obese individuals live in ten countries; the USA has the largest number of obese individuals, followed by China (Ng et al., 2014).
  • BMI body mass index
  • Embodiments of the present disclosure seek to solve at least one of the problems existing in the prior art to at least some extent.
  • the present invention is based on the following findings by the inventors:
  • MGWAS Metagenome-Wide Association Study
  • the inventors developed a disease classifier system based on the 54 gene markers that are defined as an optimal gene set by a minimum redundancy -maximum relevance (mRMR) feature selection method. For intuitive evaluation of the risk of obesity disease based on these 54 gut microbial gene markers, the inventors calculated a healthy index.
  • the inventors' data provide insight into the characteristics of the gut metagenome related to obesity risk, a paradigm for future studies of the pathophysiological role of the gut metagenome in other relevant disorders, and the potential usefulness of a gut-microbiota-based approach for assessment of individuals at risk of such disorders.
  • the markers of the present invention are more specific and sensitive as compared with conventional markers.
  • analysis of stool promises accuracy, safety, affordability, and patient compliance. And samples of stool are transportable.
  • the present invention relates to an in vitro method, which is comfortable and noninvasive, so people will participate in a given screening program more easily.
  • the markers of the present invention may also serve as tools for therapy monitoring in cancer patients to detect the response to therapy.
  • a biomarker set for predicting a disease related to microbiota in a subject consisting of:
  • the disease is obesity or related disease.
  • a disease related to microbiota in a subject may be analyzed; for example, obesity or a related disease may be determined based on a sample from the subject, such as a fecal sample.
  • kits for determining the gene marker set described above comprising primers used for PCR amplification and designed according to the DNA sequence as set forth in at least a partial sequence of SEQ ID NO: 1 to 54.
  • kits for determining the gene marker set described above comprising one or more probes designed according to the genes as set forth in SEQ ID NO: 1 to 54.
  • the risk of obesity or a related disorder in a subject may be predicted by the following steps:
  • Aij is the relative abundance of marker i in sample j, wherein i refers to each of the gene markers in said gene marker set.
  • N is a first subset of all patient-enriched markers in selected biomarkers related to the abnormal condition
  • M is a second subset of all control-enriched markers in selected biomarkers related to the abnormal condition
  • an index greater than a cutoff indicates that the subject has or is at the risk of developing abnormal condition.
  • the cutoff is at least 0.5834.
  • a method of diagnosing whether a subject has an abnormal condition related to microbiota or is at the risk of developing an abnormal condition related to microbiota comprising:
  • the abnormal condition related to microbiota is obesity or related disorder.
  • Fig. 1 The distribution of p-values from the obesity association analysis showed a disproportionate over-representation of strongly associated markers at lower p-values.
  • Example 1 Identifying biomarkers for evaluating obesity risk
  • Fecal samples from 158 Chinese subjects, including 78 obesity patients and 80 control subjects (training set), were collected by Rui Jin Hospital, Shanghai Jiao Tong University School of Medicine, in 2012. Obesity patients were aged 18 to 30 with BMI over 25. Subjects were asked to collect fresh fecal samples at the hospital. Collected samples were put in sterile tubes and immediately stored at -80°C until further analysis.
  • DNA library construction was performed following the manufacturer's instructions (Illumina, insert size 350 bp, read length 100 bp).
  • the inventors used the same workflow as described previously to perform cluster generation, template hybridization, isothermal amplification, linearization, blocking and denaturation, and hybridization of the sequencing primers.
  • the inventors constructed one paired-end (PE) library with an insert size of 350 bp for each sample, followed by high-throughput sequencing to obtain around 30 million PE reads of length 2x100 bp. High-quality reads were obtained by filtering out low-quality reads with ambiguous 'N' bases, adapter contamination and human DNA contamination from the Illumina raw reads, and by simultaneously trimming low-quality terminal bases of reads.
  • PE paired-end
  • the inventors generated about 5.9 Gb of high-quality clean fecal microbiota sequencing data per sample (Table 1) from 158 samples (78 cases and 80 controls) on the Illumina HiSeq 2000 platform.
  • Table 1 Summary of metagenomic data. Fourth column reports results from Wilcoxon rank-sum tests.
  • the average read mapping rate is shown in Table 1. This mapping rate was close to that of the samples in Li, J. et al. 2014, supra, which indicated that it was sufficient for further study.
  • the inventors derived the gene profile (9.9Mb genes) from the mapping result using the same method as Li, J. et al. 2014, supra.
  • Taxonomic assignment of genes was performed using an in-house pipeline described in the published paper (Li, J. et al. 2014, supra).
  • PERMANOVA permutational multivariate analysis of variance
  • the inventors performed the analysis using the method implemented in the "vegan" package in R, and the permuted p-value was obtained from 10,000 permutations.
  • the inventors also corrected for multiple testing using "p.adjust" in R with the Benjamini-Hochberg method to get the q-value for each test.
  • PERMANOVA identified three significant factors associated with gut microbes (based on gene profiles) (q < 0.05, Table 2).
  • FDR false discovery rate
  • Receiver Operator Characteristic (ROC) analysis: The inventors applied ROC analysis to assess the performance of the obesity classification based on metagenomic markers. The inventors then used the “pROC” package in R to draw the ROC curve.
  • ROC Receiver Operator Characteristic
  • 237 MLG species based on the 396,100 obesity-associated marker gene profile.
  • the inventors used the 396,100 gene markers to build the metagenomic linkage groups (MLGs) using the same method described in the published T2D paper (Qin et al. 2012, supra). All 396,100 genes were annotated by aligning them to the 4,653 reference genomes in IMG v400.
  • An MLG was assigned to a genome if more than 50% of its constitutive genes were annotated to that genome; otherwise it was termed unclassified.
  • A total of 237 MLG genomes with gene number > 100 were selected (P-value < 0.01).
  • the inventors estimated the average abundance of the genes of the MLG species, after removing the 5% lowest and 5% highest abundant genes (Qin et al. 2012, supra).
  • a random forest model (R 2.14, randomForest 4.6-7 package) (Liaw, Andy & Wiener, Matthew. Classification and Regression by randomForest, R News (2002), Vol. 2/3 p. 18, incorporated herein by reference) was trained using the MLG abundance profile of the training cohort (158 samples) to select the optimal set of MLG markers. The model was tested on one or more testing sets and the prediction error was calculated.
  • randomForest 4.6-7 package in R version 2.14
  • input is a training dataset (namely relative abundance profiles of selected MLGs in training samples)
  • sample disease status: the disease status of the training samples is a vector, 1 for obesity, 0 for control
  • test set: the relative abundance profiles of selected MLGs in the test set
  • the inventors used the randomForest function from the randomForest package in R to build the classifier, and the predict function was used to predict the test set.
  • Output is the prediction results (probability of illness; the cutoff is 0.5, and if the probability of illness ≥ 0.5, the subject is at risk of obesity)
  • MLG species marker identification: To identify MLG species markers, the inventors used the randomForest 4.6-7 package in R version 2.14 based on the 237 obesity-associated MLG species. First, the inventors sorted all 237 MLG species by the importance given by the “randomForest” method (Liaw, Andy & Wiener, Matthew. Classification and Regression by randomForest, R News (2002), Vol. 2/3 p. 18, incorporated herein by reference). MLG marker sets were constructed by creating incremental subsets of the top-ranked MLG species, starting from 1 MLG species and ending at all 237 MLG species. For each MLG marker set, the inventors calculated the false prediction ratio in the 158 samples.
  • the set of 54 MLG species with the lowest false prediction ratio was selected as the MLG species markers (Table 3-1).
  • the inventors drew the ROC curve using the OOB (out of bag) prediction probability of illness from randomForest model based on the selected MLG species markers (Table 3-2) and the area under the ROC curve (AUC) was 0.9651 in the 158 samples (Fig. 2) .
  • AUC area under the ROC curve
  • TPR true positive rate
  • FPR false positive rate
  • Table 3-1 54 most discriminant MLGs (species markers) associated with obesity
  • mRMR minimum redundancy -maximum relevance
  • the inventors developed a disease classifier system based on the 54 gene markers that the inventors defined. For intuitive evaluation of the risk of disease based on these gut microbial gene markers, the inventors calculated a gut healthy index (obesity index).
  • the inventors defined and calculated the gut healthy index for each individual on the basis of the selected 54 gene markers as described above. For each individual sample, the gut healthy index of sample j, denoted by Ij, was calculated by the formula below:
  • Aij is the relative abundance of marker i in sample j.
  • N is a subset of all patient-enriched markers in selected biomarkers related to the abnormal condition (namely, a subset of all obesity-enriched markers in these 54 selected gene markers),
  • M is a subset of all control-enriched markers in selected biomarkers related to the abnormal condition (namely, a subset of all control-enriched markers in these 54 selected gene markers),
  • an index greater than a cutoff indicates that the subject has or is at the risk of developing obesity.
  • the inventors computed an obesity index based on the relative abundance of these 54 gene markers, which clearly separated the obesity patient microbiomes from the control microbiomes (Table 6).
  • Classification of the 78 obesity patient microbiomes against the 80 control microbiomes using the obesity index exhibited an area under the receiver operating characteristic (ROC) curve of 0.9784 (Fig. 3).
  • ROC receiver operating characteristic
  • Table 7 shows the calculated index of each sample and Table 8 shows the relevant gene relative abundance of a representative sample DB68A.
  • Case denotes samples taken before the operation.
  • Control denotes samples taken 1 month and 3 months after the operation.
  • Table 10 shows the calculated index of each sample and Table 11 shows the relevant gene relative abundance of a representative sample DB62.
  • the error rate was 18.18% (4/22), validating that the 54 gene markers can classify obese individuals.
  • most obesity patients (7/9) were correctly classified as obese.
  • the inventors have identified and validated a set of 54 markers by a minimum redundancy - maximum relevance (mRMR) feature selection method based on 54 obesity-associated gut microbes. The inventors have also built a gut healthy index to evaluate the risk of obesity based on these 54 gut microbial gene markers.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Organic Chemistry (AREA)
  • Analytical Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Zoology (AREA)
  • Engineering & Computer Science (AREA)
  • Wood Science & Technology (AREA)
  • Genetics & Genomics (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Immunology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Pathology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Biomarkers and methods for predicting the risk of a disease related to microbes, in particular obesity or related diseases, are provided.

Description

BIOMARKERS FOR OBESITY RELATED DISEASES
CROSS-REFERENCE TO RELATED APPLICATION
None
FIELD
The present invention relates to biomarkers and methods for predicting the risk of a disease related to microbes, in particular obesity or related diseases.
BACKGROUND
Obesity, which is prevalent in developed countries, has increased considerably worldwide (de Carvalho Pereira et al., 2013). The combined prevalence of overweight and obesity reportedly rose by 27.5% for adults and 47.1% for children worldwide between 1980 and 2013. The number of overweight individuals increased from 857 million in 1980 to 2.1 billion in 2013, of whom 671 million are affected by obesity. More than 50% of obese individuals live in ten countries; the USA has the largest number of obese individuals, followed by China (Ng et al., 2014).
There is a growing body of evidence suggesting that patients who are told by their physician that they are overweight are more likely to lose weight than those who are not. However, the low rates of physician diagnosis and advice for behavioral health risk factors related to obesity are concerning (Bleich et al., 2011).
In children, the diagnosis of obesity is based on age- and gender-specific body mass index (BMI) cut-points. This is in contrast to adults, in whom an obesity diagnosis is based on BMI regardless of age or gender. Because the pediatric diagnostic criteria are more complicated and the terminology for pediatric obesity has changed, fewer obese children are accurately diagnosed (Walsh et al., 2013). Moreover, limitations of BMI in terms of identification of the different populations should be considered (Nevill et al., 2006). Waist circumference (WC) can be considered a reliable and useful tool for epidemiological studies to assess abdominal adiposity, but this measurement seems to be harder to perform (Miguel‐Etayo et al., 2014). What is more, regional studies of the diagnosis of pediatric obesity using the International Classification of Diseases, Ninth Revision (ICD-9), the National Ambulatory Medical Care Survey (NAMCS), and the National Hospital Medical Care Survey (NHAMCS) have shown relatively low sensitivity of a clinical diagnosis (Walsh et al., 2013).
Recent insight suggests that the human gut microbiota could play an important role in obesity. An early report, based on sequencing of amplified 16S rRNA genes, indicated a much higher ratio of Firmicutes to Bacteroidetes in faecal samples from 12 obese humans than in two lean controls (Ley et al., 2006). Recent observational studies using metagenomic sequencing in human obesity have demonstrated reduced bacterial diversity, a relative depletion of Bacteroidetes, and enrichment in genes involved in carbohydrate and lipid metabolism (Allin and Pedersen, 2014). These correlative findings suggest that an altered gut microbiota may be a causal factor in the pathogenesis of obesity, which in turn indicates that characteristics of the gut microbiota might be used as criteria for diagnosing obesity.
In summary, there are considerable missed opportunities and low sensitivity in the diagnosis of obesity. A more valid (less biased) assessment of overweight and/or obesity needs to be developed.
SUMMARY
Embodiments of the present disclosure seek to solve at least one of the problems existing in the prior art to at least some extent.
The present invention is based on the following findings by the inventors:
Assessment and characterization of gut microbiota has become a major research area in human disease, including obesity. To analyze the gut microbial content in obesity patients, the inventors carried out a protocol for a Metagenome-Wide Association Study (MGWAS) (Qin, J. et al. A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature 490, 55–60 (2012), incorporated herein by reference) based on deep shotgun sequencing of the gut microbial DNA from 158 individuals. The inventors identified and validated 396,100 obesity-associated gene markers. To exploit the potential of gut microbiota for obesity classification, the inventors developed a disease classifier system based on the 54 gene markers that are defined as an optimal gene set by a minimum redundancy - maximum relevance (mRMR) feature selection method. For intuitive evaluation of the risk of obesity based on these 54 gut microbial gene markers, the inventors calculated a healthy index. The inventors' data provide insight into the characteristics of the gut metagenome related to obesity risk, a paradigm for future studies of the pathophysiological role of the gut metagenome in other relevant disorders, and the potential usefulness of a gut-microbiota-based approach for assessment of individuals at risk of such disorders.
It is believed that gene markers of intestinal microbiota are valuable for increasing obesity detection at earlier stages for the following reasons. First, the markers of the present invention are more specific and sensitive as compared with conventional markers. Second, analysis of stool promises accuracy, safety, affordability, and patient compliance, and stool samples are transportable. Thus, the present invention relates to an in vitro method, which is comfortable and noninvasive, so people will participate in a given screening program more easily. Third, the markers of the present invention may also serve as tools for therapy monitoring in cancer patients to detect the response to therapy.
In one aspect of the present disclosure, there is provided a biomarker set for predicting a disease related to microbiota in a subject, consisting of:
at least a partial sequence of SEQ ID NO: 1 to 54.
According to embodiments of the present disclosure, the disease is obesity or a related disease.
Using these biomarkers, a disease related to microbiota in a subject may be analyzed; for example, obesity or a related disease may be determined based on a sample from the subject, such as a fecal sample.
In another aspect of the present disclosure, there is provided a kit for determining the gene marker set described above, comprising primers used for PCR amplification and designed according to the DNA sequence as set forth in at least a partial sequence of SEQ ID NO: 1 to 54.
In another aspect of the present disclosure, there is provided a kit for determining the gene marker set described above, comprising one or more probes designed according to the genes as set forth in SEQ ID NO: 1 to 54.
In another aspect of the present disclosure, there is provided use of the gene marker set described above for predicting the risk of obesity or a related disorder in a subject. According to embodiments of the present disclosure, the risk of obesity or a related disorder in a subject may be predicted by the following steps:
(1) collecting a sample j from a subject;
(2) determining the relative abundance information of each of SEQ ID NO: 1 to 54 in the DNA of the sample; and
(3) calculating an index of sample j, denoted by Ij, by the formula below:
Figure PCTCN2014088056-appb-000001
Aij is the relative abundance of marker i in sample j, wherein i refers to each of the gene markers in said gene marker set,
N is a first subset of all patient-enriched markers in selected biomarkers related to the abnormal condition,
M is a second subset of all control-enriched markers in selected biomarkers related to the abnormal condition,
|N| and |M| are numbers of the biomarkers respectively in the first and second subsets,
wherein
an index greater than a cutoff indicates that the subject has or is at risk of developing the abnormal condition.
According to some embodiments of the present disclosure, |N| is 24, and |M| is 30.
According to some embodiments of the present disclosure, the cutoff is at least 0.5834.
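The three steps above can be sketched in Python as follows. The formula itself is published as an image and is not reproduced in this text, so the functional form used here is an assumption: the mean log10 relative abundance over the patient-enriched subset N minus the mean over the control-enriched subset M, with a small pseudocount so that the logarithm of zero is never taken. The names `obesity_index`, `at_risk` and `PSEUDOCOUNT`, and the pseudocount value, are hypothetical; only the cutoff 0.5834 and the subset sizes |N| = 24 and |M| = 30 come from the disclosure.

```python
import math

# Assumption: a tiny pseudocount guards against log10(0) for undetected markers.
PSEUDOCOUNT = 1e-20


def obesity_index(abundance, patient_enriched, control_enriched,
                  pseudocount=PSEUDOCOUNT):
    """Index I_j for one sample (assumed form, not the published formula).

    abundance        -- dict mapping marker id -> relative abundance A_ij
    patient_enriched -- marker ids in subset N
    control_enriched -- marker ids in subset M
    """
    if not patient_enriched or not control_enriched:
        raise ValueError("both marker subsets must be non-empty")

    def mean_log(markers):
        # Markers missing from the sample are treated as abundance 0.
        total = sum(math.log10(abundance.get(m, 0.0) + pseudocount)
                    for m in markers)
        return total / len(markers)

    return mean_log(patient_enriched) - mean_log(control_enriched)


def at_risk(index, cutoff=0.5834):
    """True when the index is strictly greater than the cutoff."""
    return index > cutoff


if __name__ == "__main__":
    # Toy sample with one marker per subset (illustrative values only).
    sample = {"g1": 1e-3, "g2": 1e-5}
    idx = obesity_index(sample, ["g1"], ["g2"])
    print(idx, at_risk(idx))
```

In the embodiment described above, `patient_enriched` would hold the 24 markers of N and `control_enriched` the 30 markers of M.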
In another aspect of the present disclosure, there is provided use of the gene marker set described above for preparation of a kit for predicting the risk of obesity or a related disorder in a subject. According to embodiments of the present disclosure, the risk of obesity or a related disorder in a subject may be predicted by the following steps:
(1) collecting a sample j from a subject;
(2) determining the relative abundance information of each of SEQ ID NO: 1 to 54 in the DNA of the sample; and
(3) calculating an index of sample j, denoted by Ij, by the formula below:
Figure PCTCN2014088056-appb-000002
Aij is the relative abundance of marker i in sample j, wherein i refers to each of the gene markers in said gene marker set,
N is a first subset of all patient-enriched markers in selected biomarkers related to the abnormal condition,
M is a second subset of all control-enriched markers in selected biomarkers related to the abnormal condition,
|N| and |M| are numbers of the biomarkers respectively in the first and second subsets,
wherein
an index greater than a cutoff indicates that the subject has or is at risk of developing the abnormal condition.
According to some embodiments of the present disclosure, |N| is 24, and |M| is 30.
According to some embodiments of the present disclosure, the cutoff is at least 0.5834.
In another aspect of the present disclosure, there is provided a method of diagnosing whether a subject has an abnormal condition related to microbiota or is at risk of developing an abnormal condition related to microbiota, comprising:
determining the relative abundance of the biomarkers described above in a sample from the subject, and
determining whether a subject has an abnormal condition related to microbiota or is at the risk of developing an abnormal condition related to microbiota based on the relative abundance.
According to embodiments of the present disclosure, the risk of obesity or a related disorder in a subject may be predicted by the following steps:
(1) collecting a sample j from a subject;
(2) determining the relative abundance information of each of SEQ ID NO: 1 to 54 in the DNA of the sample; and
(3) calculating an index of sample j, denoted by Ij, by the formula below:
Figure PCTCN2014088056-appb-000003
Aij is the relative abundance of marker i in sample j, wherein i refers to each of the gene markers in said gene marker set,
N is a first subset of all patient-enriched markers in selected biomarkers related to the abnormal condition,
M is a second subset of all control-enriched markers in selected biomarkers related to the abnormal condition,
|N| and |M| are numbers of the biomarkers respectively in the first and second subsets,
wherein
an index greater than a cutoff indicates that the subject has or is at risk of developing the abnormal condition.
According to some embodiments of the present disclosure, |N| is 24, and |M| is 30.
According to some embodiments of the present disclosure, the cutoff is at least 0.5834.
According to embodiments of the present disclosure, the abnormal condition related to microbiota is obesity or a related disorder.
BRIEF DESCRIPTION OF DRAWINGS
These and other aspects and advantages of the present disclosure will become apparent and more readily appreciated from the following descriptions taken in conjunction with the drawings, in which:
Fig. 1 The distribution of p-values from the obesity association analysis showed a disproportionate over-representation of strongly associated markers at lower p-values.
Fig. 2 The ROC curve was drawn from the probability of illness in the training set; AUC=0.9651.
Fig. 3 The ROC curve was drawn from the obesity index in the training set; AUC=0.9784.
Fig. 4 The ROC curve in the test set (42 samples) was drawn from the obesity index; AUC=0.8729.
Fig. 5 The ROC curve in the test set (22 samples) was drawn from the obesity index; AUC=0.9487.
Examples
Terms used herein have meanings as commonly understood by a person of ordinary skill in the areas relevant to the present invention. Terms such as “a” , “an” and “the” are not intended to refer to only a singular entity, but include the general class of which a specific example may be used for illustration. The terminology herein is used to describe specific embodiments of the invention, but their usage does not delimit the invention, except as outlined in the claims.
The present invention is further exemplified in the following non-limiting Examples. Unless otherwise stated, parts and percentages are by weight and degrees are Celsius. As apparent to one of ordinary skill in the art, these Examples, while indicating preferred embodiments of the invention, are given by way of illustration only, and the agents were all commercially available.
Example 1. Identifying biomarkers for evaluating obesity risk
1.1 Sample collection
Fecal samples from 158 Chinese subjects, including 78 obesity patients and 80 control subjects (training set), were collected by Rui Jin Hospital, Shanghai Jiao Tong University School of Medicine, in 2012. Obesity patients were aged 18 to 30 with a BMI over 25. Subjects were asked to collect fresh fecal samples at the hospital. Collected samples were put in sterile tubes and stored at -80℃ immediately until further analysis.
Full ethical approval was obtained, and all the patients gave written informed consent. The study was approved by the Institutional Review Board of Rui Jin Hospital, Shanghai Jiao Tong University School of Medicine.
1.2 DNA extraction
Fecal samples were thawed on ice and DNA extraction was performed using the Qiagen QIAamp DNA Stool Mini Kit (Qiagen) according to the manufacturer's instructions. Extracts were treated with DNase-free RNase to eliminate RNA contamination. DNA quantity was determined using a NanoDrop spectrophotometer, a Qubit Fluorometer (with the Quant-iT™ dsDNA BR Assay Kit) and gel electrophoresis.
1.3 DNA library construction and sequencing of fecal samples
DNA library construction was performed following the manufacturer's instructions (Illumina, insert size 350 bp, read length 100 bp). The inventors used the same workflow as described previously to perform cluster generation, template hybridization, isothermal amplification, linearization, blocking and denaturation, and hybridization of the sequencing primers. The inventors constructed one paired-end (PE) library with an insert size of 350 bp for each sample, followed by high-throughput sequencing to obtain around 30 million PE reads of length 2x100 bp. High-quality reads were obtained by filtering out low-quality reads with ambiguous 'N' bases, adapter contamination and human DNA contamination from the Illumina raw reads, and by simultaneously trimming low-quality terminal bases of reads.
In total, the inventors obtained about 5.9 Gb of fecal microbiota sequencing data (high-quality clean data) per sample (Table 1) from 158 samples (78 cases and 80 controls) on the Illumina HiSeq 2000 platform.
Table 1 Summary of metagenomic data. Fourth column reports results from Wilcoxon rank-sum tests.
Figure PCTCN2014088056-appb-000004
1.4 Metagenomic data processing and analysis
1.4.1 Reads mapping
The inventors used the updated human gut gene catalogue established in Li, J. et al. An integrated catalog of reference genes in the human gut microbiome. Nat. Biotechnol. (2014) (incorporated herein by reference) and mapped the high-quality reads to it with the alignment criterion identity >=90. The average read mapping rate is shown in Table 1. This mapping rate was close to that of the samples in Li, J. et al. 2014, supra, which indicated that it was sufficient for further study. After read mapping, the inventors derived the gene profile (about 9.9 million genes) from the mapping result using the same method as Li, J. et al. 2014, supra.
Taxonomic assignment of genes. Taxonomic assignment of the predicted genes was performed using an in-house pipeline which has been described in the published paper (Li, J. et al. 2014, supra).
1.4.2 Data profile construction
Gene profile. Based on the read mapping results, the inventors used the same method described in the published T2D paper (Qin et al. 2012, supra) to compute the relative gene abundance.
1.4.3 Analysis of factors influencing gut microbiota gene profile
The inventors used permutational multivariate analysis of variance (PERMANOVA) to assess the effect of 6 clinical parameters (age, sex, height, weight, BMI and obesity status) on the gene profile. The analysis was performed using the method implemented in the “vegan” package in R, and the permuted p-value was obtained from 10,000 permutations. The inventors also corrected for multiple testing using “p.adjust” in R with the Benjamini-Hochberg method to get the q-value for each test. PERMANOVA identified three factors significantly associated with the gut microbiota (based on gene profiles) (q < 0.05, Table 2). The analysis indicated that weight, BMI and obesity status were strongly associated factors, supporting the view that disease (obesity) status was the major determinant of the composition of the gut microbiota.
Table 2 PERMANOVA of the gene profile based on Euclidean distance. The analysis tested whether clinical parameters and obesity status have a significant impact on the gut microbiota (q-value < 0.05).
Phenotype Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
Age 1 0.317034738 0.317034738 1.004112579 0.006395454 0.4094
Sex 1 0.377329497 0.377329497 1.196542903 0.007611763 0.1727
Height 1 0.331409667 0.331409667 1.049947284 0.006685435 0.3291
Weight 1 0.969536515 0.969536515 3.111941857 0.019558192 1.00E-04
BMI 1 0.954186893 0.954186893 3.0617069 0.019248548 1.00E-04
Obese 1 0.972185352 0.972185352 3.120613959 0.019611626 2.00E-04
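The Benjamini-Hochberg correction mentioned for Table 2 can be illustrated with a minimal Python sketch. It re-implements the adjustment by hand rather than calling R's p.adjust, so the function name `benjamini_hochberg` is a hypothetical helper and not part of the inventors' pipeline; the p-values are the Pr(>F) entries of Table 2.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    n = len(pvalues)
    # Indices of the p-values sorted from largest to smallest.
    order = sorted(range(n), key=lambda k: pvalues[k], reverse=True)
    qvalues = [0.0] * n
    running_min = 1.0
    for position, k in enumerate(order):
        rank = n - position  # rank of this p-value in ascending order
        running_min = min(running_min, pvalues[k] * n / rank)
        qvalues[k] = running_min
    return qvalues


# Permuted p-values, Pr(>F), as listed in Table 2.
table2_p = {
    "Age": 0.4094,
    "Sex": 0.1727,
    "Height": 0.3291,
    "Weight": 1.00e-4,
    "BMI": 1.00e-4,
    "Obese": 2.00e-4,
}

q = dict(zip(table2_p, benjamini_hochberg(list(table2_p.values()))))
significant = [name for name, value in q.items() if value < 0.05]
print(significant)
```

Under this sketch, Weight, BMI and Obese are the only factors with q < 0.05, which agrees with the three significant factors reported in the text.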
1.4.4 Identification of obesity associated markers
Identification of obesity associated genes. To identify the association between the metagenomic profile and obesity, a two-tailed Wilcoxon rank-sum test was applied to the profiles of 9,879,897 high-occurrence genes (genes present in fewer than 10 of the 158 samples were removed). In total, 396,100 gene markers enriched in either cases or controls were obtained with p-value < 0.01, FDR=3.8% (Fig. 1).
Estimating the false discovery rate (FDR) . Instead of a sequential p-value rejection method, the inventors applied the “q-value” method proposed in a previous study to estimate the FDR (Storey, J. D. A direct approach to false discovery rates. Journal of the Royal Statistical Society 64, 479-498 (2002) , incorporated herein by reference) .
Receiver Operator Characteristic (ROC) analysis. The inventors applied the ROC analysis to assess the performance of the obesity classification based on metagenomic markers. The inventors then used the “pROC” package in R to draw the ROC curve.
1.4.5 Construction of MLG and identification of obesity associated MLG species markers
237 MLG species based on the profile of the 396,100 obesity-associated marker genes. The inventors used the 396,100 gene markers to build metagenomic linkage groups (MLGs) using the same method described in the published T2D paper (Qin et al. 2012, supra). All 396,100 genes were annotated by aligning them to the 4,653 reference genomes in IMG v400. An MLG was assigned to a genome if more than 50% of its constituent genes were annotated to that genome; otherwise it was termed unclassified. A total of 237 MLG species with gene number > 100 were selected (p-value < 0.01). To estimate the relative abundance of an MLG species, the inventors calculated the average abundance of the genes of the MLG species after removing the 5% lowest and 5% highest abundant genes (Qin et al. 2012, supra).
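The trimmed averaging used to estimate MLG abundance can be sketched as follows. This is an illustration only: the function name `mlg_abundance` is hypothetical, and rounding the number of trimmed genes down is an assumption, since the text does not say how a fractional 5% is handled.

```python
import math


def mlg_abundance(gene_abundances, trim_fraction=0.05):
    """Average gene abundance after dropping the lowest and highest 5% of genes."""
    values = sorted(gene_abundances)
    if not values:
        raise ValueError("an MLG needs at least one gene")
    # Assumption: the number of genes trimmed at each end is rounded down.
    k = math.floor(len(values) * trim_fraction)
    kept = values[k:len(values) - k]
    return sum(kept) / len(kept)


# Toy example (hypothetical data): 19 genes at abundance 1.0 and one outlier.
print(mlg_abundance([1.0] * 19 + [1000.0]))
```

Trimming both tails keeps a single extremely abundant or nearly absent gene from dominating the estimate for the whole MLG.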
1.5 MLG-based classifier
A random forest model (R 2.14, randomForest 4.6-7 package) (Liaw, Andy & Wiener, Matthew. Classification and Regression by randomForest, R News (2002), Vol. 2/3 p. 18, incorporated herein by reference) was trained using the MLG abundance profile of the training cohort (158 samples) to select the optimal set of MLG markers. The model was tested on one or more testing sets and the prediction error was calculated.
The randomForest model was built with the randomForest 4.6-7 package in R version 2.14. The input is a training dataset (namely, the relative abundance profiles of the selected MLGs in the training samples), the sample disease status (a vector for the training samples: 1 for obesity, 0 for control), and a test set (the relative abundance profiles of the selected MLGs in the test samples). The inventors used the randomForest function from the randomForest package to build the classifier, and the predict function to predict the test set. The output is the prediction result (probability of illness); the cutoff is 0.5, and if the probability of illness is ≥0.5, the subject is at risk of obesity.
54 MLG species marker identification. To identify MLG species markers, the inventors applied the randomForest 4.6-7 package in R version 2.14 to the 237 obesity-associated MLG species. First, the inventors sorted all 237 MLG species by the importance given by the “randomForest” method (Liaw, Andy & Wiener, Matthew. Classification and Regression by randomForest, R News (2002), Vol. 2/3 p. 18, incorporated herein by reference). MLG marker sets were constructed by creating incremental subsets of the top-ranked MLG species, starting from 1 MLG species and ending at all 237 MLG species. For each MLG marker set, the inventors calculated the false prediction ratio in the 158 samples. Finally, the set of 54 MLG species with the lowest false prediction ratio was selected as the MLG species markers (Table 3-1). Furthermore, the inventors drew the ROC curve using the OOB (out-of-bag) predicted probability of illness from the randomForest model based on the selected MLG species markers (Table 3-2), and the area under the ROC curve (AUC) was 0.9651 in the 158 samples (Fig. 2). At the best cutoff of 0.5294, the true positive rate (TPR) was 0.8625 and the false positive rate (FPR) was 0.07692, indicating that the 54 MLG markers could be used to accurately classify obese individuals.
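The incremental subset search described above can be outlined in Python. This is a sketch under stated assumptions: `select_top_ranked_subset` and `error_of` are hypothetical names, the error function stands in for the random-forest false prediction ratio (which is not re-implemented here), the toy error values are invented for illustration, and keeping the smallest subset on ties is an assumption.

```python
def select_top_ranked_subset(ranked_markers, error_of):
    """Evaluate the top-1, top-2, ... top-n prefixes and keep the lowest-error one."""
    best_subset, best_error = None, float("inf")
    for size in range(1, len(ranked_markers) + 1):
        subset = ranked_markers[:size]
        error = error_of(subset)
        # Strict comparison: on ties the smaller subset is kept (assumption).
        if error < best_error:
            best_subset, best_error = subset, error
    return best_subset, best_error


# Toy example: four hypothetical MLGs already sorted by importance,
# with made-up false prediction ratios per subset size.
ranked = ["mlg_a", "mlg_b", "mlg_c", "mlg_d"]
toy_errors = {1: 0.30, 2: 0.12, 3: 0.12, 4: 0.20}
subset, error = select_top_ranked_subset(ranked, lambda s: toy_errors[len(s)])
print(subset, error)
```

In the procedure described in the text, the ranked list would hold the 237 MLG species and the error function would retrain and evaluate the classifier on the 158 training samples for each prefix.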
Table 3-1 54 most discriminant MLGs (species markers) associated with obesity
Figure PCTCN2014088056-appb-000005
Figure PCTCN2014088056-appb-000006
Figure PCTCN2014088056-appb-000007
Table 3-2 Prediction results of 54 MLGs in 158 samples
Figure PCTCN2014088056-appb-000008
Figure PCTCN2014088056-appb-000009
Figure PCTCN2014088056-appb-000010
1.6 Methods for selecting 54 best markers from biomarkers (Maximum Relevance Minimum Redundancy (mRMR) feature selection framework)
To identify an optimal gene set, a minimum redundancy-maximum relevance (mRMR) feature selection method (for detailed information, see Peng, H., Long, F. & Ding, C. Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans Pattern Anal Mach Intell 27, 1226-1238, doi: 10.1109/TPAMI.2005.159 (2005), which is incorporated herein by reference) was applied to the 54 MLG markers. For each of the 54 MLGs, the inventors ranked the genes within the MLG by the mRMR method and chose the first-ranked gene to represent that MLG. The inventors thus obtained 54 gene markers, which are shown in Table 4 and Table 5. The gene ids are from the published reference gene catalogue of Li, J. et al. 2014, supra.
Table 4. Enrichment information of the 54 optimal gene markers
Gene id Enrichment (1=obesity, 0=control)
4388 0
34851 0
58832 0
67971 1
85995 0
95220 0
156675 0
197527 0
201254 0
223292 0
232979 0
266046 1
331594 0
447739 0
515920 1
563367 0
565364 0
622928 0
727087 0
859299 1
903584 1
963362 0
1015556 0
1571678 1
1583339 1
1801041 0
2150523 0
2273710 0
2285506 0
2291624 1
2390685 0
2397559 1
2414703 1
2687923 1
2940024 0
3111759 0
3179344 1
3239706 1
3381319 0
3449966 1
3984550 0
4202903 1
5243950 1
5459014 1
5486089 1
5659078 1
6616419 0
6692162 1
7136991 1
7209512 0
7775163 0
8202342 1
8846481 1
9454625 1
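As a consistency check on Table 4, the enrichment labels can be split into the obesity-enriched and control-enriched subsets and counted. The sketch below copies the labels from Table 4; the variable names are illustrative only.

```python
# Enrichment labels copied from Table 4 (1 = obesity-enriched, 0 = control-enriched).
TABLE4 = {
    4388: 0, 34851: 0, 58832: 0, 67971: 1, 85995: 0, 95220: 0,
    156675: 0, 197527: 0, 201254: 0, 223292: 0, 232979: 0, 266046: 1,
    331594: 0, 447739: 0, 515920: 1, 563367: 0, 565364: 0, 622928: 0,
    727087: 0, 859299: 1, 903584: 1, 963362: 0, 1015556: 0, 1571678: 1,
    1583339: 1, 1801041: 0, 2150523: 0, 2273710: 0, 2285506: 0, 2291624: 1,
    2390685: 0, 2397559: 1, 2414703: 1, 2687923: 1, 2940024: 0, 3111759: 0,
    3179344: 1, 3239706: 1, 3381319: 0, 3449966: 1, 3984550: 0, 4202903: 1,
    5243950: 1, 5459014: 1, 5486089: 1, 5659078: 1, 6616419: 0, 6692162: 1,
    7136991: 1, 7209512: 0, 7775163: 0, 8202342: 1, 8846481: 1, 9454625: 1,
}

obesity_enriched = [gene for gene, label in TABLE4.items() if label == 1]
control_enriched = [gene for gene, label in TABLE4.items() if label == 0]
print(len(obesity_enriched), len(control_enriched))  # 24 30
```

The counts match the subset sizes |N| = 24 and |M| = 30 stated elsewhere in the disclosure.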
Table 5. SEQ ID of the 54 optimal gene markers
gene id SEQ ID NO:
gene_id:85995 1
gene_id:5659078 2
gene_id:8846481 3
gene_id:95220 4
gene_id:3239706 5
gene_id:727087 6
gene_id:5243950 7
gene_id:1015556 8
gene_id:4388 9
gene_id:2397559 10
gene_id:2414703 11
gene_id:3381319 12
gene_id:201254 13
gene_id:7209512 14
gene_id:1801041 15
gene_id:7775163 16
gene_id:2273710 17
gene_id:447739 18
gene_id:5459014 19
gene_id:6616419 20
gene_id:3111759 21
gene_id:4202903 22
gene_id:859299 23
gene_id:3449966 24
gene_id:963362 25
gene_id:565364 26
gene_id:34851 27
gene_id:1583339 28
gene_id:8202342 29
gene_id:622928 30
gene_id:515920 31
gene_id:2390685 32
gene_id:7136991 33
gene_id:2291624 34
gene_id:331594 35
gene_id:2687923 36
gene_id:5486089 37
gene_id:156675 38
gene_id:1571678 39
gene_id:3984550 40
gene_id:232979 41
gene_id:266046 42
gene_id:223292 43
gene_id:67971 44
gene_id:6692162 45
gene_id:9454625 46
gene_id:3179344 47
gene_id:2940024 48
gene_id:197527 49
gene_id:58832 50
gene_id:903584 51
gene_id:2285506 52
gene_id:2150523 53
gene_id:563367 54
1.7 Gut healthy index (obesity index)
To exploit the potential ability of disease classification by gut microbiota, the inventors developed a disease classifier system based on the 54 gene markers that the inventors defined. For intuitive evaluation of the risk of disease based on these gut microbial gene markers, the inventors calculated a gut healthy index (obesity index).
To evaluate the effect of the gut metagenome on obesity, the inventors defined and calculated the gut healthy index for each individual on the basis of the selected 54 gene markers as described above. For each individual sample, the gut healthy index of sample j, denoted by Ij, was calculated by the formula below:
Figure PCTCN2014088056-appb-000011
Aij is the relative abundance of marker i in sample j.
N is a subset of all patient-enriched markers in selected biomarkers related to the abnormal condition (namely, a subset of all obesity-enriched markers in these 54 selected gene markers),
M is a subset of all control-enriched markers in selected biomarkers related to the abnormal condition (namely, a subset of all control-enriched markers in these 54 selected gene markers),
|N| and |M| are the numbers (sizes) of the biomarkers in these two sets, respectively, wherein |N| is 24 and |M| is 30,
wherein an index greater than a cutoff indicates that the subject has or is at the risk of developing obesity.
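The formula itself appears only as a figure and is not reproduced in this text, so the following Python sketch is an illustration under explicit assumptions, not the disclosed formula. It assumes the index is the mean log10 abundance over the obesity-enriched subset N minus the mean log10 abundance over the control-enriched subset M, with a small pseudocount to avoid log10(0). The log transform, the pseudocount value, and the names `gut_index` and `PSEUDOCOUNT` are all assumptions; only the symbols Aij, N, M, |N| and |M| come from the definitions above.

```python
import math

# Assumption: a tiny pseudocount guards against log10(0); its value is not given in the text.
PSEUDOCOUNT = 1e-20


def gut_index(abundance, obesity_enriched, control_enriched):
    """Assumed form: mean log10 abundance over N minus mean log10 abundance over M."""
    def mean_log(markers):
        total = sum(math.log10(abundance.get(m, 0.0) + PSEUDOCOUNT) for m in markers)
        return total / len(markers)  # divide by |N| or |M|
    return mean_log(obesity_enriched) - mean_log(control_enriched)


# Toy sample with hypothetical marker names and abundances.
toy_abundance = {"n1": 1e-5, "n2": 1e-6, "m1": 1e-6, "m2": 1e-7}
index = gut_index(toy_abundance, ["n1", "n2"], ["m1", "m2"])
print(round(index, 6))
```

Under this assumed form, a sample whose obesity-enriched markers are relatively more abundant than its control-enriched markers gets a larger index, which is consistent with comparing the index against a cutoff as described above.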
1.8 Gut-microbiota-based obesity classification
The inventors computed an obesity index based on the relative abundance of these 54 gene markers, which clearly separated the obesity patient microbiomes from the control microbiomes (Table 6). Classification of the 78 obesity patient microbiomes against the 80 control microbiomes using the obesity index exhibited an area under the receiver operating characteristic (ROC) curve of 0.9784 (Fig. 3). At the best index cutoff of 0.5834, the true positive rate (TPR) was 0.9103, the false positive rate (FPR) was 0.075, and the error rate was 8.86% (14/158), indicating that the 54 gene markers could be used to accurately classify obese individuals.
Table 6. 158 samples’ calculated gut healthy index (obesity patients and non-obesity controls)
Figure PCTCN2014088056-appb-000012
Figure PCTCN2014088056-appb-000013
Figure PCTCN2014088056-appb-000014
Example 2. Validating the 54 gene biomarkers in 42 samples (test set)
The inventors validated the discriminatory power of the obesity classifier using a new independent study group, including 17 obesity patients and 25 non-obesity controls, that was also collected at Rui Jin Hospital, Shanghai Jiao Tong University School of Medicine.
For each sample, DNA was extracted and a DNA library was constructed followed by high throughput sequencing as described in Example 1. The inventors calculated the gene abundance profile for these samples using the same method as described in Qin et al. 2012, supra. Then the gene relative abundance of each of the markers as set forth in SEQ ID NOs: 1-54 was determined. Then the index of each sample was calculated by the formula below:
Figure PCTCN2014088056-appb-000015
Aij is the relative abundance of marker i in sample j.
N is a subset of all patient-enriched markers in selected biomarkers related to the abnormal condition (namely, a subset of all obesity-enriched markers in these 54 selected gene markers) ,
M is a subset of all control-enriched markers in selected biomarkers related to the abnormal condition (namely, a subset of all control-enriched markers in these 54 selected gene markers) ,
|N| and |M| are the numbers (sizes) of the biomarkers in these two sets, respectively, wherein |N| is 24 and |M| is 30,
wherein an index greater than a cutoff indicates that the subject has or is at the risk of developing obesity.
Table 7 shows the calculated index of each sample and Table 8 shows the relative abundance of the relevant genes in a representative sample, DB68A. In this assessment, at the cutoff of 0.5834 (the best index cutoff in the 158 samples above), the error rate was 26.19% (11/42), validating that the 54 gene markers can classify obese individuals. Most of the obesity patients (13/17) were correctly diagnosed as obese. In addition, the ROC curve for the test set was drawn from the obesity index, with AUC=0.8729 (Fig. 4). At the best cutoff of 0.7769, the true positive rate (TPR) was 0.7647 and the false positive rate (FPR) was 0.04.
Table 7. Calculated gut healthy index of the 42 samples
Figure PCTCN2014088056-appb-000016
Figure PCTCN2014088056-appb-000017
Table 8. Gene relative abundance of Sample DB68A
Figure PCTCN2014088056-appb-000018
Figure PCTCN2014088056-appb-000019
Example 3. Validating the 54 gene biomarkers in 22 samples (test set)
The inventors validated the discriminatory power of the obesity classifier using another 22 samples (Table 9), including 9 case samples and 13 control samples (5 samples taken 1 month after the operation and 8 samples taken 3 months after the operation), that were also collected at Rui Jin Hospital, Shanghai Jiao Tong University School of Medicine. Case samples were taken before the operation; control samples were taken 1 month or 3 months after the operation.
Table 9. Information on the 22 samples
Figure PCTCN2014088056-appb-000020
Figure PCTCN2014088056-appb-000021
*Before: before the operation; 1-M: one month after the operation; 3-M: three months after the operation.
For each sample, DNA was extracted and a DNA library was constructed followed by high throughput sequencing as described in Example 1. The inventors calculated the gene abundance profile for these samples using the same method as described in Qin et al. 2012, supra. Then the gene relative abundance of each of the markers as set forth in SEQ ID NOs: 1-54 was determined. Then the index of each sample was calculated by the formula below:
Figure PCTCN2014088056-appb-000022
Aij is the relative abundance of marker i in sample j.
N is a subset of all patient-enriched markers in selected biomarkers related to the abnormal condition (namely, a subset of all obesity-enriched markers in these 54 selected gene markers) ,
M is a subset of all control-enriched markers in selected biomarkers related to the abnormal condition (namely, a subset of all control-enriched markers in these 54 selected gene markers) ,
|N| and |M| are the numbers (sizes) of the biomarkers in these two sets, respectively, wherein |N| is 24 and |M| is 30,
wherein an index greater than a cutoff indicates that the subject has or is at the risk of developing obesity.
Table 10 shows the calculated index of each sample and Table 11 shows the relative abundance of the relevant genes in a representative sample, DB62. In this assessment, at the cutoff of 0.5834 (the best index cutoff in the 158 samples above), the error rate was 18.18% (4/22), validating that the 54 gene markers can classify obese individuals. Most of the obesity patients (7/9) were correctly diagnosed as obese. In addition, the ROC curve for the test set was drawn from the obesity index, with AUC=0.9487 (Fig. 5). At the best cutoff of 0.02538, the true positive rate (TPR) was 1 and the false positive rate (FPR) was 0.1538.
Table 10. Calculated gut healthy index of the 22 samples
Samples (DB:obesity) obesity index
DB62 1.191905591
DB67 0.025381992
DB68 1.757974404
DB78 1.344989391
DB85 1.796053682
DB124 0.072164965
DB125 1.162137206
DB126 0.979123077
DB01 0.686585017
DB.S1.62 0.879906331
DB.S1.68 -0.274438487
DB.S1.85 0.0154326
DB.S1.124 -0.750440603
DB.S1.125 -0.893868407
DB.S3.62 0.711881869
DB.S3.67 -0.007230148
DB.S3.68 -0.029903064
DB.S3.78 -0.761996663
DB.S3.124 -0.588485485
DB.S3.125 -0.575369569
DB.S3.126 -0.398672766
DB.S3.01 -0.420476048
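The figures reported for this test set can be recomputed from Table 10 with a short Python sketch. Treating the DB.S1.* and DB.S3.* samples as the post-operation controls is an inference from the sample names and the stated counts (5 and 8), and the AUC is computed here as the fraction of case/control pairs in which the case has the higher index, rather than with the pROC package.

```python
CUTOFF = 0.5834  # best index cutoff from the 158 training samples

# Index values copied from Table 10; cases are the pre-operation samples.
cases = {
    "DB62": 1.191905591, "DB67": 0.025381992, "DB68": 1.757974404,
    "DB78": 1.344989391, "DB85": 1.796053682, "DB124": 0.072164965,
    "DB125": 1.162137206, "DB126": 0.979123077, "DB01": 0.686585017,
}
# Assumption: DB.S1.* and DB.S3.* are the 1-month and 3-month post-operation controls.
controls = {
    "DB.S1.62": 0.879906331, "DB.S1.68": -0.274438487, "DB.S1.85": 0.0154326,
    "DB.S1.124": -0.750440603, "DB.S1.125": -0.893868407,
    "DB.S3.62": 0.711881869, "DB.S3.67": -0.007230148, "DB.S3.68": -0.029903064,
    "DB.S3.78": -0.761996663, "DB.S3.124": -0.588485485, "DB.S3.125": -0.575369569,
    "DB.S3.126": -0.398672766, "DB.S3.01": -0.420476048,
}

# A sample is called obese when its index exceeds the cutoff.
true_pos = sum(value > CUTOFF for value in cases.values())
false_pos = sum(value > CUTOFF for value in controls.values())
false_neg = len(cases) - true_pos
error_rate = (false_neg + false_pos) / (len(cases) + len(controls))

# AUC as the share of (case, control) pairs ranked correctly; ties count one half.
pairs = [(c, k) for c in cases.values() for k in controls.values()]
auc = sum(1.0 if c > k else 0.5 if c == k else 0.0 for c, k in pairs) / len(pairs)

print(true_pos, false_pos, round(error_rate, 4), round(auc, 4))
```

With these inputs the sketch gives 7 of 9 cases above the cutoff, 4 misclassified samples out of 22 (18.18%), and an AUC of 0.9487, matching the values stated in the text.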
Table 11. Gene relative abundance of Sample DB62
Figure PCTCN2014088056-appb-000023
Figure PCTCN2014088056-appb-000024
Thus the inventors have identified and validated a set of 54 markers, selected by a minimum redundancy-maximum relevance (mRMR) feature selection method from 54 obesity-associated gut microbes. The inventors have also built a gut healthy index to evaluate the risk of obesity based on these 54 gut microbial gene markers.
Although explanatory embodiments have been shown and described, it would be appreciated by those skilled in the art that the above embodiments cannot be construed to limit the present disclosure, and changes, alternatives, and modifications can be made in the embodiments without departing from the spirit, principles and scope of the present disclosure.

Claims (15)

  1. A biomarker set for predicting a disease related to microbiota in a subject consisting of:
    at least a partial sequence of SEQ ID NO: 1 to 54.
  2.  The biomarker set for predicting a disease related to microbiota in a subject of claim 1, wherein the disease is obesity or related disease.
  3.  A kit for determining the gene marker set of claim 1, comprising primers used for PCR amplification and designed according to the DNA sequence as set forth in claim 1.
  4.  A kit for determining the gene marker set of claim 1, comprising one or more probes designed according to at least a partial sequence of SEQ ID NO: 1 to 54.
  5.  Use of the gene marker set of claim 1 for predicting the risk of obesity or related disorder in a subject, comprising:
    (1) collecting a sample j from a subject;
    (2) determining the relative abundance information of each of SEQ ID NO: 1 to 54 in the DNA of the sample; and
    (3) calculating an index of sample j denoted by Ij by a formula below:
    Figure PCTCN2014088056-appb-100001
    Aij is the relative abundance of marker i in sample j, wherein i refers to each of the gene markers in said gene marker set;
    N is a first subset of all patient-enriched markers in selected biomarkers related to the abnormal condition,
    M is a second subset of all control-enriched markers in selected biomarkers related to the abnormal condition,
    |N| and |M| are numbers of the biomarkers respectively in the first and second subsets,
    wherein
    an index greater than a cutoff indicates that the subject has or is at the risk of developing abnormal condition.
  6. The use according to claim 5, wherein |N| is 24, and |M| is 30.
  7. The use according to claim 5, wherein the cutoff is at least 0.5834.
  8. Use of the gene marker set of claim 1 for preparation of a kit for predicting the risk of obesity or related disorder in a subject, comprising:
    (1) collecting a sample j from a subject;
    (2) determining the relative abundance information of each of SEQ ID NO: 1 to 54 in the DNA of the sample; and
    (3) calculating an index of sample j denoted by Ij by a formula below:
    Figure PCTCN2014088056-appb-100002
    Aij is the relative abundance of marker i in sample j, wherein i refers to each of the gene markers in said gene marker set;
    N is a first subset of all patient-enriched markers in selected biomarkers related to the abnormal condition,
    M is a second subset of all control-enriched markers in selected biomarkers related to the abnormal condition,
    |N| and |M| are numbers of the biomarkers respectively in the first and second subsets,
    wherein
    an index greater than a cutoff indicates that the subject has or is at the risk of developing abnormal condition.
  9. The use according to claim 8, wherein |N| is 24, and |M| is 30.
  10. The use according to claim 9, wherein the cutoff is at least 0.5834.
  11. A method of diagnosing whether a subject has an abnormal condition related to microbiota or is at the risk of developing an abnormal condition related to microbiota, comprising:
    determining the relative abundance of the biomarkers of claim 1 in a sample from the subject, and
    determining whether a subject has an abnormal condition related to microbiota or is at the risk of developing an abnormal condition related to microbiota based on the relative abundance.
  12. The method according to claim 11, comprising:
    (1) collecting a sample j from a subject;
    (2) determining the relative abundance information of each of SEQ ID NO: 1 to 54 in the DNA of the sample; and
    (3) calculating an index of sample j denoted by Ij by a formula below:
    Figure PCTCN2014088056-appb-100003
    Aij is the relative abundance of marker i in sample j, wherein i refers to each of the gene markers in said gene marker set;
    N is a first subset of all patient-enriched markers in selected biomarkers related to the abnormal condition,
    M is a second subset of all control-enriched markers in selected biomarkers related to the abnormal condition,
    |N| and |M| are numbers of the biomarkers respectively in the first and second subsets,
    wherein
    an index greater than a cutoff indicates that the subject has or is at the risk of developing abnormal condition.
  13. The method according to claim 12, wherein |N| is 24, and |M| is 30.
  14. The method according to claim 13, wherein the cutoff is at least 0.5834.
  15. The method according to claim 11, wherein the abnormal condition related to microbiota is obesity or related disorder.
PCT/CN2014/088056 2014-09-30 2014-09-30 Biomarkers for obesity related diseases Ceased WO2016049927A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201480082401.4A CN106795481B (en) 2014-09-30 2014-09-30 Biomarkers for Obesity-Related Diseases
PCT/CN2014/088056 WO2016049927A1 (en) 2014-09-30 2014-09-30 Biomarkers for obesity related diseases

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2014/088056 WO2016049927A1 (en) 2014-09-30 2014-09-30 Biomarkers for obesity related diseases

Publications (1)

Publication Number Publication Date
WO2016049927A1 true WO2016049927A1 (en) 2016-04-07

Family

ID=55629350

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/088056 Ceased WO2016049927A1 (en) 2014-09-30 2014-09-30 Biomarkers for obesity related diseases

Country Status (2)

Country Link
CN (1) CN106795481B (en)
WO (1) WO2016049927A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3677272B1 (en) * 2017-08-29 2022-11-16 BGI Shenzhen Application of alistipes shahii in preparing a composition for preventing and/or treating lipid metabolism related diseases
CN112888448B (en) * 2018-12-07 2023-07-25 深圳华大生命科学研究院 Use of megamonas simplex for preventing and/or treating metabolic diseases

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101886132A (en) * 2009-07-15 2010-11-17 北京百迈客生物科技有限公司 Method for screening molecular markers correlative with properties based on sequencing technique and BSA (Bulked Segregant Analysis) technique
CN101921748A (en) * 2010-06-30 2010-12-22 深圳华大基因科技有限公司 DNA Molecular Tags for High-Throughput Detection of Human Papillomaviruses
WO2014019271A1 (en) * 2012-08-01 2014-02-06 Bgi Shenzhen Biomarkers for diabetes and usages thereof

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050239706A1 (en) * 2003-10-31 2005-10-27 Washington University In St. Louis Modulation of fiaf and the gastrointestinal microbiota as a means to control energy storage in a subject
NZ589863A (en) * 2008-05-16 2012-11-30 Interleukin Genetics Inc Genetic markers for weight management and methods of use thereof

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
KOETH, R.A. ET AL.: "Intestinal microbiota metabolism of L-carnitine, a nutrient in red meat, promotes atherosclerosis.", NATURE MEDICINE, 7 April 2013 (2013-04-07), pages 576 - 585 *
QIN, JUNJIE ET AL.: "A metagenome-wide association study of gut microbiota in type 2 diabetes.", NATURE, vol. 490, 4 October 2012 (2012-10-04), pages 55 - 60, XP055111695, DOI: doi:10.1038/nature11450 *
WANG, ZENENG ET AL.: "Gut flora metabolism of phosphatidylcholine promotes cardiovascular disease.", NATURE, vol. 472, 7 April 2011 (2011-04-07), pages 57 - 63, XP055120871, DOI: doi:10.1038/nature09922 *

Also Published As

Publication number Publication date
CN106795481B (en) 2021-05-04
CN106795481A (en) 2017-05-31

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14903455

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14903455

Country of ref document: EP

Kind code of ref document: A1