CN119069003A - A prediction system and electronic device for congenital heart disease combined with intellectual disability - Google Patents

Info

Publication number
CN119069003A
Authority
CN
China
Prior art keywords
heart disease
sample
congenital heart
model
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202411127026.2A
Other languages
Chinese (zh)
Inventor
高明磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian Women And Children Medical Center Group
Original Assignee
Dalian Women And Children Medical Center Group
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian Women And Children Medical Center Group
Priority to CN202411127026.2A
Publication of CN119069003A
Legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20 Supervised data analysis
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Theoretical Computer Science (AREA)
  • Primary Health Care (AREA)
  • Biophysics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Evolutionary Biology (AREA)
  • Pathology (AREA)
  • Genetics & Genomics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Molecular Biology (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Artificial Intelligence (AREA)
  • Bioethics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a prediction system and an electronic device for congenital heart disease combined with mental retardation, and also provides a prediction method for congenital heart disease combined with mental retardation. The method comprises: obtaining expression data of target genes of a sample to be tested, wherein the target genes are any one or more of FKBP8, FAM118A, and SPTB; and performing classification prediction based on the expression data of the target genes to obtain a classification result indicating whether the sample to be tested is a congenital heart disease combined with mental retardation sample. The method offers high accuracy, high detection sensitivity, high specificity, and low cost, has a wide application range, and supports clinicians in promptly adopting more personalized prevention and control plans for patients with congenital heart disease combined with mental retardation.

Description

Prediction system and electronic equipment for congenital heart disease combined with mental retardation
Technical Field
The invention belongs to the technical field of disease prediction method development, and particularly relates to a prediction system and electronic equipment for congenital heart disease combined with mental retardation, and more particularly relates to a prediction method, a prediction system, electronic equipment and a computer readable storage medium for congenital heart disease combined with mental retardation.
Background
Possible causes of congenital heart disease combined with mental retardation include chromosomal abnormalities (which may result from, e.g., exposure of pregnant women to radioactive substances, prolonged exposure to chemical substances, or viral infections during pregnancy), intrauterine infections (e.g., infection of pregnant women with rubella virus, coxsackie virus, or mumps virus), prolonged exposure to harmful substances during pregnancy (e.g., heavy metals, pesticides, or benzene), and the like. Congenital heart disease combined with mental retardation is a form of congenital heart disease (CHD). At present, no prediction method or product capable of effectively diagnosing or predicting congenital heart disease combined with mental retardation is known in the art.
Disclosure of Invention
In view of the above, the present invention aims to provide a prediction system and an electronic device for congenital heart disease complicated with mental retardation.
The invention also provides a prediction method for congenital heart disease combined with mental retardation, which comprises obtaining expression data of target genes of a sample to be tested, wherein the target genes are any one or more of FKBP8, FAM118A, and SPTB, and performing classification prediction based on the expression data of the target genes to obtain a classification result indicating whether the sample to be tested is a congenital heart disease combined with mental retardation sample.
The prediction method provided by the invention has the advantages of high accuracy, high detection sensitivity and specificity, low cost and the like, has a wide application range, and provides support for clinicians to timely take more personalized control schemes for patients with congenital heart diseases complicated with mental retardation.
The invention adopts the following technical scheme to realize the technical purposes:
Prediction method for congenital heart disease combined with mental retardation
The invention first provides a prediction method for congenital heart disease combined with mental retardation, which comprises the following steps:
obtaining expression data of a target gene of a sample to be tested, wherein the target gene is any one or more of FKBP8, FAM118A, and SPTB;
performing classification prediction based on the expression data of the target gene to obtain a classification result indicating whether the sample to be tested is a congenital heart disease combined with mental retardation sample;
if the expression level of either or both of the target genes FKBP8 and SPTB is lower than a threshold value and/or the expression level of FAM118A is higher than a threshold value, the classification result is that the sample to be tested is a congenital heart disease combined with mental retardation sample;
if the expression level of either or both of the target genes FKBP8 and SPTB is higher than a threshold value and/or the expression level of FAM118A is lower than a threshold value, the classification result is that the sample to be tested is not a congenital heart disease combined with mental retardation sample.
Further, the classification result is obtained based on a prediction model;
the prediction model is constructed by: obtaining target gene expression data of training set samples and the clinical characteristics corresponding to each sample, wherein the clinical characteristics indicate whether the sample comes from a patient with congenital heart disease combined with mental retardation or from a patient with congenital heart disease and normal intellectual development; extracting the target gene expression data in the training set; and inputting the target gene expression data into a machine learning model to train it, thereby obtaining the constructed prediction model.
Further, the machine learning model includes a linear regression model, a logistic regression model, a random forest model, a Lasso regression model, a neural network model, a decision tree model, a perceptron model, a support vector machine model, and/or a naive bayes model.
Further, the target gene expression data includes mRNA expression amount data of the target gene or protein expression amount data of the target gene;
the mRNA expression quantity data are mRNA expression quantity data obtained by a high-throughput sequencing technology, RT-PCR, qRT-PCR or in situ hybridization technology;
The protein expression amount data are obtained by an immunoblotting method, an immunohistochemical method, a mass spectrometry method or a co-immunoprecipitation method.
Further, the sample to be tested includes a blood sample, a serum sample, a plasma sample, a tissue sample, and/or a cell sample.
Prediction system for congenital heart disease combined with mental retardation
The invention also provides a prediction system for congenital heart disease combined with mental retardation, which comprises the following units:
a data acquisition unit, configured to obtain expression data of a target gene of a sample to be tested, wherein the target gene is any one or more of FKBP8, FAM118A, and SPTB;
an analysis and prediction unit, configured to perform classification prediction based on the expression data of the target gene to obtain a classification result indicating whether the sample to be tested is a congenital heart disease combined with mental retardation sample;
if the expression level of either or both of the target genes FKBP8 and SPTB is lower than a threshold value and/or the expression level of FAM118A is higher than a threshold value, the classification result is that the sample to be tested is a congenital heart disease combined with mental retardation sample;
if the expression level of either or both of the target genes FKBP8 and SPTB is higher than a threshold value and/or the expression level of FAM118A is lower than a threshold value, the classification result is that the sample to be tested is not a congenital heart disease combined with mental retardation sample;
the classification result is obtained based on a prediction model;
the prediction model is constructed by: obtaining target gene expression data of training set samples and the clinical characteristics corresponding to each sample, wherein the clinical characteristics indicate whether the sample comes from a patient with congenital heart disease combined with mental retardation or from a patient with congenital heart disease and normal intellectual development; extracting the target gene expression data in the training set; and inputting the target gene expression data into a machine learning model to train it, thereby obtaining the constructed prediction model;
a result output unit, configured to output the classification result.
Further, the machine learning model includes a linear regression model, a logistic regression model, a random forest model, a Lasso regression model, a neural network model, a decision tree model, a perceptron model, a support vector machine model, and/or a naive bayes model.
Electronic equipment
The invention also provides an electronic device comprising a memory for storing program instructions and a processor for invoking the program instructions, wherein the steps of the prediction method for congenital heart disease combined with mental retardation described above are performed when the program instructions are executed.
Computer readable storage medium
The invention also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the prediction method for congenital heart disease combined with mental retardation described above.
New use of reagent for detecting target gene FKBP8, FAM118A and/or SPTB expression level
The invention also provides application of the reagent for detecting the expression level of the target genes FKBP8, FAM118A and/or SPTB in the sample to be detected in preparation of products for diagnosing congenital heart disease with mental retardation.
Additional advantages, objects, and features of the application will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and drawings.
It will be appreciated by those skilled in the art that the objects and advantages that can be achieved with the present application are not limited to the above-described specific ones, and that the above and other objects that can be achieved with the present application will be more clearly understood from the following detailed description.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a prediction method for congenital heart disease combined with mental retardation according to an embodiment of the invention;
Fig. 2 is a schematic diagram of a prediction system for congenital heart disease combined with mental retardation according to an embodiment of the invention;
Fig. 3 is a schematic diagram of an electronic device for predicting congenital heart disease combined with mental retardation according to an embodiment of the invention;
FIG. 4 is an example of the FASTQ read data format;
FIG. 5 is a histogram of sequencing quality;
FIG. 6 shows the mean quality scores of the sequences;
FIG. 7 is a correlation analysis plot among mRNA samples;
FIG. 8 shows the volcano plot and heat map (top 100) of the differentially expressed genes;
FIG. 9 shows the enrichment results for the differentially expressed genes (GO enrichment on the left, KEGG enrichment on the right);
FIG. 10 shows the differential expression results of FKBP8, FAM118A, and SPTB, wherein A denotes samples from patients with congenital heart disease and normal nutrition and intellectual development, and B denotes samples from patients with congenital heart disease and delayed intellectual development;
FIG. 11 shows the diagnostic efficacy validation results of FKBP8, FAM118A, and SPTB individually for predicting congenital heart disease combined with mental retardation;
FIG. 12 shows the diagnostic efficacy validation results of combinations of any two of FKBP8, FAM118A, and SPTB for predicting congenital heart disease combined with mental retardation;
FIG. 13 shows the diagnostic efficacy validation results of the combination of all three of FKBP8, FAM118A, and SPTB for predicting congenital heart disease combined with mental retardation.
Detailed Description
In order to enable those skilled in the art to better understand the present invention, the following description will make clear and complete descriptions of the technical solutions according to the embodiments of the present invention with reference to the accompanying drawings.
In some of the flows described in the specification and claims of the present invention and in the above figures, a plurality of operations appearing in a particular order are included, but it should be clearly understood that the operations may be performed in other than the order in which they appear herein or in parallel, the sequence numbers of the operations such as S101, S102, etc. are merely used to distinguish between the various operations, and the sequence numbers themselves do not represent any order of execution. In addition, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel.
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments according to the invention without any creative effort, are within the protection scope of the invention.
Fig. 1 is a schematic flow chart of a prediction method for congenital heart disease combined with mental retardation, which is provided by the embodiment of the invention, specifically, the method comprises the following steps:
S101: obtaining expression data of a target gene of a sample to be tested, wherein the target gene is any one or more of FKBP8, FAM118A, and SPTB;
In one embodiment, the test sample is derived from a subject.
In one embodiment, the subject may be a mammal or a non-mammal. Examples of mammals include, but are not limited to, any member of the class Mammalia: humans; non-human primates such as chimpanzees and other apes and monkeys; farm animals such as cows, horses, sheep, goats, and pigs; domestic animals such as rabbits, dogs, and cats; and laboratory animals, including rodents such as rats, mice, and guinea pigs. Examples of non-mammals include, but are not limited to, birds and fish.
In one embodiment, the subject is a human.
In one embodiment, the test sample refers to a composition obtained or derived from a target subject comprising cellular entities and/or other molecular entities to be characterized and/or identified, e.g. based on physical, biochemical, chemical and/or physiological characteristics. The sample may be obtained from blood and other fluid samples of biological origin and tissue samples of the subject, such as biopsy tissue samples or tissue cultures or cells derived therefrom. The source of the tissue sample may be solid tissue, such as tissue from fresh, frozen and/or preserved organs or tissue samples, biopsy tissue or aspirates, blood or any blood component, body fluids, cells from any time of pregnancy or development of the individual, or plasma.
In one embodiment, the sample to be tested comprises a subject-derived blood sample, a tissue sample, a blood-derived cell sample, a serum sample, a plasma sample, a lymph sample, a synovial fluid sample, a cell extract sample, and/or any combination thereof.
In one embodiment, the sample to be tested is a blood sample from a subject.
In one embodiment, the expression data of the target gene includes mRNA expression level data of the target gene or protein expression level data of the target gene.
In one embodiment, the mRNA expression level data is mRNA expression level data obtained by high throughput sequencing techniques, RT-PCR, qRT-PCR, or in situ hybridization techniques. Any method or technique that can be used to detect the amount of mRNA expression corresponding to a gene can be used in the present invention.
In one embodiment, the protein expression level data is protein expression level data obtained by mass spectrometry, immunoblotting, immunohistochemistry, or co-immunoprecipitation. Any method or technique that can be used to detect the amount of protein expression corresponding to a gene can be used in the present invention.
In one embodiment, the target gene expression data is mRNA expression level data of a target gene, and the mRNA expression level data of the target gene can be obtained by a high-throughput sequencing technology, and in a specific embodiment of the present invention, the high-throughput sequencing technology is an Illumina high-throughput sequencing platform.
In one embodiment, the target gene is any one of FKBP8, FAM118A, and SPTB or a combination thereof; specifically, the target gene may be any single one of FKBP8, FAM118A, and SPTB, a combination of any two of them, or the combination of all three.
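The single-gene, two-gene, and three-gene options described above can be enumerated mechanically. The following is a minimal sketch; the function name `candidate_panels` is hypothetical and is not part of the patent.

```python
from itertools import combinations

# The three target genes named in the text.
TARGET_GENES = ("FKBP8", "FAM118A", "SPTB")


def candidate_panels(genes=TARGET_GENES):
    """Return every non-empty combination of the target genes, smallest first."""
    panels = []
    for size in range(1, len(genes) + 1):
        panels.extend(combinations(genes, size))
    return panels


if __name__ == "__main__":
    for panel in candidate_panels():
        print(" + ".join(panel))
```

Three genes give three single-gene panels, three two-gene panels, and one three-gene panel, which matches the groupings evaluated in FIG. 11 to FIG. 13.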
S102: performing classification prediction based on the expression data of the target gene to obtain a classification result indicating whether the sample to be tested is a congenital heart disease combined with mental retardation sample;
In one embodiment, the classification result is obtained based on a prediction model;
the prediction model is constructed by: obtaining target gene expression data of training set samples and the clinical characteristics corresponding to each sample, wherein the clinical characteristics indicate whether the sample comes from a patient with congenital heart disease combined with mental retardation or from a patient with congenital heart disease and normal intellectual development; extracting the target gene expression data in the training set; and inputting the target gene expression data into a machine learning model to train it, thereby obtaining the constructed prediction model.
In one embodiment, the machine learning model includes a linear regression model, a logistic regression model, a random forest model, a Lasso regression model, a neural network model, a decision tree model, a perceptron model, a support vector machine model, and/or a naive bayes model.
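As an illustration of the model-construction step, the sketch below fits a logistic regression (one of the model types listed above) by plain batch gradient descent. It is a sketch under stated assumptions, not the patent's implementation: the expression values, labels, learning rate, and epoch count are invented for illustration, and the function names are hypothetical.

```python
import math

# Feature order assumed for every row below.
GENES = ("FKBP8", "FAM118A", "SPTB")


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, bias, x):
    """Probability that a sample belongs to the positive class."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = predict_proba(weights, bias, x) - y
            for j, v in enumerate(x):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Invented, standardized expression values (FKBP8, FAM118A, SPTB).
# Label 1 = CHD combined with mental retardation, 0 = CHD control.
TRAIN_X = [
    (-1.2, 1.0, -0.9),
    (-0.8, 1.3, -1.1),
    (-1.0, 0.7, -0.6),
    (0.9, -1.1, 1.0),
    (1.1, -0.8, 0.7),
    (1.0, -1.1, 0.9),
]
TRAIN_Y = [1, 1, 1, 0, 0, 0]

weights, bias = fit_logistic(TRAIN_X, TRAIN_Y)

if __name__ == "__main__":
    for gene, w in zip(GENES, weights):
        print(f"{gene}: weight {w:+.2f}")
```

On this toy data the fitted weights are negative for FKBP8 and SPTB and positive for FAM118A, mirroring the direction of the threshold criteria stated in the text.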
In one embodiment, the target gene expression data includes mRNA expression amount data of the target gene or protein expression amount data of the target gene;
the mRNA expression quantity data are mRNA expression quantity data obtained by a high-throughput sequencing technology, RT-PCR, qRT-PCR or in situ hybridization technology;
The protein expression amount data are obtained by an immunoblotting method, an immunohistochemical method, a mass spectrometry method or a co-immunoprecipitation method.
In one embodiment, the target gene expression data is mRNA expression level data of a target gene, and the mRNA expression level data of the target gene can be obtained by a high-throughput sequencing technology, and in a specific embodiment of the present invention, the high-throughput sequencing technology is an Illumina high-throughput sequencing platform.
In one embodiment, the sample to be tested comprises a blood sample, a serum sample, a plasma sample, a tissue sample, and/or a cell sample.
In one embodiment, the method further comprises determining whether the sample to be tested is a congenital heart disease combined with mental retardation sample based on the following criteria:
if the expression level of either or both of the target genes FKBP8 and SPTB is lower than a threshold value and/or the expression level of FAM118A is higher than a threshold value, the classification result is that the sample to be tested is a congenital heart disease combined with mental retardation sample;
if the expression level of either or both of the target genes FKBP8 and SPTB is higher than a threshold value and/or the expression level of FAM118A is lower than a threshold value, the classification result is that the sample to be tested is not a congenital heart disease combined with mental retardation sample.
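The threshold criteria above can be sketched as a small rule-based classifier. The text does not give numeric thresholds and does not say how the "and/or" is resolved when the measured genes disagree, so both are assumptions here: the threshold values are hypothetical, and the `require_all` parameter lets the caller choose between requiring every measured gene to meet its criterion or only one of them.

```python
# Direction of change associated with the disease group in the text:
# FKBP8 and SPTB lower than threshold, FAM118A higher than threshold.
DIRECTION = {"FKBP8": "low", "SPTB": "low", "FAM118A": "high"}


def gene_flags(expression, thresholds):
    """Map each measured gene to True if it meets its disease-side criterion."""
    flags = {}
    for gene, value in expression.items():
        cutoff = thresholds[gene]
        if DIRECTION[gene] == "low":
            flags[gene] = value < cutoff
        else:
            flags[gene] = value > cutoff
    return flags


def classify(expression, thresholds, require_all=True):
    """Return True if the sample is called CHD combined with mental retardation.

    require_all is an assumption of this sketch: True demands that every
    measured gene meets its criterion, False accepts any single gene.
    """
    if not expression:
        raise ValueError("at least one target gene must be measured")
    flags = gene_flags(expression, thresholds)
    combine = all if require_all else any
    return combine(flags.values())


# Hypothetical thresholds, for illustration only.
THRESHOLDS = {"FKBP8": 10.0, "FAM118A": 10.0, "SPTB": 10.0}

if __name__ == "__main__":
    sample = {"FKBP8": 4.0, "FAM118A": 15.0, "SPTB": 6.0}
    print(classify(sample, THRESHOLDS))
```

Because the function only inspects the genes present in `expression`, the same rule works for one-, two-, or three-gene panels.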
In one embodiment, the performance of the constructed prediction model may be further validated: a data set containing target gene expression data from patients with congenital heart disease combined with mental retardation and from patients with congenital heart disease and normal intellectual development is taken, and the performance of the constructed prediction model is verified on that data set. Validation showed that the constructed prediction model can be effectively used to predict congenital heart disease combined with mental retardation.
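Validation of this kind is commonly summarized by sensitivity, specificity, and the area under the ROC curve (AUC). The text does not state which metrics were computed, so the following is an assumption-labeled sketch; the labels and scores are invented and the function names are hypothetical.

```python
def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary labels and predictions."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Invented validation labels and model scores, for illustration only.
    labels = [1, 1, 1, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
    calls = [int(s > 0.5) for s in scores]
    print(sensitivity_specificity(labels, calls), roc_auc(labels, scores))
```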
In one embodiment, the inventors collected blood samples from patients with congenital heart disease combined with mental retardation and performed sequencing analysis to verify the expression of the above target genes and their diagnostic efficacy for congenital heart disease combined with mental retardation. Clinical information of the patients with congenital heart disease combined with mental retardation and the patients with congenital heart disease and normal intellectual development is shown in Table 1 below.
TABLE 1 Clinical information of patients with congenital heart disease combined with mental retardation and patients with congenital heart disease and normal intellectual development
In one embodiment, a cDNA library is sequenced on an Illumina high-throughput sequencing platform based on sequencing-by-synthesis (SBS) technology, which yields large numbers of high-quality reads; the reads or bases produced by the sequencing platform are referred to as raw data. To facilitate analysis, the raw image data obtained by Illumina sequencing are converted by base calling into sequence data in FASTQ format, which is the most primitive sequencing data file. A FASTQ file records the bases of each read together with their quality scores. As shown in FIG. 4, the FASTQ format is organized by sequencing read, and each read occupies 4 lines: the first and third lines consist of a sequence identifier and the read name (ID) (the first line begins with "@" and the third line begins with "+"; the ID on the third line may be omitted, but the "+" may not), the second line is the base sequence, and the fourth line gives the sequencing quality score of the base at each corresponding position.
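The four-line record layout described above can be read with a few lines of code. This is a minimal sketch that assumes uncompressed text input with exactly four lines per read; the example records are invented and `read_fastq` is a hypothetical helper name.

```python
import io


def read_fastq(handle):
    """Yield (read_id, sequence, quality_string) from a FASTQ text stream."""
    while True:
        header = handle.readline().rstrip("\r\n")
        if not header:
            return  # end of file (or a blank line) ends the stream
        sequence = handle.readline().rstrip("\r\n")
        plus = handle.readline().rstrip("\r\n")
        quality = handle.readline().rstrip("\r\n")
        # Line 1 must start with "@" and line 3 with "+".
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        # Line 4 carries one quality character per base on line 2.
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield header[1:], sequence, quality


# Invented two-read example; the second record repeats its ID after "+".
EXAMPLE = "@read1\nGATTACA\n+\nIIIII5#\n@read2\nACGT\n+read2\n!!II\n"

if __name__ == "__main__":
    for read_id, seq, qual in read_fastq(io.StringIO(EXAMPLE)):
        print(read_id, seq, qual)
```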
On the Illumina sequencer, a run uses 2 flow cells, each flow cell has 8 lanes, each lane contains 2 columns, each column contains 60 tiles, and each tile in turn holds different clusters. The detailed information in the sequence identifiers of the generated sequencing files is explained in Table 2.
TABLE 2 sequencing File identification tag interpretation
The quality score of each base in a read is encoded as a character: the ASCII value of the character minus 33 is the corresponding sequencing quality value. Typically, base quality ranges from 0 to 40, so the corresponding ASCII characters range from "!" (0+33) to "I" (40+33). If E denotes the sequencing error rate and Q denotes the base quality value of Illumina HiSeq/MiSeq, then the two are related by Equation 1: $Q = -10\log_{10}(E)$ (see Table 3). The error rate of sequencing reads increases as sequencing proceeds, which is caused by the consumption of chemical reagents during sequencing and is a common feature of Illumina high-throughput sequencing platforms.
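The character encoding and Equation 1 can be checked with a short calculation. The sketch below decodes quality characters with the offset of 33 stated in the text and converts between quality value Q and error rate E; the function names are hypothetical.

```python
import math

PHRED_OFFSET = 33  # ASCII offset stated in the text


def decode_quality(quality_string):
    """Convert a FASTQ quality string into a list of integer quality values."""
    return [ord(ch) - PHRED_OFFSET for ch in quality_string]


def error_probability(q):
    """Invert Equation 1: E = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def quality_from_error(e):
    """Equation 1: Q = -10 * log10(E)."""
    return -10 * math.log10(e)


if __name__ == "__main__":
    for q in decode_quality("!5I"):
        print(q, error_probability(q))
```

For example, a quality value of 20 corresponds to an error rate of 1 in 100, and a quality value of 30 to 1 in 1000.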
TABLE 3 simple correspondence between sequencing error Rate and sequencing quality value
In one embodiment, the present application performs data volume statistics on the raw sequencing data; the results are shown in Table 4.
In one embodiment, the present application performs quality control on the sequencing data. The raw sequencing data may contain sequencing adapter sequences, low-quality reads, and other types of redundant sequences, which can severely affect the quality of subsequent assembly. To ensure the accuracy of subsequent bioinformatic analysis, the raw sequencing data are first filtered to obtain high-quality sequencing data (clean data). Cutadapt can find and remove adapters, primers, poly-A tails, and other types of redundant sequences in reads. The sequencing data were cleaned using Cutadapt (version 3.5, default parameters).
In one embodiment, the application compiles statistics on the data after quality control: MultiQC is used to integrate the quality control results, and data volume statistics are computed for the sequences after quality control.
In one embodiment, data quality is evaluated after quality control. Illumina sequencing is a second-generation sequencing technology in which a single run can generate billions of reads, so the quality of each read cannot be displayed individually. Instead, statistical methods are used to evaluate the quality of all sequencing reads, so that the sequencing quality and library construction quality of a sample are reflected at a macroscopic level. Sequencing quality was assessed for the sequencing data of each sample. The mean quality value at each base position in the reads is shown in FIG. 5, and the mean sequence quality scores against the number of reads are shown in FIG. 6.
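The two summaries plotted in FIG. 5 and FIG. 6 (mean quality at each base position, and mean quality of each read) can be sketched as follows. This is an illustrative calculation only, not the tool used in the patent; the quality strings are invented and the function names are hypothetical.

```python
def mean_quality_per_position(quality_strings, offset=33):
    """Average the quality value at each base position across all reads."""
    totals, counts = [], []
    for qual in quality_strings:
        for i, ch in enumerate(qual):
            if i == len(totals):  # first read long enough to reach position i
                totals.append(0)
                counts.append(0)
            totals[i] += ord(ch) - offset
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


def mean_quality_per_read(quality_strings, offset=33):
    """Average quality value of each non-empty read."""
    return [
        sum(ord(ch) - offset for ch in qual) / len(qual)
        for qual in quality_strings
        if qual
    ]


if __name__ == "__main__":
    reads = ["II5", "I5"]  # invented quality strings: "I" = 40, "5" = 20
    print(mean_quality_per_position(reads))
    print(mean_quality_per_read(reads))
```

Reads of different lengths are handled by counting, at each position, only the reads that extend that far.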
In one embodiment, the application performs gene quantification: HISAT2 (version 2.1.0) is used to align reads to the genome, samtools (version 1.9) to sort and index the BAM files, QualiMap (version 2.2.2) for quality control of the BAM files, and featureCounts (version 1.6.4) for gene quantification. The reference genome is GRCh38.primary_assembly.genome.fa.
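The order of the tools named above can be sketched as a list of command lines. The text gives only tool names and versions, so everything else here is an assumption: single-end reads, the file names, the index prefix, the annotation file, and the thread count are all hypothetical, and the function only builds the commands without running them.

```python
import shlex


def quantification_commands(sample, fastq, index_prefix, annotation_gtf, threads=4):
    """Build the alignment-to-counts command lines for one sample.

    Assumes single-end reads (hisat2 -U); all paths are caller-supplied
    placeholders, not values from the patent.
    """
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    return [
        # Align reads to the reference genome index.
        ["hisat2", "-p", str(threads), "-x", index_prefix, "-U", fastq, "-S", sam],
        # Sort alignments by coordinate and write BAM, then index it.
        ["samtools", "sort", "-@", str(threads), "-o", bam, sam],
        ["samtools", "index", bam],
        # Quality control of the BAM file.
        ["qualimap", "bamqc", "-bam", bam, "-outdir", f"{sample}_qualimap"],
        # Count reads per gene against the annotation.
        ["featureCounts", "-T", str(threads), "-a", annotation_gtf,
         "-o", f"{sample}.counts.txt", bam],
    ]


if __name__ == "__main__":
    for cmd in quantification_commands(
        "sample1", "sample1.clean.fastq.gz", "grch38_index", "annotation.gtf"
    ):
        print(shlex.join(cmd))
```

Each inner list can be passed to `subprocess.run(cmd, check=True)` once the tools and input files are available.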
TABLE 4 sequencing data statistics table
Correlation of gene expression levels between samples is an important indicator for testing the reliability of the experiment and whether sample selection is reasonable. Biological replicates are necessary for biological experiments and serve two main purposes: one is to show that the experimental procedure is reproducible rather than a chance result, and the other is to ensure more reliable results in subsequent differential gene analysis. The closer the correlation coefficient is to 1, the higher the similarity of expression patterns between samples. If the samples include biological replicates, the correlation coefficient between replicates is generally required to be high. The ENCODE project suggests that the square of the Pearson correlation coefficient (R²) be greater than 0.92 (under ideal sampling and experimental conditions). In particular embodiments, R² between biological replicates should be at least greater than 0.8. Correlation coefficients within and between groups are calculated from the count values of all genes/transcripts of the samples and plotted as a heat map, which visually displays the differences between groups and the reproducibility of samples within groups. The sample correlation heat map for this embodiment is shown in FIG. 7.
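The pairwise R² values behind such a heat map can be computed as in the following minimal sketch. The count vectors are invented, the sample names are placeholders, and whether counts are transformed before correlation is not stated in the text, so raw values are used here as an assumption.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def r_squared_matrix(counts_by_sample):
    """Return {(sample_a, sample_b): R^2} for every ordered pair of samples."""
    names = list(counts_by_sample)
    return {
        (a, b): pearson_r(counts_by_sample[a], counts_by_sample[b]) ** 2
        for a in names
        for b in names
    }


if __name__ == "__main__":
    # Invented per-gene counts for three placeholder samples.
    counts = {
        "A1": [10, 200, 35, 0, 80],
        "A2": [12, 190, 40, 1, 75],
        "B1": [50, 20, 300, 5, 10],
    }
    for pair, r2 in r_squared_matrix(counts).items():
        print(pair, round(r2, 3))
```

Pairs of replicates whose R² falls below the 0.8 cutoff mentioned above would be flagged for review.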
In one embodiment, the application performs differential expression analysis. Determining whether the expression level of a given gene differs between samples is one of the core tasks of transcriptome analysis. Once gene expression levels have been obtained, the expression data can be analyzed statistically to screen for genes that differ significantly between samples. Differential analysis is divided into two main steps: (1) the raw read counts are first normalized, mainly to correct for sequencing depth; (2) the hypothesis-test probability (P-value) is calculated from a statistical model. Differential expression analysis was performed using DESeq2, with screening parameters pvalue < 0.05 and |log2FoldChange| > 1. Screening by this standard yielded 362 genes differentially expressed between patients with congenital heart disease combined with mental retardation and patients with congenital heart disease combined with normal mental development, of which 145 genes were up-regulated and 217 genes were down-regulated. The volcano plot and heat map are shown in FIG. 8.
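The screening step can be sketched as a simple filter over a result table. The thresholds are the ones stated above; the record type, the gene names and the values are hypothetical placeholders, and the statistical testing itself (done with DESeq2 in the application) is not reproduced here.

```python
from typing import Dict, Iterable, List, NamedTuple


class DEResult(NamedTuple):
    gene: str
    log2_fold_change: float
    pvalue: float


def screen(results: Iterable[DEResult],
           p_cutoff: float = 0.05,
           lfc_cutoff: float = 1.0) -> Dict[str, List[str]]:
    """Keep genes with pvalue < p_cutoff and |log2FC| > lfc_cutoff, split by direction."""
    up: List[str] = []
    down: List[str] = []
    for r in results:
        if r.pvalue < p_cutoff and abs(r.log2_fold_change) > lfc_cutoff:
            (up if r.log2_fold_change > 0 else down).append(r.gene)
    return {"up": up, "down": down}


# Hypothetical rows standing in for a differential expression result table.
table = [
    DEResult("GENE_A", 2.3, 0.001),   # passes, up-regulated
    DEResult("GENE_B", -1.7, 0.02),   # passes, down-regulated
    DEResult("GENE_C", 0.4, 0.001),   # fold change too small
    DEResult("GENE_D", 3.0, 0.2),     # not significant
]
selected = screen(table)
print(selected)
```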
In one embodiment, the application further performs functional enrichment analysis on the genes differentially expressed between patients with congenital heart disease combined with mental retardation and patients with congenital heart disease combined with normal mental development, using the DAVID database (https://david.ncifcrf.gov/tools.jsp) to perform GO functional enrichment and KEGG functional enrichment analysis on the differentially expressed genes. The screening criterion was pvalue < 0.05, and the enrichment results are shown in FIG. 9 and Table 5.
TABLE 5 KEGG enrichment results
In one embodiment, the present application screens out differentially expressed genes, including FKBP8, FAM118A and SPTB, between patients with congenital heart disease combined with mental retardation and patients with congenital heart disease combined with normal mental development, and finds that FKBP8, FAM118A and SPTB are significantly differentially expressed between the two groups (see FIG. 10).
In one embodiment, the application further verifies the diagnostic efficacy of the screened differentially expressed genes FKBP8, FAM118A and SPTB for predicting congenital heart disease combined with mental retardation. Receiver operating characteristic (ROC) analysis is performed using the R package "pROC" (version 1.15.0), and the area under the ROC curve (AUC), which ranges from 0 to 1, is calculated to evaluate the accuracy of each single differentially expressed gene, each combination of two differentially expressed genes, and the combination of all three differentially expressed genes in predicting congenital heart disease combined with mental retardation. When evaluating a single differentially expressed gene, the expression level of that gene is analyzed directly, and the level corresponding to the point with the maximum Youden index is selected as the cutoff value. When evaluating combinations of two or three differentially expressed genes, logistic regression analysis is performed on the expression levels of the genes, the probability that each patient has congenital heart disease combined with mental retardation is calculated from the fitted regression curve, different probability thresholds are determined, and the accuracy, specificity, sensitivity and other metrics of each combination are calculated according to the determined probability thresholds.
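The AUC and the Youden-index cutoff mentioned above can be illustrated in a few lines. This is a simplified sketch, not a reimplementation of pROC: it assumes that a higher score indicates a case (a marker that is lower in cases would be negated first), it uses observed scores as candidate cutoffs rather than midpoints between them, and the scores and labels are hypothetical.

```python
from typing import List, Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) maximising J = sens + spec - 1.

    A sample is called positive when score >= cutoff; the first maximum wins.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best: Tuple[float, float, float, float] = (-2.0, 0.0, 0.0, 0.0)
    for c in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= c)
        tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < c)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1.0
        if j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical scores (e.g. predicted probabilities) with labels: 1 = case, 0 = control.
scores: List[float] = [0.1, 0.4, 0.35, 0.8]
labels: List[int] = [0, 0, 1, 1]
print(auc(scores, labels), youden_cutoff(scores, labels))
```

For gene combinations, the same two functions would be applied to the probabilities produced by the fitted logistic regression rather than to raw expression values.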
The verification results are shown in FIGS. 11-13. The results show that each of the single genes FKBP8, FAM118A and SPTB has high diagnostic efficacy for predicting congenital heart disease combined with mental retardation: for FKBP8, the AUC is 0.786, the sensitivity is 80%, the specificity is 71.4%, and the cutoff value is 22311.000; for FAM118A, the AUC is 0.829, the sensitivity is 80%, the specificity is 92.9%, and the cutoff value is 715.500; for SPTB, the AUC is 0.800, the sensitivity is 60%, the specificity is 92.9%, and the cutoff value is 15.000. Combinations of any two of FKBP8, FAM118A and SPTB also have high diagnostic efficacy: for FKBP8+FAM118A, the AUC is 0.957, the sensitivity is 80%, the specificity is 100%, and the cutoff value is 0.682; for FKBP8+SPTB, the AUC is 0.857, the sensitivity is 80%, the specificity is 92.9%, and the cutoff value is 0.441; for FAM118A+SPTB, the AUC is 0.986, the sensitivity is 100%, the specificity is 92.9%, and the cutoff value is 0.199. The combination of all three genes also has high diagnostic efficacy: for FKBP8+FAM118A+SPTB, the AUC is 1.000, the sensitivity is 100%, the specificity is 100%, and the cutoff value is 0.500.
These verification results show that FKBP8, FAM118A and/or SPTB have high diagnostic efficacy for predicting congenital heart disease combined with mental retardation. They thereby demonstrate the effectiveness (high diagnostic efficacy, accuracy, sensitivity and specificity) of the prediction method provided by the application, which can be used to predict congenital heart disease combined with mental retardation effectively.
S103, if the expression level of any one or more of the target genes FKBP8 and SPTB is lower than a threshold value and/or the expression level of FAM118A is higher than the threshold value, obtaining a classification result that the sample to be detected is a congenital heart disease combined mental retardation sample;
If the expression level of any one or more of the target genes FKBP8 and SPTB is higher than a threshold value and/or the expression level of FAM118A is lower than the threshold value, a classification result that the sample to be detected is a sample with non-congenital heart disease combined with mental retardation is obtained.
In one embodiment, the classification result is obtained based on a prediction model, and the method for constructing the prediction model is as described above.
In one embodiment, as described above, the application collects blood samples from patients with congenital heart disease combined with mental retardation and from patients with congenital heart disease combined with normal mental development, and performs sequencing analysis to obtain the target gene expression data corresponding to these two clinical characteristics. Clinical information for the two patient groups is shown in Table 1. The target gene expression data are extracted and input into a machine learning model to construct a prediction model, thereby obtaining a constructed prediction model.
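As a concrete illustration of constructing a prediction model from expression data and clinical labels, the following minimal sketch fits a logistic regression, one of the model types the application lists. The fitting procedure (plain batch gradient descent), the learning rate, the number of epochs, and the standardized two-gene training data are all assumptions made for illustration; the application does not specify them.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X: Sequence[Sequence[float]], y: Sequence[int],
                 lr: float = 0.1, epochs: int = 2000) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Probability that a sample belongs to the case group."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical standardized expression of two genes; label 1 = case, 0 = control.
X_train = [[-1.0, 1.0], [-0.8, 0.6], [1.0, -1.0], [0.7, -0.9]]
y_train = [1, 1, 0, 0]
weights, bias = fit_logistic(X_train, y_train)
print([round(predict_proba(weights, bias, x), 3) for x in X_train])
```

The predicted probabilities are the quantities to which a probability threshold would then be applied.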
In one embodiment, the machine learning model comprises a machine learning model as previously described.
In one embodiment, the threshold is obtained within the prediction model, i.e. it is part of the prediction model: once the prediction model has been constructed by the method described above, the threshold is determined as well. Based on the determined threshold, the prediction model can predict whether the sample to be tested is a congenital heart disease combined mental retardation sample. The specific decision rule of the prediction model is as follows: if the expression level of any one or more of the target genes FKBP8 and SPTB is lower than a threshold value and/or the expression level of FAM118A is higher than the threshold value, a classification result that the sample to be detected is a congenital heart disease combined mental retardation sample is obtained; if the expression level of any one or more of the target genes FKBP8 and SPTB is higher than the threshold value and/or the expression level of FAM118A is lower than the threshold value, a classification result that the sample to be detected is a non-congenital heart disease combined mental retardation sample is obtained.
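The decision rule above can be sketched as a per-gene direction check. The cutoffs used here are the single-gene cutoff values reported in the verification results; how the "and/or" is resolved when several genes are measured is not specified in the text, so the `require_all` switch is an assumption introduced for illustration, as are the function names.

```python
from typing import Dict

# Single-gene cutoffs reported in the verification results.
CUTOFFS: Dict[str, float] = {"FKBP8": 22311.0, "FAM118A": 715.5, "SPTB": 15.0}
# Genes whose expression BELOW the cutoff points to a case; FAM118A is the opposite.
LOW_IN_CASES = {"FKBP8", "SPTB"}


def gene_flags(expression: Dict[str, float]) -> Dict[str, bool]:
    """Per-gene flag: True when the gene's level points to a case."""
    flags: Dict[str, bool] = {}
    for gene, value in expression.items():
        cutoff = CUTOFFS[gene]
        flags[gene] = value < cutoff if gene in LOW_IN_CASES else value > cutoff
    return flags


def classify(expression: Dict[str, float], require_all: bool = False) -> bool:
    """True = congenital heart disease combined with mental retardation sample.

    require_all is a hypothetical switch: any flagged gene suffices by default,
    or every measured gene must be flagged when it is set.
    """
    flags = gene_flags(expression)
    if not flags:
        raise ValueError("no target gene measured")
    return all(flags.values()) if require_all else any(flags.values())


# Hypothetical samples.
print(classify({"FKBP8": 20000.0}))
print(classify({"FKBP8": 30000.0, "FAM118A": 500.0, "SPTB": 20.0}))
```

With multi-gene models, the thresholding would instead be applied to the model's predicted probability rather than to each raw expression value.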
Fig. 2 is a schematic diagram of a prediction system for congenital heart disease combined with mental retardation according to an embodiment of the invention, specifically, the prediction system includes:
A data acquisition unit, used for obtaining expression data of a target gene of a sample to be detected, wherein the target gene is any one or more of FKBP8, FAM118A and SPTB;
The analysis and prediction unit is used for carrying out classification and prediction based on the expression data of the target gene to obtain a classification result of whether the sample to be detected is a congenital heart disease combined mental retardation sample;
If the expression level of any one or more of the target genes FKBP8 and SPTB is lower than a threshold value and/or the expression level of FAM118A is higher than a threshold value, obtaining a classification result that the sample to be detected is a congenital heart disease combined mental retardation sample;
If the expression level of any one or more of the target genes FKBP8 and SPTB is higher than a threshold value and/or the expression level of FAM118A is lower than the threshold value, obtaining a classification result that the sample to be detected is a sample with non-congenital heart disease combined with mental retardation;
the classification result is obtained based on a prediction model;
The method for constructing the prediction model comprises: obtaining target gene expression data of training set samples and the clinical characteristics corresponding to the samples, wherein the clinical characteristics comprise patients with congenital heart disease combined with mental retardation and patients with congenital heart disease combined with normal mental development; extracting the target gene expression data in the training set and inputting it into a machine learning model to construct a prediction model, thereby obtaining a constructed prediction model;
An output result unit, used for outputting the classification result.
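The three-unit structure of the system can be sketched as a small pipeline object. The class name, the callable signatures and the stub implementations are all hypothetical; the FKBP8 cutoff in the stub predictor is the single-gene value reported in the verification results, used here only to make the example concrete.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Expression = Dict[str, float]


@dataclass
class PredictionSystem:
    acquire: Callable[[str], Expression]   # data acquisition unit
    predict: Callable[[Expression], bool]  # analysis and prediction unit
    report: Callable[[str, bool], str]     # output result unit

    def run(self, sample_id: str) -> str:
        """Acquire expression data, classify the sample, and format the result."""
        expression = self.acquire(sample_id)
        return self.report(sample_id, self.predict(expression))


# Hypothetical stubs standing in for real data access and a trained model.
def acquire_stub(sample_id: str) -> Expression:
    return {"FKBP8": 20000.0}


def predict_stub(expression: Expression) -> bool:
    return expression["FKBP8"] < 22311.0  # lower FKBP8 points to a case


def report_stub(sample_id: str, is_case: bool) -> str:
    return f"{sample_id}: {'positive' if is_case else 'negative'}"


system = PredictionSystem(acquire_stub, predict_stub, report_stub)
print(system.run("S1"))
```

Keeping the units as separate callables mirrors the text's point that the division into units is logical and that units may be combined or distributed in an actual implementation.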
In one embodiment, the target gene expression data includes mRNA expression amount data of the target gene or protein expression amount data of the target gene;
the mRNA expression quantity data are mRNA expression quantity data obtained by a high-throughput sequencing technology, RT-PCR, qRT-PCR or in situ hybridization technology;
The protein expression amount data are obtained by an immunoblotting method, an immunohistochemical method, a mass spectrometry method or a co-immunoprecipitation method.
In one embodiment, the machine learning model includes a linear regression model, a logistic regression model, a random forest model, a Lasso regression model, a neural network model, a decision tree model, a perceptron model, a support vector machine model, and/or a naive bayes model.
In one embodiment, the prediction model predicts, based on a determined threshold, whether the sample to be tested is a congenital heart disease combined mental retardation sample.
Fig. 3 is a schematic diagram of an electronic device for predicting congenital heart disease combined mental retardation according to an embodiment of the invention, specifically, the electronic device includes a memory and a processor, the memory is used for storing program instructions, and the processor is used for calling the program instructions, when the program instructions are executed, for executing the steps of the method for predicting congenital heart disease combined mental retardation as described above.
The embodiment of the invention also provides a computer readable storage medium on which a computer program is stored; when the computer program is executed by a processor, it implements the steps of the prediction method for congenital heart disease combined with mental retardation described above.
The embodiment of the invention also provides application of the reagent for detecting the expression level of the target genes FKBP8, FAM118A and/or SPTB in the sample to be detected in preparing a product for diagnosing congenital heart disease with mental retardation.
In one embodiment, the product comprises a detection kit, a detection chip or a detection reagent strip.
In one embodiment, the detection kit, detection chip or detection reagent strip comprises a reagent for detecting the expression level of the target genes FKBP8, FAM118A and/or SPTB in a test sample derived from a subject.
In one embodiment, the reagent comprises a reagent for detecting the mRNA expression level of FKBP8, FAM118A and/or SPTB in the test sample, a reagent for detecting the protein expression level of FKBP8, FAM118A and/or SPTB in the test sample, and/or a reagent for detecting the number of FKBP8, FAM118A and/or SPTB positive expressing cells in the test sample.
In one embodiment, the reagent for detecting the mRNA expression level of FKBP8, FAM118A and/or SPTB in the test sample comprises a primer for specifically amplifying FKBP8, FAM118A and/or SPTB, and/or a probe for specifically recognizing FKBP8, FAM118A and/or SPTB.
In one embodiment, the reagent for detecting the protein expression level of FKBP8, FAM118A and/or SPTB in the test sample comprises an antibody that specifically binds to a protein encoded by FKBP8, FAM118A and/or SPTB, a peptide that specifically binds to a protein encoded by FKBP8, FAM118A and/or SPTB, an aptamer that specifically binds to a protein encoded by FKBP8, FAM118A and/or SPTB, a small molecule compound that specifically binds to a protein encoded by FKBP8, FAM118A and/or SPTB, and/or an affinity protein that specifically binds to a protein encoded by FKBP8, FAM118A and/or SPTB.
In one embodiment, the reagent for detecting the number of FKBP8, FAM118A and/or SPTB-positive expressing cells in the test sample comprises a reagent for detecting the number of FKBP8, FAM118A and/or SPTB-positive expressing cells by an immunohistochemical assay.
In one embodiment, the term primer is used interchangeably with amplification primer and means a nucleic acid fragment of 5-100 nucleotides capable of initiating an enzymatic reaction (e.g., an enzymatic amplification reaction); preferably, the primer or amplification primer comprises 15-30 nucleotides. In a specific embodiment of the invention, the primer is a primer that specifically amplifies the gene FKBP8, FAM118A and/or SPTB.
In one embodiment, the probe refers to a molecule that binds to a specific sequence or subsequence or other portion of another molecule. In a specific embodiment of the present invention, the probe refers to a probe that specifically recognizes FKBP8, FAM118A and/or SPTB. Unless otherwise indicated, a probe generally refers to a polynucleotide probe that is capable of binding to another polynucleotide (often referred to as a target polynucleotide) by complementary base pairing. Depending on the stringency of the hybridization conditions, the probe is able to bind to a target polynucleotide that lacks complete sequence complementarity with the probe. Hybridization means include, but are not limited to, solution phase, solid phase, mixed phase or in situ hybridization assays. Exemplary probes in the present invention include gene-specific DNA oligonucleotide probes, such as microarray probes immobilized on a microarray substrate, quantitative nuclease protection test probes, probes attached to molecular barcodes, and probes immobilized on beads.
In one embodiment, the detection kit comprises an RT-PCR detection kit, an ELISA detection kit, a protein chip detection kit, a rapid detection kit, a DNA chip detection kit, an immunohistochemical detection kit, or an MRM (multiple reaction monitoring) detection kit.
In one embodiment, the detection kit may further comprise elements necessary for the reverse transcription polymerase chain reaction. The RT-PCR detection kit comprises a pair of primers specific for the gene encoding the marker protein. Each primer is a nucleotide having a nucleic acid sequence specific for the gene and may be about 7 to 50bp in length, more particularly about 10-39bp. In addition, the kit may further comprise primers specific for the nucleic acid sequence of the control gene, preferably the RT-PCR detection kit may further comprise a test tube or a suitable vessel, reaction buffers (different pH values and magnesium concentrations), deoxynucleotides (dNTPs), enzymes (e.g., taq polymerase and reverse transcriptase), deoxyribonuclease inhibitors, ribonuclease inhibitors, DEPC-water, and sterile water.
In one embodiment, the detection kit may contain the elements necessary for the manipulation of the DNA chip. The DNA chip kit may comprise a substrate to which a gene or cDNA or an oligonucleotide corresponding to a fragment thereof is bound, and reagents, agents and enzymes for constructing a fluorescently labeled probe. In addition, the substrate may comprise a control gene or cDNA or an oligonucleotide corresponding to a fragment thereof.
In one embodiment, the presently disclosed detection kits may contain the necessary elements for performing an ELISA. The ELISA detection kit may comprise antibodies specific for the marker proteins (the proteins encoded by FKBP8, FAM118A and/or SPTB according to the invention). The antibodies have high selectivity and affinity for the marker protein, do not cross-react with other proteins, and may be monoclonal, polyclonal or recombinant. Furthermore, the ELISA detection kit may comprise antibodies specific to a control protein. In addition, the ELISA detection kit may further comprise reagents capable of detecting the bound antibody, e.g., a labeled secondary antibody, a chromophore, an enzyme (e.g., conjugated to an antibody) and its substrate, or substances capable of binding to the antibody.
In one embodiment, the detection chip, also referred to as a biochip or array, refers to a solid support comprising attached nucleic acid or peptide probes. An array typically comprises a plurality of different nucleic acid or peptide probes attached to the surface of a substrate at different known locations. These arrays, also known as microarrays, can typically be produced using mechanical synthesis methods or light-directed synthesis methods that combine photolithography with solid-phase synthesis. The array may comprise a planar surface, or may consist of nucleic acids or peptides on beads, gels, polymer surfaces, fibers such as optical fibers, glass, or any other suitable substrate. The array may be packaged in a manner that allows diagnostic use or other manipulation of the complete device. A microarray is an ordered arrangement of hybridization array elements, such as polynucleotide probes (e.g., oligonucleotides) or binding agents (e.g., antibodies), on a substrate. The substrate may be a solid substrate, for example, a glass or silica slide, beads, a fiber optic binder, or a semi-solid substrate, for example, a nitrocellulose membrane. The nucleotide sequence may be DNA, RNA or any arrangement thereof.
In one embodiment, the detection chip comprises a gene chip and a protein chip, wherein the gene chip comprises a solid phase carrier, and oligonucleotide probes orderly fixed on the solid phase carrier, and the oligonucleotide probes specifically correspond to part or all of the sequences shown in FKBP8, FAM118A and/or SPTB. The protein chip comprises a solid phase carrier, and FKBP8, FAM118A and/or SPTB-encoded protein specific antibodies or ligands immobilized on the solid phase carrier.
The results of the verification of the present verification embodiment show that assigning an inherent weight to an indication may moderately improve the performance of the present method relative to the default settings.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be implemented by a program instructing related hardware, and the program may be stored in a computer readable storage medium, where the storage medium may include a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, etc.
The foregoing describes in detail a computer device provided by the present invention. Those skilled in the art will appreciate that the foregoing description is not intended to limit the invention; the scope of the invention is defined by the appended claims.

Claims (10)

1. A method for predicting congenital heart disease combined with intellectual disability, characterized in that the method comprises:
acquiring expression data of a target gene of a sample to be tested, wherein the target gene is any one or more of FKBP8, FAM118A, and SPTB;
performing classification prediction based on the expression data of the target gene to obtain a classification result of whether the sample to be tested is a sample of congenital heart disease combined with intellectual disability;
if the expression level of any one or more of the target genes FKBP8 and SPTB is lower than the threshold, and/or the expression level of FAM118A is higher than the threshold, obtaining a classification result that the sample to be tested is a sample of congenital heart disease combined with intellectual disability;
if the expression level of any one or more of the target genes FKBP8 and SPTB is higher than the threshold, and/or the expression level of FAM118A is lower than the threshold, obtaining a classification result that the sample to be tested is a sample of non-congenital heart disease combined with intellectual disability.
2. The prediction method for congenital heart disease combined with intellectual disability according to claim 1, characterized in that the classification result is obtained based on a prediction model;
the method for constructing the prediction model comprises: obtaining the target gene expression data of the training set samples and the clinical characteristics corresponding to the samples, the clinical characteristics including patients with congenital heart disease combined with intellectual disability and patients with congenital heart disease combined with normal intellectual development; extracting the target gene expression data in the training set and inputting it into a machine learning model to construct a prediction model, thereby obtaining a constructed prediction model.
3. The prediction method for congenital heart disease combined with intellectual disability according to claim 2, characterized in that the machine learning model includes a linear regression model, a logistic regression model, a random forest model, a Lasso regression model, a neural network model, a decision tree model, a perceptron model, a support vector machine model and/or a naive Bayes model.
4. The prediction method for congenital heart disease combined with intellectual disability according to claim 1, characterized in that the target gene expression data comprises mRNA expression data of the target gene or protein expression data of the target gene;
the mRNA expression data is mRNA expression data obtained by high-throughput sequencing technology, RT-PCR, qRT-PCR or in situ hybridization technology;
the protein expression data is protein expression data obtained by immunoblotting, immunohistochemistry, mass spectrometry or co-immunoprecipitation.
5. The prediction method for congenital heart disease combined with intellectual disability according to claim 1, characterized in that the sample to be tested comprises a blood sample, a serum sample, a plasma sample, a tissue sample and/or a cell sample.
6. A prediction system for congenital heart disease combined with intellectual disability, characterized in that the system comprises:
a data acquisition unit, used to acquire the expression data of the target gene of the sample to be tested, wherein the target gene is any one or more of FKBP8, FAM118A, and SPTB;
an analysis and prediction unit, used to perform classification prediction based on the expression data of the target gene to obtain a classification result of whether the sample to be tested is a sample of congenital heart disease combined with intellectual disability;
if the expression level of any one or more of the target genes FKBP8 and SPTB is lower than the threshold, and/or the expression level of FAM118A is higher than the threshold, a classification result is obtained that the sample to be tested is a sample of congenital heart disease combined with intellectual disability;
if the expression level of any one or more of the target genes FKBP8 and SPTB is higher than the threshold, and/or the expression level of FAM118A is lower than the threshold, a classification result is obtained that the sample to be tested is a sample of non-congenital heart disease combined with intellectual disability;
the classification result is obtained based on a prediction model;
the method for constructing the prediction model comprises: obtaining the target gene expression data of the training set samples and the clinical characteristics corresponding to the samples, the clinical characteristics including patients with congenital heart disease combined with intellectual disability and patients with congenital heart disease combined with normal intellectual development; extracting the target gene expression data in the training set and inputting it into a machine learning model to construct a prediction model, thereby obtaining a constructed prediction model;
an output result unit, used to output the classification result.
7. The prediction system for congenital heart disease combined with intellectual disability according to claim 6, characterized in that the machine learning model includes a linear regression model, a logistic regression model, a random forest model, a Lasso regression model, a neural network model, a decision tree model, a perceptron model, a support vector machine model and/or a naive Bayes model.
8. An electronic device, characterized in that the electronic device comprises a memory and a processor; the memory is used to store program instructions; the processor is used to call the program instructions and, when the program instructions are executed, to execute the steps of the prediction method for congenital heart disease combined with intellectual disability according to any one of claims 1-5.
9. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the prediction method for congenital heart disease combined with intellectual disability according to any one of claims 1-5 are implemented.
10. Use of a reagent for detecting the expression level of the target gene FKBP8, FAM118A and/or SPTB in a test sample in the preparation of a product for diagnosing congenital heart disease combined with intellectual disability.
CN202411127026.2A 2024-08-16 2024-08-16 A prediction system and electronic device for congenital heart disease combined with intellectual disability Pending CN119069003A (en)

Priority Applications (1)

Application Number Publication Number Priority Date Filing Date Title
CN202411127026.2A CN119069003A (en) 2024-08-16 2024-08-16 A prediction system and electronic device for congenital heart disease combined with intellectual disability

Applications Claiming Priority (1)

Application Number Publication Number Priority Date Filing Date Title
CN202411127026.2A CN119069003A (en) 2024-08-16 2024-08-16 A prediction system and electronic device for congenital heart disease combined with intellectual disability

Publications (1)

Publication Number Publication Date
CN119069003A (en) 2024-12-03

Family

ID=93639975

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN202411127026.2A Pending CN119069003A (en) 2024-08-16 2024-08-16 A prediction system and electronic device for congenital heart disease combined with intellectual disability

Country Status (1)

Country Link
CN (1) CN119069003A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119068973A (en) * 2024-08-16 2024-12-03 大连市妇女儿童医疗中心(集团) A prediction method, system, electronic device and storage medium for congenital heart disease combined with intellectual disability
CN119068973B (en) * 2024-08-16 2025-04-18 大连市妇女儿童医疗中心(集团) A prediction method, system, electronic device and storage medium for congenital heart disease combined with intellectual disability
CN120636531A (en) * 2025-06-03 2025-09-12 北京师范大学 Developmental dynamic expression pattern prediction method, model, system and database structure based on gene characteristics

Similar Documents

Publication Publication Date Title
Kallioniemi Biochip technologies in cancer research
CN119069003A (en) A prediction system and electronic device for congenital heart disease combined with intellectual disability
US20220025443A1 (en) Methods for screening biological samples for contamination
CA2905517A1 (en) Methods and compositions for tagging and analyzing samples
WO2004097051A2 (en) Methods for diagnosing aml and mds differential gene expression
CN103168118A (en) Gene expression profiling with reduced number of transcript measurements
CN114875149A (en) Application of reagent for detecting biomarkers in preparation of product for predicting gastric cancer prognosis
CN114164264A (en) Method for evaluating endometrial receptivity of a patient and kit for carrying out the method
CN112921083A (en) Genetic markers in the assessment of intestinal polyps and colorectal cancer
WO2011060080A2 (en) Genes differentially expressed by cumulus cells and assays using same to identify pregnancy competent oocytes
CN106033087B (en) The method system of built-in property standard curve detection substance molecular number
CN113999900A (en) Method for evaluating fetal DNA concentration by using free DNA of pregnant woman and application
CN111748640A (en) Application of intestinal flora in sarcopenia
CN116445606A (en) Application of serum molecular marker COMP in auxiliary diagnosis of depression
CN112662754B (en) Application method of composition for predicting the probability of occurrence of microtia
WO2010142751A1 (en) In vitro diagnosis/prognosis method and kit for assessment of chronic antibody mediated rejection in kidney transplantation
CN119068973B (en) A prediction method, system, electronic device and storage medium for congenital heart disease combined with intellectual disability
CN113151465A (en) Products and related applications for identifying polyps and cancers based on genetic markers
CN112980959A (en) Genetic markers for predicting or diagnosing colorectal cancer/colorectal cancer risk
CN115831367B (en) A risk prediction model for pregnancy complications and its application
CN119842887B (en) Biomarkers and their application in evaluating severe bronchiolitis in children with respiratory syncytial virus infection
CN116287208B (en) A non-invasive method for diagnosing coronary artery ectasia
CN118127149B (en) Biomarker, model and kit for assessing risk of sepsis and infection in a subject
US20170121774A1 (en) Methods and compositions for assessing predicting responsiveness to a tnf inhibitor
CN116875673A (en) System for diagnosing myocardial infarction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination