CN107887023A - A microorganism-disease relationship prediction method based on similarity and double random walks
- Publication number
- CN107887023A; CN201711293802.6A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
Abstract
Description
Technical field
The invention belongs to the field of systems biology and relates to a microorganism-disease relationship prediction method based on similarity and double random walks.
Background art
A growing body of research shows that microorganisms play an important role in many complex human diseases. The rapid development of next-generation DNA sequencing has accelerated the discovery of associations between microorganisms in the human body and diseases, for example between the microbiota and various cancers, cardiovascular diseases, metabolic syndrome (such as obesity and diabetes), central nervous system diseases, and autoinflammatory diseases. These studies improve the understanding of disease mechanisms and support the development of new treatment and diagnostic approaches. For example, fecal microbiota transplantation has been confirmed as a safe and effective treatment for Clostridium infection: it reintroduces a normal flora through donor stool, corrects the imbalance, and re-establishes normal bowel function. A systematic understanding of the relationship between microorganisms and diseases is therefore becoming increasingly urgent.
Microorganism-disease relationships are currently discovered mainly through conventional experiment-based methods. These methods are time-consuming and expensive and are limited by the experimental environment; for example, some bacteria cannot be cultured in existing laboratory environments. Meanwhile, predicting these relationships with computational models has not been vigorously developed, and few such methods have appeared so far. KATZHMDA is the first model to predict new microorganism-disease relationships from known ones: it integrates disease characterization similarity, disease Gaussian kernel similarity, and microbial Gaussian kernel similarity, and uses the KATZ measure to predict new relationships. Most current computational research on microorganisms focuses on microbial classification, and the relationship between microorganisms and diseases receives far too little attention. Existing computational prediction models for this relationship are not yet developed or accurate enough for experimental biologists to trust their predictions and use them as a basis for follow-up experiments.
The systematic understanding of microorganism-disease relationships remains limited for three reasons: biological experimental verification is inefficient, computational prediction of these relationships has received insufficient attention and made insufficient progress, and the prediction results still need improvement. A more effective prediction model is urgently needed, one that makes full use of existing biological information to discover new microorganism-disease relationships, lays a foundation for later research on relationship prediction, and provides an important basis for biological experiments. In addition, advances in computational biology and next-generation DNA sequencing have made the importance of microorganisms to disease increasingly evident, which creates a pressing demand for better prediction models. It is therefore necessary to design an effective method for predicting microorganism-disease associations, both to further establish the importance of these relationships and to support the later development of prediction models and biological experimental verification.
Summary of the invention
The technical problem to be solved by the invention is to address the shortcomings of the prior art by providing a microorganism-disease relationship prediction method based on similarity and double random walks. The method predicts microorganism-disease relationships relatively accurately, provides a basis for the later development of relationship prediction models, and avoids much of the manpower and material cost of biochemical experiments.
The technical solution of the invention is as follows:
A microorganism-disease relationship prediction method based on similarity and double random walks, comprising the following steps:
Step 1: construct the disease functional similarity matrix $D_{funsim}$, the disease Gaussian kernel similarity matrix $K_{GIP,d}$, and the microbial Gaussian kernel similarity matrix $K_{GIP,m}$;
Step 2: integrate the disease functional similarity matrix $D_{funsim}$ and the disease Gaussian kernel similarity matrix $K_{GIP,d}$ to obtain the final disease similarity matrix $S_d$; use the microbial Gaussian kernel similarity matrix $K_{GIP,m}$ as the final microbial similarity matrix $S_m$;
Step 3: from the known microorganism-disease relationships, the final microbial similarity matrix $S_m$, and the final disease similarity matrix $S_d$, construct a two-layer heterogeneous network and use the double random walk method to predict an association score for each microorganism-disease pair; the larger the association score, the more likely it is that the corresponding microorganism and disease are related.
Further, in step 1, the functional similarity between two diseases is first calculated from the known disease-gene relationships and the functional similarity of genes, and the disease functional similarity matrix $D_{funsim}$ is then constructed from the pairwise functional similarities of all diseases;
For any two diseases A and B, the functional similarity is calculated as follows:
where $G_A = \{g_{A1}, g_{A2}, \ldots, g_{Am}\}$ is the set of genes associated with disease A and, likewise, $G_B = \{g_{B1}, g_{B2}, \ldots, g_{Bn}\}$ is the set of genes associated with disease B; $m$ and $n$ are the numbers of genes in $G_A$ and $G_B$, respectively. The functional similarity between gene $g_{Ai}$ and gene set $G_B$, and between gene $g_{Bj}$ and gene set $G_A$, are calculated as follows:
where $F(g_{Ai}, g_{Bj})$ is the semantic similarity between genes $g_{Ai}$ and $g_{Bj}$. The HumanNet database provides semantic similarity values based on a log-likelihood function, so that:
$$F(g_{Ai}, g_{Bj}) = LLS(g_{Ai}, g_{Bj})$$
where LLS denotes the log-likelihood function (using the log-likelihood function in the HumanNet database to calculate gene semantic similarity is prior art).
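The two formulas referred to above did not survive in this text. As a rough sketch only, a commonly used best-match-average form can be written in Python as follows. Everything specific here is an assumption rather than the patent's confirmed definition: the gene-to-set similarity is taken as the maximum pairwise score, the disease similarity as the sum of both directions divided by $m + n$, a gene is given a score of 1.0 against itself, and gene pairs missing from the lookup table score 0. The function names and the `lls` dictionary are hypothetical.

```python
from typing import Dict, FrozenSet, Hashable, Iterable


def gene_to_set_similarity(gene: Hashable,
                           gene_set: Iterable[Hashable],
                           lls: Dict[FrozenSet, float]) -> float:
    """Assumed best-match score of one gene against a gene set (maximum)."""
    scores = [
        1.0 if gene == other else lls.get(frozenset((gene, other)), 0.0)
        for other in gene_set
    ]
    return max(scores, default=0.0)


def disease_functional_similarity(genes_a: Iterable[Hashable],
                                  genes_b: Iterable[Hashable],
                                  lls: Dict[FrozenSet, float]) -> float:
    """Assumed form: (sum of best matches in both directions) / (m + n)."""
    genes_a, genes_b = list(genes_a), list(genes_b)
    if not genes_a or not genes_b:
        return 0.0  # no associated genes: treat the similarity as 0
    total = sum(gene_to_set_similarity(g, genes_b, lls) for g in genes_a)
    total += sum(gene_to_set_similarity(g, genes_a, lls) for g in genes_b)
    return total / (len(genes_a) + len(genes_b))


# Toy lookup table: unordered gene pair -> similarity score (illustrative).
toy_lls = {frozenset(("g1", "g2")): 0.5}
toy_score = disease_functional_similarity(["g1"], ["g2", "g3"], toy_lls)
```

Keying the table by `frozenset` makes the lookup symmetric, so `F(g1, g2)` and `F(g2, g1)` share one entry.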
Further, in step 1, the disease Gaussian kernel similarity matrix $K_{GIP,d}$ and the microbial Gaussian kernel similarity matrix $K_{GIP,m}$ are constructed from the known microorganism-disease relationships as follows:
First, define the set of microorganisms, where $N_m$ is the number of microorganisms, and the set of all diseases, where $N_d$ is the number of diseases. The $N_m \times N_d$ adjacency matrix $Y$ records whether a known relationship exists between each microorganism and each disease: $y_{ij} = 1$ if microorganism $m_i$ and disease $d_j$ have a known association, and $y_{ij} = 0$ otherwise;
Then, calculate the Gaussian kernel similarity between every pair of diseases. For any two diseases $d_1$ and $d_2$, the Gaussian kernel similarity is:
$$K_{GIP,d}(d_1, d_2) = \exp\left(-\gamma_d \lVert y_{d_1} - y_{d_2} \rVert^2\right)$$
where $\gamma_d$ is the parameter that controls the kernel width and $\gamma'_d$ is the disease bandwidth parameter, which is set to 1 based on experience with Gaussian kernels;
Next, calculate the Gaussian kernel similarity between every pair of microorganisms. For any two microorganisms $m_1$ and $m_2$, the Gaussian kernel similarity is:
$$K_{GIP,m}(m_1, m_2) = \exp\left(-\gamma_m \lVert y_{m_1} - y_{m_2} \rVert^2\right)$$
where $\gamma_m$ is the parameter that controls the kernel width and $\gamma'_m$ is the microbial bandwidth parameter, which is set to 1 based on experience;
Finally, the disease Gaussian kernel similarity matrix $K_{GIP,d}$ is constructed from the pairwise Gaussian kernel similarities of all diseases, and the microbial Gaussian kernel similarity matrix $K_{GIP,m}$ is constructed from the pairwise Gaussian kernel similarities of all microorganisms.
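The kernel computation above can be sketched in Python as follows. The formula that links the kernel width $\gamma$ to the bandwidth parameter $\gamma'$ is missing from this text, so the sketch assumes the usual convention for this kind of kernel: $\gamma$ equals $\gamma'$ divided by the mean squared norm of the interaction profiles. It also assumes that a microorganism's profile is its row of $Y$ and a disease's profile is its column. The function name `gip_kernel` and the toy matrix are hypothetical.

```python
import math
from typing import List, Sequence


def gip_kernel(profiles: Sequence[Sequence[float]],
               gamma_prime: float = 1.0) -> List[List[float]]:
    """Gaussian kernel similarity between 0/1 interaction profiles.

    Assumption: gamma = gamma_prime / (mean squared norm of the profiles).
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in p) for p in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all profiles are empty; kernel width is undefined")
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2
                          for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


# Toy adjacency matrix: 3 microorganisms (rows) x 2 diseases (columns).
toy_y = [[1, 0],
         [1, 1],
         [0, 1]]
toy_k_microbe = gip_kernel(toy_y)                          # rows of Y
toy_k_disease = gip_kernel([list(c) for c in zip(*toy_y)])  # columns of Y
```

Transposing $Y$ with `zip(*toy_y)` lets the same function serve both the microbial and the disease kernel.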
Further, in step 2, the disease functional similarity matrix $D_{funsim}$ and the disease Gaussian kernel similarity matrix $K_{GIP,d}$ are integrated to obtain the final disease similarity matrix $S_d$ as follows:
$$S_d = \frac{D_{funsim} + K_{GIP,d}}{2}$$
That is, the final disease similarity is the average of the functional similarity and the Gaussian kernel similarity.
Further, in step 3, the final microbial similarity matrix $S_m$, the final disease similarity matrix $S_d$, and the adjacency matrix $Y$ of known microorganism-disease relationships are integrated into a two-layer heterogeneous network, and the double random walk method is used for prediction. The prediction procedure is as follows:
First, the final microbial similarity matrix $S_m$ is column-normalized to obtain the $N_m \times N_m$ microbial similarity matrix $MM$ used for the random walk, calculated as follows:
Likewise, the final disease similarity matrix $S_d$ is column-normalized to obtain the $N_d \times N_d$ disease similarity matrix $MD$ used for the random walk, calculated as follows:
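The normalization formulas themselves are not reproduced in this text. A minimal Python sketch, assuming that "column normalization" means dividing each entry by the sum of its column so that every column sums to 1, could look like this. The function name `column_normalize` and the toy matrix are hypothetical, and the patent's exact formula may differ.

```python
from typing import List, Sequence


def column_normalize(matrix: Sequence[Sequence[float]]) -> List[List[float]]:
    """Divide every entry by its column sum (assumed reading of the text)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    col_sums = [sum(matrix[i][j] for i in range(n_rows))
                for j in range(n_cols)]
    return [
        [matrix[i][j] / col_sums[j] if col_sums[j] else 0.0
         for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Toy similarity matrix (illustrative values only).
toy_similarity = [[1.0, 0.5],
                  [0.5, 1.0]]
toy_walk_matrix = column_normalize(toy_similarity)
```

Columns that sum to zero are left as zeros instead of raising an error, which is a design choice of this sketch.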
Then, the walk proceeds on both layers of the heterogeneous network at the same time, as follows:
Iterative left walk on the microbial network:
Iterative right walk on the disease network:
where $t$ is the current iteration number; $P_t$ is the $N_m \times N_d$ microorganism-disease association score matrix predicted at iteration $t$, and $P_t(i,j)$ is the association score (degree of association) between microorganism $i$ and disease $j$; $L\_P_t$ is the new microorganism-disease association score matrix obtained from iteration $t$ on the microbial network, and $R\_P_t$ is the microorganism-disease association score matrix obtained from iteration $t$ on the disease network; $P_0$ is the normalized form of the $N_m \times N_d$ adjacency matrix $Y$.
$\alpha$ is the decay parameter; $I_l$ and $I_r$ are the maximum numbers of iterations on the microbial network and the disease network, respectively. The values of $\alpha$, $I_l$ and $I_r$ are determined by experience or cross-validation (here $\alpha$ is set to 0.1, and $I_l$ and $I_r$ are set to 2 and 1, respectively); $L_{num}$ and $R_{num}$ are the numbers of iterations already completed on the microbial network and the disease network, respectively.
The iteration ends when $P_t$ converges (the walk is considered to have reached a steady state when the difference between $P_{t+1}$ and $P_t$ falls below a small threshold, for example $10^{-10}$) or when the walks on both the microbial network and the disease network have reached their maximum numbers of iterations. The final $P_t$ is the predicted microorganism-disease association score matrix.
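The update formulas for the two walks are missing from this text, so the following Python sketch fills them in with a commonly used bi-random-walk form and should be read as an illustration under assumptions, not as the patent's exact procedure. Assumed: the left walk is $\alpha \cdot MM \cdot P_{t-1} + (1-\alpha) \cdot P_0$, the right walk is $\alpha \cdot P_{t-1} \cdot MD + (1-\alpha) \cdot P_0$, $P_t$ is the average of whichever walks ran at step $t$, $P_0$ is $Y$ divided by the sum of its entries, and convergence is measured by the largest absolute entry change. All function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two nested-list matrices."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols]
            for row in a]


def restart_blend(walked: Matrix, p0: Matrix, alpha: float) -> Matrix:
    """Assumed update rule: alpha * walked + (1 - alpha) * p0."""
    return [[alpha * w + (1 - alpha) * s for w, s in zip(rw, rs)]
            for rw, rs in zip(walked, p0)]


def bi_random_walk(y: Matrix, mm: Matrix, md: Matrix, alpha: float = 0.1,
                   max_left: int = 2, max_right: int = 1,
                   tol: float = 1e-10) -> Matrix:
    """Walk on the microbial (left) and disease (right) networks."""
    total = sum(sum(row) for row in y)
    if total == 0:
        raise ValueError("no known associations to start from")
    p0 = [[v / total for v in row] for row in y]  # assumed normalization
    p = p0
    for t in range(1, max(max_left, max_right) + 1):
        walks = []
        if t <= max_left:   # left walk still within its iteration limit
            walks.append(restart_blend(matmul(mm, p), p0, alpha))
        if t <= max_right:  # right walk still within its iteration limit
            walks.append(restart_blend(matmul(p, md), p0, alpha))
        # Average the walks that ran at this step, entry by entry.
        p_next = [[sum(vals) / len(walks) for vals in zip(*rows)]
                  for rows in zip(*walks)]
        change = max(abs(a - b)
                     for ra, rb in zip(p_next, p) for a, b in zip(ra, rb))
        p = p_next
        if change < tol:    # steady state reached
            break
    return p
```

The default arguments mirror the values stated in the text (decay parameter 0.1, iteration limits 2 and 1); in practice `mm` and `md` would be the column-normalized similarity matrices.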
Beneficial effects:
The invention proposes a microorganism-disease relationship prediction method based on similarity and double random walks to predict new microorganism-disease relationships. The functional similarity of diseases is first calculated from disease-gene relationships and gene functional similarity information; the Gaussian kernel similarity of diseases is then calculated from the known microorganism-disease relationships; the two are integrated into the final disease similarity. Likewise, the microbial Gaussian kernel similarity is calculated from the known microorganism-disease relationships and used as the final microbial similarity. The final microbial similarity, the final disease similarity, and the known microorganism-disease relationships are then integrated into a two-layer heterogeneous network. Finally, different numbers of walk steps are set for the microbial similarity network and the disease similarity network, and the final association scores of the microorganism-disease pairs are predicted by iterating to a steady state. Comparison with other methods under five-fold cross-validation and leave-one-out validation shows that the invention predicts the relationship between microorganisms and diseases more effectively. It can provide an important foundation for the later development of computational models that predict microorganism-disease relationships, offer basic guidance for biomedical experiments, and save manpower and material costs.
In constructing the two-layer heterogeneous network, the invention column-normalizes the microbial similarity matrix and the disease similarity matrix. During the random walk, separate limits on the number of iterations are set for the microbial network and the disease network. The final association score is predicted by combining, across the two networks, the assumptions that similar diseases are associated with similar microorganisms and that similar microorganisms are associated with similar diseases. Prediction performance is compared using the same five-fold cross-validation and leave-one-out validation as in the KATZHMDA method, and analysis of the AUC metric demonstrates the prediction performance of the invention.
For the field of microorganism-disease relationships, the invention provides an effective method for predicting microorganism-disease associations through a computational model. It can provide an important foundation for later research on computational models in this field, help with the overall understanding of disease mechanisms, and further promote drug development and the diagnosis and treatment of complex diseases.
Description of drawings
Fig. 1 is the overall flow chart of the invention;
Fig. 2 compares five-fold cross-validation results of the invention on the data set;
Fig. 3 compares leave-one-out cross-validation results of the invention on the data set.
Detailed description
The invention is described in further detail below with reference to the drawings and a specific embodiment:
Embodiment 1:
Disease functional similarity is first calculated from disease-gene relationships and gene similarity information. Disease Gaussian kernel similarity and microbial Gaussian kernel similarity are calculated from the known microorganism-disease relationships. The final disease similarity is obtained by integrating the disease functional similarity and the disease Gaussian kernel similarity, specifically by taking their mean. The microbial Gaussian kernel similarity is used as the final microbial similarity. The microbial similarity information, the disease similarity information, and the known microorganism-disease relationship information are then integrated into a two-layer heterogeneous network. Starting from the assumptions that similar microorganisms are associated with similar diseases and that similar diseases are associated with similar microorganisms, the double random walk method is used to predict microorganism-disease relationships in the heterogeneous network. The key step of the random walk is to set different numbers of walk steps for the microbial similarity network and the disease similarity network and to iterate to a steady state to obtain the final association scores of the microorganism-disease pairs.
The known microorganism-disease relationships used in the invention come from the HMDAD database (http://www.cuilab.cn/hmdad), which covers 39 diseases and 292 microorganisms with 483 known microorganism-disease relationships. After duplicate records are removed, 450 relationships remain, covering 39 diseases and 292 microorganisms. The disease-gene relationship data come from the DisGeNET database.
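As a rough illustration of this data preparation, the following Python sketch removes duplicate association records and builds the 0/1 adjacency matrix with microorganisms as rows and diseases as columns. The record format (a list of microorganism and disease name pairs), the function name `build_adjacency`, and the placeholder names in the toy data are assumptions; the toy records are not taken from HMDAD.

```python
from typing import Iterable, List, Tuple


def build_adjacency(
    pairs: Iterable[Tuple[str, str]],
) -> Tuple[List[str], List[str], List[List[int]]]:
    """Deduplicate (microorganism, disease) records and build matrix Y."""
    unique_pairs = sorted(set(pairs))  # duplicate records collapse here
    microbes = sorted({m for m, _ in unique_pairs})
    diseases = sorted({d for _, d in unique_pairs})
    m_index = {m: i for i, m in enumerate(microbes)}
    d_index = {d: j for j, d in enumerate(diseases)}
    y = [[0] * len(diseases) for _ in microbes]
    for m, d in unique_pairs:
        y[m_index[m]][d_index[d]] = 1  # known association
    return microbes, diseases, y


# Placeholder records with one duplicate (illustrative only).
toy_records = [
    ("microbe_a", "disease_x"),
    ("microbe_b", "disease_x"),
    ("microbe_a", "disease_x"),
    ("microbe_b", "disease_y"),
]
toy_microbes, toy_diseases, toy_adjacency = build_adjacency(toy_records)
```

Sorting the names gives a stable row and column order, so the similarity matrices built later can be aligned with $Y$ by index.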
The overall procedure for microorganism-disease relationship prediction based on similarity and double random walks is shown in Fig. 1 and can be divided into the following steps:
(1) The disease functional similarity $D_{funsim}$ is calculated as follows:
First, for a disease pair A and B, the functional similarity is defined as follows:
where $G_A = \{g_{A1}, g_{A2}, \ldots, g_{Am}\}$ is the set of genes associated with disease A and, likewise, $G_B = \{g_{B1}, g_{B2}, \ldots, g_{Bn}\}$ is the set of genes associated with disease B; $m$ and $n$ are the numbers of genes in $G_A$ and $G_B$, respectively. The functional similarity between gene $g_{Ai}$ and gene set $G_B$, and between gene $g_{Bj}$ and gene set $G_A$, are calculated as follows:
where $F(g_{Ai}, g_{Bj})$ is the semantic similarity between genes $g_{Ai}$ and $g_{Bj}$. The HumanNet database provides semantic similarity values based on a log-likelihood function, so that:
$$F(g_{Ai}, g_{Bj}) = LLS(g_{Ai}, g_{Bj})$$
In the HumanNet database, the functional similarity of genes 6188 and 6209 is 0.9697. Based on the genes associated with each disease and the functional similarity of those genes, the functional similarity between the diseases Gastric and duodenal ulcer and Gastro-oesophageal reflux is 0.1655.
(2) The Gaussian kernel similarities are constructed from the known microorganism-disease relationships as follows:
First, define the set of microorganisms, where $N_m$ is the number of microorganisms, and the set of all diseases, where $N_d$ is the number of diseases. The $N_m \times N_d$ adjacency matrix $Y$ records whether a known relationship exists between each microorganism and each disease: $y_{ij} = 1$ if microorganism $m_i$ and disease $d_j$ have a known association, and $y_{ij} = 0$ otherwise. For example, the Gaussian kernel similarity of microorganisms $m_1$ and $m_2$ is defined as:
$$K_{GIP,m}(m_1, m_2) = \exp\left(-\gamma_m \lVert y_{m_1} - y_{m_2} \rVert^2\right)$$
where $\gamma_m$ is the parameter that controls the kernel width, calculated as follows:
where $\gamma'_m$ is set to 1 based on experience with Gaussian kernels. With these formulas, the Gaussian kernel similarity between the microorganisms Actinobacillus and Actinobacteria is 0.0390.
Likewise, the Gaussian kernel similarity of diseases $d_1$ and $d_2$ is defined as:
$$K_{GIP,d}(d_1, d_2) = \exp\left(-\gamma_d \lVert y_{d_1} - y_{d_2} \rVert^2\right)$$
where $\gamma'_d$ is also set to 1 based on experience. With these formulas, the Gaussian kernel similarity between the diseases Allergic asthma and Atopic dermatitis is 0.4274.
(3) The final disease similarity is obtained by integrating the calculated disease functional similarity $D_{funsim}$ and disease Gaussian kernel similarity $K_{GIP,d}$ as follows:
$$S_d = \frac{D_{funsim} + K_{GIP,d}}{2}$$
The final disease similarity is the average of the functional similarity and the Gaussian kernel similarity.
(4) The microbial Gaussian kernel similarity matrix $K_{GIP,m}$ is used as the final microbial similarity matrix $S_m$:
$$S_m = K_{GIP,m}$$
Only the Gaussian kernel similarity is available for microorganisms, so their final similarity $S_m$ is the Gaussian kernel similarity.
(5) The final microbe similarity $S_m$, the final disease similarity $S_d$, and the known microbe-disease associations are integrated into a two-layer heterogeneous network, on which the bi-random walk method performs the prediction. The prediction procedure is as follows:
First, the final microbe similarity matrix $S_m$ is column-normalized to obtain the $N_m \times N_m$ microbe similarity matrix $MM$ used by the random walk, calculated as follows:
Similarly, the final disease similarity matrix $S_d$ is column-normalized to obtain the $N_d \times N_d$ disease similarity matrix $MD$ used by the random walk, calculated as follows:
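A minimal sketch of column normalization follows, assuming it means dividing every entry by the sum of its column so that each column sums to 1. The function name `column_normalize` and the toy matrix are hypothetical, and mapping an all-zero column to zeros is a choice made here, not something stated in the text.

```python
def column_normalize(sim):
    """Divide every entry by its column sum so that each column sums to 1."""
    n_rows, n_cols = len(sim), len(sim[0])
    col_sums = [sum(sim[i][j] for i in range(n_rows)) for j in range(n_cols)]
    # ASSUMPTION: a column whose sum is zero is left as all zeros.
    return [[sim[i][j] / col_sums[j] if col_sums[j] else 0.0
             for j in range(n_cols)]
            for i in range(n_rows)]

# Hypothetical final microbe similarity matrix S_m.
S_m = [[1.0, 0.5, 0.0],
       [0.5, 1.0, 0.5],
       [0.0, 0.5, 1.0]]
MM = column_normalize(S_m)
```

The same function would be applied to the final disease similarity matrix to obtain $MD$. Note that the normalized matrix is generally no longer symmetric even when the input is.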
We define the $N_m \times N_d$ matrix $P$ to represent the predicted microbe-disease relationships, where $P(i,j)$ is the association score (degree of association) between microbe $i$ and disease $j$. The random walk model walks on both layers of the heterogeneous network simultaneously, so maximum iteration counts $I_l$ and $I_r$ are set for the microbe network and the disease network, respectively. The walk on the heterogeneous network proceeds as follows:
Left walk on the microbe network:
Right walk on the disease network:
where $t$ is the current iteration number; $P_t$ is the $N_m \times N_d$ matrix of microbe-disease association scores predicted at the $t$-th iteration, and $P_t(i,j)$ is the association score (degree of association) between microbe $i$ and disease $j$; $L\_P_t$ is the new microbe-disease association score matrix obtained from the $t$-th iteration on the microbe network, and $R\_P_t$ is the microbe-disease association score matrix obtained from the $t$-th iteration on the disease network; $P_0$ is the normalized form of the $N_m \times N_d$ adjacency matrix $Y$,
$\alpha$ is the decay parameter, and $I_l$ and $I_r$ are the maximum iteration counts for the microbe network and the disease network, respectively. The values of $\alpha$, $I_l$ and $I_r$ are determined by experience or cross-validation (in this embodiment, based on experience and cross-validation, $\alpha$ is set to 0.1, and $I_l$ and $I_r$ are set to 2 and 1, respectively); $L_{num}$ and $R_{num}$ are the numbers of iterations already completed on the microbe network and the disease network, respectively,
The iteration ends when $P_t$ converges (the walk is considered to have reached a steady state when the difference between $P_{t+1}$ and $P_t$ falls below a small threshold, for example $10^{-10}$) or when the walks on both the microbe network and the disease network have reached their maximum iteration counts. The final $P_t$ is the predicted microbe-disease association score matrix. The association scores in the matrix are sorted in descending order; the higher a microbe-disease pair ranks, the more likely it is to be a true association.
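The walking procedure can be sketched as follows. Because the update formulas are not reproduced in this text, the sketch assumes the form commonly used for bi-random walks: the left walk computes $\alpha \cdot MM \cdot P_{t-1} + (1-\alpha) P_0$, the right walk computes $\alpha \cdot P_{t-1} \cdot MD + (1-\alpha) P_0$, and $P_t$ is the average of whichever walks were performed in that step. Dividing $Y$ by the sum of its entries to obtain $P_0$, and using the largest absolute entry change as the convergence measure, are also assumptions. All function names and the toy matrices are hypothetical.

```python
def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def blend(walked, p0, alpha):
    """Element-wise alpha * walked + (1 - alpha) * p0."""
    return [[alpha * w + (1 - alpha) * p for w, p in zip(wr, pr)]
            for wr, pr in zip(walked, p0)]

def bi_random_walk(MM, MD, Y, alpha=0.1, max_left=2, max_right=1, tol=1e-10):
    """Assumed bi-random walk; defaults follow the embodiment (0.1, 2, 1)."""
    total = sum(sum(row) for row in Y)
    P0 = [[v / total for v in row] for row in Y]   # ASSUMED normalization of Y
    P = P0
    for t in range(1, max(max_left, max_right) + 1):
        walks = []
        if t <= max_left:     # left walk on the microbe network
            walks.append(blend(matmul(MM, P), P0, alpha))
        if t <= max_right:    # right walk on the disease network
            walks.append(blend(matmul(P, MD), P0, alpha))
        # ASSUMPTION: P_t is the average of the walks performed in this step.
        P_next = [[sum(vals) / len(walks) for vals in zip(*rows)]
                  for rows in zip(*walks)]
        # ASSUMPTION: convergence is measured by the largest entry change.
        delta = max(abs(a - b) for ra, rb in zip(P_next, P) for a, b in zip(ra, rb))
        P = P_next
        if delta < tol:
            break
    return P

# Hypothetical inputs: 2 microbes, 3 diseases; columns of MM and MD sum to 1.
MM = [[0.8, 0.2], [0.2, 0.8]]
MD = [[0.6, 0.2, 0.0], [0.4, 0.6, 0.4], [0.0, 0.2, 0.6]]
Y = [[1, 0, 0], [0, 1, 1]]
P = bi_random_walk(MM, MD, Y)
# Rank all microbe-disease pairs by descending association score.
ranking = sorted(((P[i][j], i, j) for i in range(2) for j in range(3)), reverse=True)
```

With a small $\alpha$ such as 0.1 the scores stay close to $P_0$, and the walks mainly spread a little score from known associations to pairs whose microbe or disease is similar to a known one.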
To verify the effectiveness of the present invention, we followed the validation standards used for other algorithms and adopted two validation schemes: (1) five-fold cross-validation; (2) leave-one-out validation. In five-fold cross-validation, the known microbe-disease associations are randomly divided into 5 parts; each part is selected in turn as the test set while the remaining 4 parts form the training set, and the procedure is repeated 100 times. In leave-one-out validation, each known microbe-disease association is selected in turn as the test set, and the remaining associations form the training set. The evaluation metric is the AUC (area under the ROC curve).
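The two ingredients of this evaluation, splitting the known associations into folds and computing an AUC from ranked scores, can be sketched as below. The helper names `auc` and `k_fold_splits` are hypothetical, and the AUC is computed here by direct pairwise comparison of positive and negative scores, which is equivalent to the area under the ROC curve but is not necessarily how the reported values were produced.

```python
import random

def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def k_fold_splits(known_pairs, k=5, seed=0):
    """Yield (train, test) lists of known associations for each of k folds."""
    pairs = list(known_pairs)
    random.Random(seed).shuffle(pairs)
    folds = [pairs[i::k] for i in range(k)]
    for i in range(k):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        yield train, folds[i]

# Hypothetical known associations as (microbe index, disease index) pairs.
known = [(m, d) for m in range(5) for d in range(2)]
splits = list(k_fold_splits(known, k=5))

# Toy check of the AUC helper on four scored pairs.
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

In a full evaluation, the associations in each test fold would be hidden from the association matrix before the similarities and walks are computed, and the held-out pairs would then be scored against the unknown pairs. Leave-one-out validation corresponds to setting `k` to the number of known associations.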
Figure 2 shows the AUC plot for five-fold cross-validation with disease functional similarity and Gaussian kernel similarity integrated. As the figure shows, the present invention (Predicting microbe-disease interactions based on similarities and bi-random walk on the heterogeneous network, abbreviated BRWH-MDI) achieves an AUC of 0.8676, outperforming three other methods that rely on disease Gaussian kernel similarity, disease characterization similarity, and microbe Gaussian kernel similarity (KATZHMDA: 0.8567, HGBI: 0.7762, NBI: 0.5622). In particular, the true positive rate (TPR) is higher when the false positive rate (FPR) is low, which indicates that the top-ranked microbe-disease relationships in the predictions of the present invention are more likely to be correct.
Figure 3 compares the performance of the methods in leave-one-out validation, again with disease functional similarity and Gaussian kernel similarity integrated. The figure shows that BRWH-MDI achieves an AUC of 0.8780, again better than the other three methods (KATZHMDA: 0.8644, HGBI: 0.7866, NBI: 0.5553). Here too the TPR is higher when the FPR is low, which likewise indicates the higher accuracy of the top-ranked microbe-disease relationships in the predictions of the present invention.
As the above application cases show, the present invention can predict new microbe-disease relationships with relatively high accuracy, provide guidance for subsequent biomedical experiments, and help improve the diagnosis and treatment of disease.
Claims (5)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201711293802.6A CN107887023A (en) | 2017-12-08 | 2017-12-08 | A kind of microbial diseases Relationship Prediction method based on similitude and double random walks |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201711293802.6A CN107887023A (en) | 2017-12-08 | 2017-12-08 | A kind of microbial diseases Relationship Prediction method based on similitude and double random walks |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN107887023A true CN107887023A (en) | 2018-04-06 |
Family
ID=61773209
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201711293802.6A Pending CN107887023A (en) | 2017-12-08 | 2017-12-08 | A kind of microbial diseases Relationship Prediction method based on similitude and double random walks |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN107887023A (en) |
Cited By (15)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN108647484A (en) * | 2018-05-17 | 2018-10-12 | 中南大学 | A kind of drug relationship prediction technique integrated based on multiple information with least square method |
| CN108877953A (en) * | 2018-06-06 | 2018-11-23 | 中南大学 | A kind of drug sensitivity prediction method based on more similitude networks |
| CN109920478A (en) * | 2019-03-07 | 2019-06-21 | 中南大学 | A method for predicting microbe-disease relationships based on similarity and low-rank matrix filling |
| CN109935332A (en) * | 2019-03-01 | 2019-06-25 | 桂林电子科技大学 | A miRNA-disease association prediction method based on double random walk model |
| CN110349619A (en) * | 2019-01-17 | 2019-10-18 | 哈尔滨工业大学 | Disease and metabolin network establishing method based on similarity calculation |
| CN110610765A (en) * | 2019-09-10 | 2019-12-24 | 陕西师范大学 | A Method for Predicting Disease-Related Metabolites Using Double Random Walks |
| CN110610763A (en) * | 2019-09-10 | 2019-12-24 | 陕西师范大学 | Prediction method of association between metabolites and diseases based on KATZ model |
| CN111653359A (en) * | 2020-05-30 | 2020-09-11 | 吾征智能技术(北京)有限公司 | Intelligent prediction model construction method and prediction system for hemorrhagic diseases |
| CN112151191A (en) * | 2020-10-16 | 2020-12-29 | 山东管理学院 | Microorganism and disease association relation prediction method and system based on attention mechanism |
| CN112599202A (en) * | 2020-12-24 | 2021-04-02 | 南华大学 | Disease-related miRNA prediction system |
| CN112820347A (en) * | 2021-02-02 | 2021-05-18 | 中南大学 | Disease gene prediction method based on multiple protein network pulse dynamics process |
| CN113192562A (en) * | 2021-05-07 | 2021-07-30 | 中南大学 | Pathogenic gene identification method and system fusing multi-scale module structure information |
| CN113921106A (en) * | 2021-07-16 | 2022-01-11 | 湖南工业大学 | A New Coronary Pneumonia Drug Screening Model Based on Bounded Kernel Norm Regularization |
| CN116052873A (en) * | 2023-01-18 | 2023-05-02 | 齐齐哈尔大学 | A Weighted k-Nearest Neighbor Based Disease-Metabolite Association Prediction System |
| CN119649913A (en) * | 2024-11-28 | 2025-03-18 | 长沙理工大学 | A disease gene identification method and system based on heterogeneous network impulse dynamics |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2007100834A3 (en) * | 2006-02-27 | 2008-07-31 | Univ California | Graph querying, graph motif mining and the discovery of clusters |
| US20130064439A1 (en) * | 2011-02-23 | 2013-03-14 | Siemens Corporation | Systems and Method for Automatic Prostate Localization in MR Images Using Random Walker Segmentation Initialized Via Boosted Classifiers |
| CN105653846A (en) * | 2015-12-25 | 2016-06-08 | 中南大学 | An Ensemble-Based Similarity Measure and Bidirectional Random Walk Method for Drug Relocalization |
Non-Patent Citations (2)
| Title |
|---|
| CHENG YAN ET AL.: ""DNRLMF-MDA:Predicting microRNA-Disease Associations Based on Similarities of microRNAs and Diseases"", 《HTTPS://IEEEXPLORE.IEEE.ORG/DOCUMENT/8118134》 * |
| SHUAI ZOU ET AL.: ""A novel approach for predicting microbe-disease associations by bi-random walk on the heterogeneous network"", 《PLOS ONE》 * |
Cited By (25)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN108647484A (en) * | 2018-05-17 | 2018-10-12 | 中南大学 | A kind of drug relationship prediction technique integrated based on multiple information with least square method |
| CN108647484B (en) * | 2018-05-17 | 2020-10-23 | 中南大学 | A Drug Relationship Prediction Method Based on Multivariate Information Integration and Least Squares |
| CN108877953A (en) * | 2018-06-06 | 2018-11-23 | 中南大学 | A kind of drug sensitivity prediction method based on more similitude networks |
| CN108877953B (en) * | 2018-06-06 | 2021-09-03 | 中南大学 | Drug sensitivity prediction method based on multi-similarity network |
| CN110349619A (en) * | 2019-01-17 | 2019-10-18 | 哈尔滨工业大学 | Disease and metabolin network establishing method based on similarity calculation |
| CN109935332A (en) * | 2019-03-01 | 2019-06-25 | 桂林电子科技大学 | A miRNA-disease association prediction method based on double random walk model |
| CN109920478B (en) * | 2019-03-07 | 2020-12-08 | 中南大学 | A method for predicting microbe-disease relationships based on similarity and low-rank matrix filling |
| CN109920478A (en) * | 2019-03-07 | 2019-06-21 | 中南大学 | A method for predicting microbe-disease relationships based on similarity and low-rank matrix filling |
| CN110610763A (en) * | 2019-09-10 | 2019-12-24 | 陕西师范大学 | Prediction method of association between metabolites and diseases based on KATZ model |
| CN110610765A (en) * | 2019-09-10 | 2019-12-24 | 陕西师范大学 | A Method for Predicting Disease-Related Metabolites Using Double Random Walks |
| CN111653359A (en) * | 2020-05-30 | 2020-09-11 | 吾征智能技术(北京)有限公司 | Intelligent prediction model construction method and prediction system for hemorrhagic diseases |
| CN111653359B (en) * | 2020-05-30 | 2023-09-08 | 吾征智能技术(北京)有限公司 | Intelligent prediction model construction method and prediction system for hemorrhagic disease |
| CN112151191A (en) * | 2020-10-16 | 2020-12-29 | 山东管理学院 | Microorganism and disease association relation prediction method and system based on attention mechanism |
| CN112151191B (en) * | 2020-10-16 | 2023-02-03 | 山东管理学院 | Microorganism and disease association relation prediction method and system based on attention mechanism |
| CN112599202A (en) * | 2020-12-24 | 2021-04-02 | 南华大学 | Disease-related miRNA prediction system |
| CN112599202B (en) * | 2020-12-24 | 2024-04-26 | 南华大学 | Disease-related miRNA prediction system |
| CN112820347A (en) * | 2021-02-02 | 2021-05-18 | 中南大学 | Disease gene prediction method based on multiple protein network pulse dynamics process |
| CN112820347B (en) * | 2021-02-02 | 2023-09-22 | 中南大学 | Disease gene prediction method based on multiple protein network pulse dynamics process |
| CN113192562B (en) * | 2021-05-07 | 2022-05-13 | 中南大学 | Pathogenic gene identification method and system integrating multi-scale modular structural information |
| CN113192562A (en) * | 2021-05-07 | 2021-07-30 | 中南大学 | Pathogenic gene identification method and system fusing multi-scale module structure information |
| CN113921106A (en) * | 2021-07-16 | 2022-01-11 | 湖南工业大学 | A New Coronary Pneumonia Drug Screening Model Based on Bounded Kernel Norm Regularization |
| CN113921106B (en) * | 2021-07-16 | 2025-04-25 | 湖南工业大学 | A drug screening model for pneumonia caused by novel coronavirus infection based on bounded nuclear norm regularization |
| CN116052873A (en) * | 2023-01-18 | 2023-05-02 | 齐齐哈尔大学 | A Weighted k-Nearest Neighbor Based Disease-Metabolite Association Prediction System |
| CN116052873B (en) * | 2023-01-18 | 2024-01-26 | 齐齐哈尔大学 | Disease-metabolite association prediction system based on weight k-nearest neighbor |
| CN119649913A (en) * | 2024-11-28 | 2025-03-18 | 长沙理工大学 | A disease gene identification method and system based on heterogeneous network impulse dynamics |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN107887023A (en) | A kind of microbial diseases Relationship Prediction method based on similitude and double random walks | |
| Wen et al. | Graph neural networks for multimodal single-cell data integration | |
| Montesinos-López et al. | A review of deep learning applications for genomic selection | |
| CN112270958B (en) | Prediction method based on layered deep learning miRNA-lncRNA interaction relationship | |
| Thall | Adaptive enrichment designs in clinical trials | |
| CN109935332A (en) | A miRNA-disease association prediction method based on double random walk model | |
| CN110993113B (en) | LncRNA-disease relation prediction method and system based on MF-SDAE | |
| CN114927162A (en) | Multi-set correlation phenotype prediction method based on hypergraph representation and Dirichlet distribution | |
| CN107862179A (en) | A kind of miRNA disease association Relationship Prediction methods decomposed based on similitude and logic matrix | |
| CN112966114A (en) | Document classification method and device based on symmetric graph convolutional neural network | |
| CN113436684B (en) | Cancer classification and characteristic gene selection method | |
| CN110021341A (en) | A kind of prediction technique of GPCR drug based on heterogeneous network and targeting access | |
| CN110110753A (en) | Effective hybrid characteristic selecting method of pollination algorithm and ReliefF is spent based on elite | |
| He et al. | Human microbe-disease association prediction with graph regularized non-negative matrix factorization | |
| Yang et al. | Predicting circRNA-disease associations based on autoencoder and graph embedding | |
| CN113257359A (en) | CRISPR/Cas9 guide RNA editing efficiency prediction method based on CNN-SVR | |
| CN115171842B (en) | Traditional Chinese medicine recommendation method based on graph neural network based on syndrome information | |
| Shen et al. | Simultaneous genes and training samples selection by modified particle swarm optimization for gene expression data classification | |
| Xia et al. | Multi-strategy dung beetle optimizer for global optimization and feature selection | |
| CN118412146A (en) | Prediction model construction method, prediction method and device for drug combination synergy | |
| CN108427865B (en) | Method for predicting correlation between LncRNA and environmental factors | |
| Nandhini et al. | An optimal stacked ResNet-BiLSTM-based accurate detection and classification of genetic disorders | |
| Wang et al. | Binary grey wolf optimizer with a novel population adaptation strategy for feature selection | |
| CN110942803A (en) | Efficient prediction method for correlation between LncRNA and environmental factors | |
| CN109920478B (en) | A method for predicting microbe-disease relationships based on similarity and low-rank matrix filling |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20180406 |