Quantifying the similarity of human diseases provides guiding insights into the discovery of micro-scale mechanisms from a macro scale. Previous work demonstrated that better performance can be gained by integrating multiview data sources or applying machine learning techniques. However, designing an efficient framework to extract and incorporate information from different biological data using deep learning models remains unexplored. We present CoGO, a Contrastive learning framework to predict disease similarity based on Gene network and Ontology structure, which incorporates the gene interaction network and gene ontology (GO) domain knowledge using graph deep learning models. First, graph deep learning models are applied to encode the features of genes and GO terms from separate graph-structured data. Next, gene and GO features are projected to a common embedding space via a nonlinear projection. Then a cross-view contrastive loss is applied to maximize the agreement of corresponding gene-GO associations and yield meaningful gene representations. Finally, CoGO infers the similarity between diseases as the cosine similarity of disease representation vectors derived from the embeddings of their related genes. In our experiments, CoGO outperforms the most competitive baseline method on both AUROC and AUPRC, with a particularly large improvement of 19.57% in AUPRC (0.7733). The prediction results show significant agreement with other disease similarity studies, which supports their credibility. Furthermore, we conduct a detailed case study of the top similar disease pairs, which are supported by other studies. Empirical results show that CoGO achieves strong performance on the disease similarity problem.
Bioinformatics
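
The last two steps of the CoGO abstract above (cross-view contrastive loss, then cosine similarity between disease vectors) can be illustrated with a minimal sketch. This is not the published implementation: the abstract does not say how gene embeddings are combined into a disease vector, so mean pooling is an assumption here, and the InfoNCE-style loss with one positive GO vector per gene and a temperature of 0.5 is likewise a simplifying assumption. All function names and the toy embeddings are hypothetical.

```python
import math


def mean_pool(vectors):
    """Average a non-empty list of equal-length vectors component-wise."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def disease_similarity(genes_a, genes_b, gene_embedding):
    """Cosine similarity of two diseases, each given as a list of gene IDs.

    Assumption: a disease vector is the mean of its gene embeddings.
    """
    vec_a = mean_pool([gene_embedding[g] for g in genes_a])
    vec_b = mean_pool([gene_embedding[g] for g in genes_b])
    return cosine(vec_a, vec_b)


def cross_view_contrastive_loss(gene_proj, go_proj, temperature=0.5):
    """InfoNCE-style loss over projected gene and GO vectors.

    Assumption: gene_proj[i] and go_proj[i] form the positive pair, and
    every other GO vector acts as a negative for gene i.
    """
    losses = []
    for i, gene_vec in enumerate(gene_proj):
        logits = [cosine(gene_vec, go_vec) / temperature for go_vec in go_proj]
        log_denominator = math.log(sum(math.exp(x) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy, hand-made embeddings (hypothetical values, for illustration only).
gene_embedding = {
    "G1": [1.0, 0.0],
    "G2": [0.0, 1.0],
    "G3": [1.0, 1.0],
}

if __name__ == "__main__":
    print(disease_similarity(["G1", "G2"], ["G3"], gene_embedding))
    print(cross_view_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                      [[1.0, 0.0], [0.0, 1.0]]))
```

In this toy setup, the loss is lower when each gene vector lines up with its own GO vector than when the pairs are swapped, which is the behaviour a contrastive objective is meant to encourage.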

The COVID-19 pandemic has been a major public health event since 2020. Multiple variant strains of SARS-CoV-2, the causative agent of COVID-19, have been detected based on the mutation sites in their sequences. These sequence mutations may lead to changes in protein structures and affect the binding states between SARS-CoV-2 and human proteins. Experimental research on SARS-CoV-2 has accumulated a large amount of structural data and protein-protein interactions (PPIs), but studies of SARS-CoV-2–human PPI networks lack integration of physical associations with possible protein docking information. In addition, docking structures of variant viral proteins with human receptor proteins are still scarce. This study constructed a SARS-CoV-2–human protein–protein interaction network using data integration methods. Crystal structures were collected to map the interaction pairs. The pairs with direct interactions and physical associations were selected and analyzed for variant docking calculations. The study examined the structures of the spike (S) glycoprotein of the Delta B.1.617.2, Omicron BA.1, and Omicron BA.2 variants. The calculated docking structures of S proteins and potential human receptors were obtained. The study integrated binary protein interactions with 3D docking structures to provide an extended view of SARS-CoV-2 proteins from the macro- to the micro-scale.
Biomolecules

Lymphoma is a heterogeneous disease caused by malignant proliferation of lymphocytes, resulting in significant mortality worldwide. While more and more lymphoma biomarkers have been identified with the advent and development of precision medicine, there are currently no databases dedicated to systematically gathering these scattered treasures. Therefore, we developed a lymphoma biomarker database (LBD) to curate experimentally validated lymphoma biomarkers in this study. LBD consists of 793 biomarkers extracted from 978 articles covering diverse subtypes of lymphomas, including 715 single and 78 combined biomarkers. These biomarkers can be categorized into molecular, cellular, image, histopathological, physiological and other biomarkers with various functions such as prognosis, diagnosis and treatment. As a manually curated database that provides comprehensive information about lymphoma biomarkers, LBD is helpful for personalized diagnosis and treatment of lymphoma. Database URL: http://bis.zju.edu.cn/LBD
Database

Recent Talks

  • Analyzing the genes related to Alzheimer’s disease via a network and pathway-based approach
    Zhejiang University-Bielefeld University Joint Symposium - 2019
    Bielefeld University & CeBiTec, Bielefeld, Germany

Recent Posters

  • Network and pathway based analyses of genes associated with Parkinson's disease
    The Seventh National Conference on Bioinformatics & Systems Biology of China and International Workshop on Advanced Bioinformatics & Precision Medicine, 2016
    University of Electronic Science and Technology of China, Chengdu, China
  • Common characteristics of Alzheimer's disease and Parkinson's disease based on AlzGene and PDGene databases
    The Sixth National Conference on Bioinformatics & Systems Biology of China and International Workshop on Advanced Bioinformatics, 2014
    Southeast University, Nanjing, China

Projects

Working

I worked as a research assistant in Prof. Ju Wang's Bioinformatics Lab at Tianjin Medical University from July 2016 to May 2017.

Contact