Yanshi HU

PhD in Bioinformatics

Zhejiang University

Biography

Working at the forefront of interdisciplinary research is incredibly exciting, as it opens up new avenues for pushing the boundaries of our knowledge. My main interests lie in leveraging AI, bioinformatics and computational (systems) biology to advance insights into major complex diseases. Currently, I am fascinated by deciphering large-scale disease comorbidity patterns and, in particular, the comorbidity between Parkinson's disease and Crohn's disease.

I have authored 13 SCI papers, which have garnered extensive attention from academic communities and have been cited in high-impact journals such as Science, Gastroenterology, Pharmacological Reviews, Trends in Cancer, Trends in Neurosciences, Alzheimer's & Dementia, Trends in Genetics, Brain and Nucleic Acids Research. Currently, I serve as an (ad hoc) peer reviewer for Advanced Science and 17 other SCI journals.

Interests

Bioinformatics
Systems Biology
Network Biology
Network Medicine
AI4Biomedicine
Neuroscience

Education

PhD in Bioinformatics, 2023
Zhejiang University

MPhil in Biomedical Engineering, 2016
Tianjin Medical University

BSc in Biomedical Engineering, 2013
Shandong First Medical University

Recent Publications

Complete List of Publications

Mapping the comorbid landscape of Parkinson's disease and Crohn's disease along the gut-blood-brain axis

Parkinson's disease (PD) and Crohn's disease (CD) are primarily localized to the brain and gut, respectively. Nevertheless, epidemiological evidence increasingly links these two seemingly unrelated disorders. Although genomic or transcriptomic efforts have been dedicated to understanding this phenomenon, the precise landscape underlying this comorbidity remains elusive. Here, a systematic multi-omics approach is employed to panoramically map this pathogenic nexus for the first time. By curating a comprehensive genetic corpus related to PD and CD from extensive publications, we uncovered a shared genetic architecture converging on biological functions governing host-pathogen interactions and barrier integrity maintenance. Further, multi-tissue transcriptomic datasets were meta-analyzed to validate genomic insights in transcriptional circumstances, which identified pervasive transcriptional synergies of PD and CD pathways within the blood context, indicating that the CD pathological milieu in blood could create a permissive environment for PD pathogenesis. Finally, delineating the aberrant gut-blood-brain axis through the sequential compromise of gut epithelial barrier, gut-vascular barrier and blood-brain barrier, we revealed a directional cascade where CD intestinal pathology facilitates PD substantia nigra degeneration via blood circulation, establishing a theoretical foundation for preventive and therapeutic interventions for PD and CD comorbidity. Crucially, this study provides a blueprint for dissecting the molecular etiology of comorbidities in other complex diseases affecting disparate anatomical sites.

YS Hu^*, B She, Z Yin, X Yu, W Wu, M Chen^*

medRχiv

DOI Altmetric PDF

AWmeta empowers adaptively-weighted transcriptomic meta-analysis

Transcriptomic meta-analysis enhances biological veracity and reproducibility by integrating diverse studies, yet prevailing P-value or effect-size integration approaches exhibit limited power to resolve subtle signatures. We present AWmeta, an adaptively-weighted framework that unifies both paradigms. Benchmarking across 35 Parkinson's and Crohn's disease datasets spanning diverse tissues and adaptively down-weighting underpowered studies, AWmeta yields higher-fidelity differentially expressed genes (DEGs) with markedly reduced false positives and establishes superior gene differential quantification convergence at both gene and study levels over state-of-the-art random-effects model (REM) and original studies. AWmeta requires fewer samples and DEGs from original studies to achieve substantial gene differential estimates, lowering experimental costs. We demonstrate AWmeta's remarkable stability and robustness against external and internal perturbations. Crucially, AWmeta prioritizes disease tissue-specific mechanisms with higher functional coherence than those from REM and original studies. By bridging statistical rigor with mechanistic interpretability, AWmeta harmonizes heterogeneous transcriptomic data into actionable insights, serving as a transformative tool for precision transcriptomic integration.

YS Hu^*, Z Wang, Y Hu, C Feng, Q Fang, M Chen^*

bioRχiv

DOI Altmetric PDF Code

Benchmarking alternative polyadenylation detection in single-cell and spatial transcriptomes

3’-tag-based sequencing methods have become the predominant approach for single-cell and spatial transcriptomics, with some protocols proven effective in detecting alternative polyadenylation (APA). While numerous computational tools have been developed for APA detection from these sequencing data, the absence of comprehensive benchmarks and the diversity of sequencing protocols and tools make it challenging to select appropriate methods for APA analysis in these contexts. We systematically compared seven 3’-tag-based sequencing protocols and identified key peak features affecting APA detection performance. We developed a simulation pipeline that generates realistic datasets preserving protocol-specific characteristics. Using simulated and real data, we comprehensively assessed six computational tools for their ability to identify polyA sites, quantify polyA site expression, detect differentially expressed (DE) APA genes, filter sequencing artifacts, and their computational efficiency. We also investigated factors influencing APA detection. Our evaluation revealed that SCAPE and scAPAtrap generally outperformed other tools across various performance metrics and protocols. Our systematic evaluation provides guidance for tool selection, experiment design, and future tool development in APA analysis for single-cell and spatial transcriptomics, paving the way for investigating APA in these contexts.

S Li, Z Wang, YS Hu, Q Ni, C Feng, Y Hu, S Zhang, M Chen^*

bioRχiv

DOI PDF Code

ncPlantDB: a plant ncRNA database with potential ncPEP information and cell type-specific interaction

The field of plant non-coding RNAs (ncRNAs) has seen significant advancements in recent years, with many ncRNAs recognized as important regulators of gene expression during plant development and stress responses. Moreover, the coding potential of these ncRNAs, giving rise to ncRNA-encoded peptides (ncPEPs), has emerged as an essential area of study. However, existing plant ncRNA databases lack comprehensive information on ncRNA-encoded peptides (ncPEPs) and cell type-specific interactions. To address this gap, we present ncPlantDB (https://bis.zju.edu.cn/ncPlantDB), a comprehensive database integrating ncRNA and ncPEP data across 43 plant species. ncPlantDB encompasses 353 140 ncRNAs, 3799 ncPEPs and 4 647 071 interactions, sourced from established databases and literature mining. The database offers unique features including translational potential data, cell-specific interaction networks derived from single-cell RNA sequencing and Ribo-seq analyses, and interactive visualization tools. ncPlantDB provides a user-friendly interface for exploring ncRNA expression patterns at the single-cell level, facilitating the discovery of tissue-specific ncRNAs and potential ncPEPs. By integrating diverse data types and offering advanced analytical tools, ncPlantDB serves as a valuable resource for researchers investigating plant ncRNA functions, interactions, and their potential coding capacity. This database significantly enhances our understanding of plant ncRNA biology and opens new avenues for exploring the complex regulatory networks in plant genomics.

L Liu^#, E Liu^#, Y Hu, S Li, S Zhang, H Chao, YS Hu, Y Zhu, Y Chen, L Xie, Y Shen, L Wu, M Chen^*

Nucleic Acids Research

PubMed DOI Altmetric PDF Database

Systematic single-cell analysis reveals dynamic control of transposable element activity orchestrating the endothelial-to-hematopoietic transition

The endothelial-to-hematopoietic transition (EHT) process during definitive hematopoiesis is highly conserved in vertebrates. Stage-specific expression of transposable elements (TEs) has been detected during zebrafish EHT and may promote hematopoietic stem cell (HSC) formation by activating inflammatory signaling. However, little is known about how TEs contribute to the EHT process in human and mouse. We reconstructed the single-cell EHT trajectories of human and mouse and resolved the dynamic expression patterns of TEs during EHT. Most TEs presented a transient co-upregulation pattern along the conserved EHT trajectories, coinciding with the temporal relaxation of epigenetic silencing systems. TE products can be sensed by multiple pattern recognition receptors, triggering inflammatory signaling to facilitate HSC emergence. Interestingly, we observed that hypoxia-related signals were enriched in cells with higher TE expression. Furthermore, we constructed the hematopoietic cis-regulatory network of accessible TEs and identified potential TE-derived enhancers that may boost the expression of specific EHT marker genes. Our study provides a systematic vision of how TEs are dynamically controlled to promote the hematopoietic fate decisions through transcriptional and cis-regulatory networks, and pre-train the immunity of nascent HSCs.

C Feng^#, R Tie^#, S Xin^#, Y Chen, S Li, Y Chen, X Hu, Y Zhou, Y Liu, Y Hu, YS Hu, H Pan, Z Wu, H Chao, S Zhang, Q Ni, J Huang, W Luo^*, H Huang^*, M Chen^*

BMC Biology

PubMed PMC DOI Altmetric PDF Database Code

HALD, a human aging and longevity knowledge graph for precision gerontology and geroscience analyses

Human aging is a natural and inevitable biological process that leads to an increased risk of aging-related diseases. Developing anti-aging therapies for aging-related diseases requires a comprehensive understanding of the mechanisms and effects of aging and longevity from a multi-modal and multi-faceted perspective. However, most of the relevant knowledge is scattered in the biomedical literature, the volume of which reached 36 million in PubMed. Here, we presented HALD, a text mining-based human aging and longevity dataset of the biomedical knowledge graph from all published literature related to human aging and longevity in PubMed. HALD integrated multiple state-of-the-art natural language processing (NLP) techniques to improve the accuracy and coverage of the knowledge graph for precision gerontology and geroscience analyses. Up to September 2023, HALD had contained 12,227 entities in 10 types (gene, RNA, protein, carbohydrate, lipid, peptide, pharmaceutical preparations, toxin, mutation, and disease), 115,522 relations, 1,855 aging biomarkers, and 525 longevity biomarkers from 339,918 biomedical articles in PubMed. HALD is available at https://bis.zju.edu.cn/hald.

Z Wu^#, C Feng^#, YS Hu, Y Zhou, S Li, S Zhang, Y Hu, Yuhao Chen, H Chao, Q Ni, M Chen^*

Scientific Data

PubMed PMC DOI Altmetric PDF Database Code

CoGO: a contrastive learning framework to predict disease similarity based on gene network and ontology structure

Quantifying the similarity of human diseases provides guiding insights to the discovery of micro-scope mechanisms from a macro scale. Previous work demonstrated that better performance can be gained by integrating multiview data sources or applying machine learning techniques. However, designing an efficient framework to extract and incorporate information from different biological data using deep learning models remains unexplored. We present CoGO, a Contrastive learning framework to predict disease similarity based on Gene network and Ontology structure, which incorporates the gene interaction network and gene ontology (GO) domain knowledge using graph deep learning models. First, graph deep learning models are applied to encode the features of genes and GO terms from separate graph structure data. Next, gene and GO features are projected to a common embedding space via a nonlinear projection. Then cross-view contrastive loss is applied to maximize the agreement of corresponding gene-GO associations and lead to meaningful gene representation. Finally, CoGO infers the similarity between diseases by the cosine similarity of disease representation vectors derived from related gene embedding. In our experiments, CoGO outperforms the most competitive baseline method on both AUROC and AUPRC, especially improves 19.57% in AUPRC (0.7733). The prediction results are significantly comparable with other disease similarity studies and thus highly credible. Furthermore, we conduct a detailed case study of top similar disease pairs which is demonstrated by other studies. Empirical results show that CoGO achieves powerful performance in disease similarity problem.

Y Chen^#, YS Hu^#, X Hu, C Feng, M Chen^*

Bioinformatics

PubMed DOI Altmetric PDF Code
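
The final step described in the CoGO abstract above (inferring disease similarity from the cosine similarity of disease vectors built from related gene embeddings) can be illustrated with a minimal sketch. This is not the CoGO implementation: the toy embeddings, the gene and disease names, and the use of mean pooling to turn gene embeddings into a disease vector are all assumptions made for illustration, and the graph encoders and contrastive training that would produce real embeddings are omitted.

```python
import math

def disease_vector(genes, gene_embeddings):
    """Build a disease vector by averaging its genes' embeddings.

    Mean pooling is an assumption of this sketch; the abstract only says the
    disease representation is derived from related gene embeddings.
    """
    vectors = [gene_embeddings[g] for g in genes if g in gene_embeddings]
    if not vectors:
        raise ValueError("no embedded genes for this disease")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)

# Hypothetical 2-dimensional gene embeddings (a trained model would supply these).
gene_embeddings = {
    "GENE_A": [1.0, 0.0],
    "GENE_B": [0.0, 1.0],
    "GENE_C": [1.0, 1.0],
}

# Hypothetical disease-gene associations.
disease_genes = {
    "disease_1": ["GENE_A", "GENE_C"],
    "disease_2": ["GENE_B", "GENE_C"],
}

vec1 = disease_vector(disease_genes["disease_1"], gene_embeddings)
vec2 = disease_vector(disease_genes["disease_2"], gene_embeddings)
similarity = cosine_similarity(vec1, vec2)
print(f"similarity(disease_1, disease_2) = {similarity:.3f}")
```

In this toy setup the two diseases share one gene, so their vectors point in similar but not identical directions and the score falls between 0 and 1.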

Recent Talks

Analyzing the genes related to Alzheimer’s disease via a network and pathway-based approach
YS Hu^*, J Wang^*, M Chen

Zhejiang University-Bielefeld University Joint Symposium - 2019
Bielefeld University & CeBiTec
Bielefeld, Germany

Analyzing the genes related to Alzheimer’s disease via a network and pathway-based approach
YS Hu^*, J Wang^*, M Chen

Integrative Bioinformatics and Systems Biology (WIBSB-2018) DOI:10.18699/WIBSB-2018-28
DOI PDF
Novosibirsk State University & ICG
Novosibirsk, Russia

Recent Posters

Network and pathway based analyses of genes associated with Parkinson's disease
YS Hu, Z Pan, Y Hu, J Wang^*

The Seventh National Conference on Bioinformatics & Systems Biology of China and International Workshop on Advanced Bioinformatics & Precision Medicine, 2016
DOI PDF
University of Electronic Science and Technology of China
Chengdu, China

Common characteristics of Alzheimer's disease and Parkinson's disease based on AlzGene and PDGene databases
YS Hu, R Fan, X Li, M Liu, X Liu, X Yi, T Zhang, J Wang^*

The Sixth National Conference on Bioinformatics & Systems Biology of China and International Workshop on Advanced Bioinformatics, 2014
DOI PDF
Southeast University
Nanjing, China

Projects

Working

I worked as a research assistant in Prof. Ju Wang's Bioinformatics Lab at Tianjin Medical University from July 2016 to May 2017. Currently, I am a research associate in Prof. Ming Chen's Lab of Bioinformatics in the College of Life Sciences at Zhejiang University.

Contact

yanshihu AT zju.edu.cn
Zhejiang University