26 Oct 2020

NATURE

Authors: Saee Paliwal, Alex de Giorgio, Daniel Neil, Jean-Baptiste Michel, Alix M.B. Lacoste

Abstract

Incorrect drug target identification is a major obstacle in drug discovery. Only 15% of drugs advance from Phase II to approval, with ineffective targets accounting for over 50% of these failures. Advances in data fusion and in computational modeling have each made independent progress on this problem. Here, we capitalize on both approaches with Rosalind, a comprehensive gene prioritization method that combines heterogeneous knowledge graph construction with relational inference via tensor factorization to accurately predict disease-gene links. Rosalind demonstrates an increase in performance of 18% to 50% over five comparable state-of-the-art algorithms. On historical data, Rosalind prospectively identifies 1 in 4 therapeutic relationships eventually proven true. Beyond efficacy, Rosalind accurately predicts clinical trial successes (75% recall at rank 200) and distinguishes likely failures (74% recall at rank 200). Lastly, Rosalind predictions were experimentally tested in a patient-derived in vitro assay for rheumatoid arthritis (RA), which yielded 5 promising genes, one of which is unexplored in RA.
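
For readers unfamiliar with the "recall at rank 200" figures quoted above, the metric can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function name `recall_at_rank`, the placeholder gene labels, and the toy ranking are all assumptions introduced here. It only assumes that a model produces a ranked list of candidate genes for a disease and that a set of known true genes is available for comparison.

```python
def recall_at_rank(ranked_genes, true_genes, k):
    """Return the fraction of true_genes that appear in the top k of ranked_genes.

    ranked_genes: list of gene identifiers, best-scoring first (hypothetical input).
    true_genes:   set of gene identifiers known to be correct for the disease.
    k:            rank cutoff (the abstract reports results at k = 200).
    """
    if not true_genes:
        raise ValueError("true_genes must be non-empty")
    top_k = set(ranked_genes[:k])
    hits = sum(1 for gene in true_genes if gene in top_k)
    return hits / len(true_genes)


# Toy example with placeholder labels (not real data).
ranked = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
known = {"GENE_B", "GENE_E", "GENE_Z"}

recall_top3 = recall_at_rank(ranked, known, k=3)
recall_top5 = recall_at_rank(ranked, known, k=5)
```

In this toy case one of the three known genes falls within the top 3 and two fall within the top 5, so recall rises as the cutoff grows. A value of 75% at rank 200 would mean that three quarters of the relevant genes appear among the first 200 predictions.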

