University of California, San Francisco
Exec Dir & CTO
An original “Big Data” leader and software executive at pioneering companies, who prefers to go where Data Science (and computation) have yet to arrive… but will. Spent the last several years with Data Science for Medicine and Public Health.
Talks and Events
2022 Talk: SPOKE – An Open Knowledge Network for Precision Medicine
A human’s health is predicated on a vast range of factors, from the molecular, up through genes and their expression throughout the body (the “endome”), to external factors such as drug interventions, the environment, social factors and behavior (the “exome”). Information about these factors sits in thousands of isolated data sets. Fully understanding and treating each person, the goal of Precision Medicine, is far beyond the capability of any human.
SPOKE, a graph-theoretic and open database, connects information from 41 specialized databases, structured as 21 different node types and 68 edge types across multiple specialties, ranging from molecular and cellular biology to pharmacology and clinical practice. SPOKE was conceived with the philosophy that if relevant information is connected, knowledge can emerge, leading to otherwise unattainable insights in understanding diseases, discovering drugs, curing diseases and proactively improving personal health. With over 70 million entities and relationships of curated, scientifically accepted “truths,” so far focused on the endome, SPOKE forms the basis of research inquiry and clinical treatments with deep biological knowledge – especially for complex conditions like Parkinson’s Disease or in-hospital sepsis. When the COVID-19 pandemic struck, and SARS-CoV-2 was sequenced in early 2020, SPOKE could immediately see the potential of later-approved treatments such as dexamethasone, or the biological cause of bradykinin storms and their effect on the pulmonary system.
Deriving new and relevant knowledge from a graph of this scale presents computational and algorithmic challenges, which are the focus of our current studies with collaborators. Traversing the graph to understand the most relevant paths between pairs of entities (e.g., from a drug compound to a disease condition, or a food item to effects on an organ) is one study. Comparing relationships found in the scientific publications corpus, which are peer-reviewed but not (yet) scientifically accepted, is another active area; it guides both the further development of SPOKE and the researcher who seeks to enhance their hypotheses or study with deeper knowledge. Creating the most effective methods for these challenges is an opportunity for algorithmic and data scientists.
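As a rough illustration of the path-traversal problem described above, the following Python sketch enumerates paths between two entities in a tiny typed graph. It is not SPOKE's API or schema: the node names, edge types, the `find_paths` helper and the hop limit are all hypothetical, and "relevance" is reduced here to "fewer hops," which is a simplifying assumption.

```python
# Hypothetical toy graph as (source, edge_type, target) triples.
# These entities and edge types are illustrative only, not SPOKE data.
EDGES = [
    ("Compound:A", "BINDS", "Gene:G1"),
    ("Compound:A", "BINDS", "Gene:G2"),
    ("Gene:G1", "ASSOCIATES", "Disease:D"),
    ("Gene:G2", "PARTICIPATES", "Pathway:P"),
    ("Pathway:P", "ASSOCIATES", "Disease:D"),
]


def build_adjacency(edges):
    """Index edges by node; each edge is treated as traversable both ways."""
    adj = {}
    for source, edge_type, target in edges:
        adj.setdefault(source, []).append((edge_type, target))
        adj.setdefault(target, []).append((edge_type, source))
    return adj


def find_paths(adj, start, goal, max_hops=3):
    """Enumerate simple paths from start to goal with at most max_hops edges.

    Each path is a list alternating node, edge_type, node, ...
    Paths are returned shortest first (assumed proxy for relevance).
    """
    paths = []
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            paths.append(path)
            continue
        if len(path) // 2 >= max_hops:  # number of edges used so far
            continue
        visited = path[::2]  # every other element is a node
        for edge_type, neighbor in adj.get(node, []):
            if neighbor not in visited:
                stack.append((neighbor, path + [edge_type, neighbor]))
    return sorted(paths, key=len)


if __name__ == "__main__":
    adjacency = build_adjacency(EDGES)
    for p in find_paths(adjacency, "Compound:A", "Disease:D"):
        print(" -> ".join(p))
```

Exhaustive enumeration like this is only workable on toy inputs. On a graph with tens of millions of entities and relationships, the number of candidate paths grows quickly with the hop limit, which is part of why ranking and pruning methods are an open algorithmic question.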