Research Identifiers
ORCID iD:
https://orcid.org/0000-0002-8041-0370
Education and qualifications (3)
University of California, San Diego: CA, CA, US
Organization identifiers
FUNDREF:
http://dx.doi.org/10.13039/100007911
University of California, San Diego : CA, CA, US
Department
Chemistry
Added
2022-09-12
Last modified
2022-09-12
Source:
Vincent T. Metzger
University of California San Diego: San Diego, California, US
Organization identifiers
University of California San Diego : San Diego, California, US
Other organization identifiers provided by ROR
FUNDREF:
100007911 (preferred), 100009507, 100009508, 100005918, 100005548, 100008673, 100005594, 100011077, 100011108, 100011109, 100011114, 100012378, 100012424, 100013844, 100013847, 100020698
GRID:
grid.266100.3 (preferred)
ISNI:
0000000121074242
WIKIDATA:
Q622664
WIKIPEDIA_URL:
http://en.wikipedia.org/wiki/University_of_California,_San_Diego (preferred)
Department
Chemistry
Added
2026-01-09
Last modified
2026-01-09
Source:
Vincent T. Metzger
University of New Mexico: Albuquerque, New Mexico, US
Department
Chemistry
Added
2022-09-12
Last modified
2022-09-12
Source:
Vincent T. Metzger
KG2ML: integrating knowledge graphs and positive unlabeled learning for identifying disease-associated genes
Frontiers in Bioinformatics
2026-01-08 | Journal article
Contributors:
Praveen Kumar; Vincent T. Metzger; Swastika T. Purushotham; Priyansh Kedia; Cristian G. Bologa (and 2 more)
Contributors
Praveen Kumar
(Author)
Vincent T. Metzger
(Author)
Swastika T. Purushotham
(Author)
Priyansh Kedia
(Author)
Cristian G. Bologa
(Author)
Christophe G. Lambert
(Author)
Jeremy J. Yang
(Author)
External identifiers
ISSN:
2673-7647
Abstract
Background
Biomedical knowledge graphs (KGs), such as the Data Distillery Knowledge Graph (DDKG), capture known relationships among entities (e.g., genes, diseases, proteins), providing valuable insights for research. However, these relationships are typically derived from prior studies, leaving potential unknown associations unexplored. Identifying such unknown associations, including previously unknown disease-associated genes, remains a critical challenge in bioinformatics and is crucial for advancing biomedical knowledge.
Methods
Traditional methods, such as linkage analysis and genome-wide association studies (GWAS), can be time-consuming and resource-intensive. This highlights the need for efficient computational approaches to identify or predict new genes using known disease-gene associations. Recently, network-based methods and KGs, enhanced by advances in machine learning (ML) frameworks, have emerged as promising tools for inferring these unexplored associations. Given the technical limitations of the Neo4j Graph Data Science (GDS) machine learning pipeline, we developed a novel machine learning pipeline called KG2ML (Knowledge Graph to Machine Learning). This pipeline utilizes our Positive and Unlabeled (PU) learning algorithm, PULSCAR (Positive Unlabeled Learning Selected Completely At Random), and incorporates path-based feature extraction from ProteinGraphML.
Results
KG2ML was applied to 12 diseases, including Bipolar Disorder, Coronary Artery Disease, and Parkinson’s Disease, to infer disease-associated genes not explicitly recorded in DDKG. For several of these diseases, 14 out of the 15 top-ranked genes lacked prior explicit associations in the DDKG but were supported by literature and TINX (Target Importance and Novelty Explorer) evidence. Incorporating PULSCAR-imputed genes as positives enhanced XGBoost classification, demonstrating the potential of PU learning in identifying hidden gene-disease relationships.
Conclusion
The observed improvement in classification performance after the inclusion of PULSCAR-imputed genes as positive examples, along with the subject matter experts’ (SME) evaluations of the top 15 imputed genes for 12 diseases, suggests that PU learning can effectively uncover disease-gene associations missing from existing knowledge graphs (KGs). By integrating KG data with ML-based inference, our KG2ML pipeline provides a scalable and interpretable framework to advance biomedical research while addressing the inherent limitations of current KGs.
Added
2026-01-08
Last modified
2026-01-09
Source:
Vincent T. Metzger
Unsupervised Latent Pattern Analysis for Estimating Type 2 Diabetes Risk in Undiagnosed Populations
2025-10-12 | Conference paper
Contributors:
Praveen Kumar; Vincent T. Metzger; Scott Alexander Malec
Contributors
Praveen Kumar
(Author)
Vincent T. Metzger
(Author)
Scott Alexander Malec
(Author)
Added
2025-12-10
Last modified
2026-01-09
Source:
Crossref
The Data Distillery: A Graph Framework for Semantic Integration and Querying of Biomedical Data
2025-08-15 | Preprint
Contributors:
Taha Mohseni Ahooyi; Benjamin Stear; J. Alan Simmons; Vincent T. Metzger; Praveen Kumar (and 39 more)
Contributors
Taha Mohseni Ahooyi
(Author)
Benjamin Stear
(Author)
J. Alan Simmons
(Author)
Vincent T. Metzger
(Author)
Praveen Kumar
(Author)
John Erol Evangelista
(Author)
Daniel J. B. Clarke
(Author)
Zhuorui Xie
(Author)
Heesu Kim
(Author)
Sherry L. Jenkins
(Author)
Mano R. Maurya
(Author)
Srinivasan Ramachandran
(Author)
Eoin Fahy
(Author)
Thomas H. Gillespie
(Author)
Fahim T. Imam
(Author)
Natallia Kokash
(Author)
Matthew E. Roth
(Author)
Robert Fullem
(Author)
Dubravka Jevtic
(Author)
Aleks Mihajlovic
(Author)
Michael Tiemeyer
(Author)
Clara Bakker
(Author)
Andrew J. Schroeder
(Author)
Julia Markowski
(Author)
Jared Nedzel
(Author)
Dave D. Hill
(Author)
James Terry
(Author)
Christopher Nemarich
(Author)
Jyl Boline
(Author)
Peter J. Park
(Author)
[ORCID: 0000-0001-9378-960X]
Kristin G. Ardlie
(Author)
Jeet Vora
(Author)
Raja Mazumder
(Author)
Rene Ranzinger
(Author)
Bernard de Bono
(Author)
Shankar Subramaniam
(Author)
Jeffrey S. Grethe
(Author)
Jeremy J. Yang
(Author)
Christophe G. Lambert
(Author)
Adam Resnick
(Author)
Aleks Milosavljevic
(Author)
Avi Ma’ayan
(Author)
Jonathan C. Silverstein
(Author)
Deanne M. Taylor
(Author)
[ORCID: 0000-0002-3302-4610]
Abstract
The Data Distillery Knowledge Graph (DDKG) is a framework for semantic integration and querying of biomedical data across domains. Built for the NIH Common Fund Data Ecosystem, it supports translational research by linking clinical and experimental datasets in a unified graph model. Clinical standards such as ICD-10, SNOMED, and DrugBank are integrated through UMLS, while genomics and basic science data are structured using ontologies and standards such as HPO, GENCODE, Ensembl, STRING, and ClinVar. The DDKG uses a property graph architecture based on the UBKG infrastructure and supports ontology-based ingestion, identifier normalization, and graph-native querying. The system is modular and can be extended with new datasets or schema modules. We demonstrate its utility for informatics queries across eight use cases, including regulatory variant analysis, tissue-specific expression, biomarker discovery, and cross-species variant prioritization. The DDKG is accessible via a public interface, a programmatic API, and downloadable builds for local use.
Added
2026-01-08
Last modified
2026-01-09
Source:
Vincent T. Metzger
TIN-X version 3: update with expanded dataset and modernized architecture for enhanced illumination of understudied targets.
PeerJ
2024-06-25 | Journal article
DOI:
10.7717/peerj.17470
Contributors:
Metzger VT; Cannon DC; Yang JJ; Mathias SL; Bologa CG (and 9 more)
Contributors
Metzger VT
(Author)
Cannon DC
(Author)
Yang JJ
(Author)
[ORCID: 0000-0002-1476-6192]
Mathias SL
(Author)
Bologa CG
(Author)
[ORCID: 0000-0003-2232-4244]
Waller A
(Author)
[ORCID: 0000-0003-3676-2501]
Schürer SC
(Author)
[ORCID: 0000-0001-7180-0978]
Vidović D
(Author)
[ORCID: 0000-0001-9798-2108]
Kelleher KJ
(Author)
[ORCID: 0000-0002-8878-1539]
Sheils TK
(Author)
Jensen LJ
(Author)
Lambert CG
(Author)
[ORCID: 0000-0003-1994-2893]
Oprea TI
(Author)
Edwards JS
(Author)
External identifiers
PMID:
38948230
DOI:
10.7717/peerj.17470
Abstract
TIN-X (Target Importance and Novelty eXplorer) is an interactive visualization tool for illuminating associations between diseases and potential drug targets and is publicly available at newdrugtargets.org. TIN-X uses natural language processing to identify disease and protein mentions within PubMed content using previously published tools for named entity recognition (NER) of gene/protein and disease names. Target data is obtained from the Target Central Resource Database (TCRD). Two important metrics, novelty and importance, are computed from this data and, when plotted as log(importance) vs. log(novelty), aid the user in visually exploring the novelty of drug targets and their associated importance to diseases. TIN-X Version 3.0 has been significantly improved with an expanded dataset, modernized architecture including a REST API, and an improved user interface (UI). The dataset has been expanded to include not only PubMed publication titles and abstracts, but also full-text articles when available. This results in approximately 9-fold more target/disease associations compared to previous versions of TIN-X. Additionally, the TIN-X database containing this expanded dataset is now hosted in the cloud via Amazon RDS. Recent enhancements to the UI focus on making it more intuitive for users to find diseases or drug targets of interest while providing a new, sortable table-view mode to accompany the existing plot-view mode. UI improvements also help the user browse the associated PubMed publications to explore and understand the basis of TIN-X's predicted association between a specific disease and a target of interest. While implementing these upgrades, computational resources are balanced between the webserver and the user's web browser to achieve adequate performance while accommodating the expanded dataset.
Together, these advances aim to extend the duration that users can benefit from TIN-X while providing both an expanded dataset and new features that researchers can use to better illuminate understudied proteins.
Added
2026-01-08
Last modified
2026-01-09
Source:
Vincent T. Metzger
Overview of the Knowledge Management Center for Illuminating the Druggable Genome
Drug Discovery Today
2024-03 | Journal article
Contributors:
Tudor I. Oprea; Cristian Bologa; Jayme Holmes; Stephen Mathias; Vincent T. Metzger (and 9 more)
Contributors
Tudor I. Oprea
(Author)
Cristian Bologa
(Author)
Jayme Holmes
(Author)
Stephen Mathias
(Author)
Vincent T. Metzger
(Author)
Anna Waller
(Author)
Jeremy J. Yang
(Author)
Andrew R. Leach
(Author)
Lars Juhl Jensen
(Author)
Keith J. Kelleher
(Author)
Timothy K. Sheils
(Author)
Ewy Mathé
(Author)
Sorin Avram
(Author)
Jeremy S. Edwards
(Author)
[ORCID: 0000-0003-3694-3716]
External identifiers
ISSN:
1359-6446
Added
2026-01-08
Last modified
2026-01-08
Source:
Vincent T. Metzger
Pharos 2023: an integrated resource for the understudied human proteome
Nucleic Acids Research
2023-01-06 | Journal article
DOI:
10.1093/nar/gkac1033
Contributors:
Keith J Kelleher; Timothy K Sheils; Stephen L Mathias; Jeremy J Yang; Vincent T Metzger (and 12 more)
Contributors
Keith J Kelleher
(Author)
Timothy K Sheils
(Author)
Stephen L Mathias
(Author)
Jeremy J Yang
(Author)
Vincent T Metzger
(Author)
Vishal B Siramshetty
(Author)
Dac-Trung Nguyen
(Author)
Lars Juhl Jensen
(Author)
Dušica Vidović
(Author)
Stephan C Schürer
(Author)
Jayme Holmes
(Author)
Karlie R Sharma
(Author)
Ajay Pillai
(Author)
Cristian G Bologa
(Author)
Jeremy S Edwards
(Author)
Ewy A Mathé
(Author)
Tudor I Oprea
(Author)
External identifiers
DOI:
10.1093/nar/gkac1033
Added
2022-11-29
Last modified
2023-01-08
Source:
Crossref
Getting Started with the IDG KMC Datasets and Tools
Current Protocols
2022-01 | Journal article
DOI:
10.1002/cpz1.355
Contributors:
Vincent Metzger
Contributors
Vincent Metzger
(Author)
[ORCID: 0000-0002-8041-0370]
Added
2022-09-12
Last modified
2022-09-12
Source:
Vincent T. Metzger
Electrostatic Channeling in P. falciparum DHFR-TS: Brownian Dynamics and Smoluchowski Modeling
Contributors
Vincent Metzger
(Author)
[ORCID: 0000-0002-8041-0370]
External identifiers
ISSN:
0006-3495
Added
2022-09-12
Last modified
2022-09-12
Source:
Vincent T. Metzger
A model study of sequential enzyme reactions and electrostatic channeling
The Journal of Chemical Physics
2014-03-14 | Journal article
DOI:
10.1063/1.4867286
Contributors:
Vincent Metzger
Contributors
Vincent Metzger
(Author)
[ORCID: 0000-0002-8041-0370]
Added
2022-09-12
Last modified
2022-09-12
Source:
Vincent T. Metzger
Activation and dynamic network of the M2 muscarinic receptor
Proceedings of the National Academy of Sciences
2013-07-02 | Journal article
Contributors:
Vincent Metzger
Contributors
Vincent Metzger
(Author)
[ORCID: 0000-0002-8041-0370]
Abstract
G-protein-coupled receptors (GPCRs) mediate cellular responses to various hormones and neurotransmitters and are important targets for treating a wide spectrum of diseases. Although significant advances have been made in structural studies of GPCRs, details of their activation mechanism remain unclear. The X-ray crystal structure of the M2 muscarinic receptor, a key GPCR that regulates human heart rate and contractile forces of cardiomyocytes, was determined recently in an inactive antagonist-bound state. Here, activation of the M2 receptor is directly observed via accelerated molecular dynamics simulation, in contrast to previous microsecond-timescale conventional molecular dynamics simulations in which the receptor remained inactive. Receptor activation is characterized by formation of a Tyr206<sup>5.58</sup>–Tyr440<sup>7.53</sup> hydrogen bond and ∼6-Å outward tilting of the cytoplasmic end of transmembrane α-helix 6, preceded by relocation of Trp400<sup>6.48</sup> toward Phe195<sup>5.47</sup> and Val199<sup>5.51</sup> and flipping of Tyr430<sup>7.43</sup> away from the ligand-binding cavity. Network analysis reveals that communication in the intracellular domains is greatly weakened during activation of the receptor. Together with the finding that residue motions in the ligand-binding and G-protein-coupling sites of the apo receptor are correlated, this result highlights a dynamic network for allosteric regulation of the M2 receptor activation.
Added
2022-09-12
Last modified
2022-09-12
Source:
Vincent T. Metzger
Calcium binding and allosteric signaling mechanisms for the sarcoplasmic reticulum Ca<sup>2+</sup>ATPase
Protein Science
2012-10 | Journal article
DOI:
10.1002/pro.2129
Contributors:
Vincent Metzger
Contributors
Vincent Metzger
(Author)
[ORCID: 0000-0002-8041-0370]
External identifiers
DOI:
10.1002/pro.2129
ISSN:
0961-8368
Added
2022-09-12
Last modified
2022-09-12
Source:
Vincent T. Metzger
Autonomous Targeting of Infectious Superspreaders Using Engineered Transmissible Therapies
PLoS Computational Biology
2011-03-17 | Journal article
Contributors:
Vincent Metzger
Contributors
Vincent Metzger
(Author)
[ORCID: 0000-0002-8041-0370]
External identifiers
ISSN:
1553-7358
Added
2022-09-12
Last modified
2026-01-09
Source:
Vincent T. Metzger
