Research Identifiers

Education and qualifications (3)

University of California, San Diego: CA, CA, US

2009-08-01 to 2014-07-23 | Ph.D. (Chemistry)
Education

Organization identifiers

FUNDREF: http://dx.doi.org/10.13039/100007911
University of California, San Diego : CA, CA, US

Department

Chemistry

Added

2022-09-12

Last modified

2022-09-12
Source: Vincent T. Metzger

University of California San Diego: San Diego, California, US

2009-08-01 to 2014-04-23 | M.S. (Chemistry)
Education

Organization identifiers

University of California San Diego : San Diego, California, US

Department

Chemistry

Added

2026-01-09

Last modified

2026-01-09
Source: Vincent T. Metzger

University of New Mexico: Albuquerque, New Mexico, US

2005-08-01 to 2009-05-25 | BA (Chemistry)
Education

Department

Chemistry

Added

2022-09-12

Last modified

2022-09-12
Source: Vincent T. Metzger

KG2ML: integrating knowledge graphs and positive unlabeled learning for identifying disease-associated genes

Frontiers in Bioinformatics
2026-01-08 | Journal article
Contributors: Praveen Kumar; Vincent T. Metzger; Swastika T. Purushotham; Priyansh Kedia; Cristian G. Bologa (and 2 more)

Contributors

Praveen Kumar (Author)
Vincent T. Metzger (Author)
Swastika T. Purushotham (Author)
Priyansh Kedia (Author)
Cristian G. Bologa (Author)
Christophe G. Lambert (Author)
Jeremy J. Yang (Author)

External identifiers

Abstract


Background
Biomedical knowledge graphs (KGs), such as the Data Distillery Knowledge Graph (DDKG), capture known relationships among entities (e.g., genes, diseases, proteins), providing valuable insights for research. However, these relationships are typically derived from prior studies, leaving potential unknown associations unexplored. Identifying such unknown associations, including previously unknown disease-associated genes, remains a critical challenge in bioinformatics and is crucial for advancing biomedical knowledge.


Methods
Traditional methods, such as linkage analysis and genome-wide association studies (GWAS), can be time-consuming and resource-intensive. This highlights the need for efficient computational approaches to identify or predict new genes using known disease-gene associations. Recently, network-based methods and KGs, enhanced by advances in machine learning (ML) frameworks, have emerged as promising tools for inferring these unexplored associations. Given the technical limitations of the Neo4j Graph Data Science (GDS) machine learning pipeline, we developed a novel machine learning pipeline called KG2ML (Knowledge Graph to Machine Learning). This pipeline utilizes our Positive and Unlabeled (PU) learning algorithm, PULSCAR (Positive Unlabeled Learning Selected Completely At Random), and incorporates path-based feature extraction from ProteinGraphML.


Results
KG2ML was applied to 12 diseases, including Bipolar Disorder, Coronary Artery Disease, and Parkinson’s Disease, to infer disease-associated genes not explicitly recorded in DDKG. For several of these diseases, 14 out of the 15 top-ranked genes lacked prior explicit associations in the DDKG but were supported by literature and TINX (Target Importance and Novelty Explorer) evidence. Incorporating PULSCAR-imputed genes as positives enhanced XGBoost classification, demonstrating the potential of PU learning in identifying hidden gene-disease relationships.


Conclusion
The observed improvement in classification performance after the inclusion of PULSCAR-imputed genes as positive examples, along with the subject matter experts’ (SME) evaluations of the top 15 imputed genes for 12 diseases, suggests that PU learning can effectively uncover disease-gene associations missing from existing knowledge graphs (KGs). By integrating KG data with ML-based inference, our KG2ML pipeline provides a scalable and interpretable framework to advance biomedical research while addressing the inherent limitations of current KGs.

Added

2026-01-08

Last modified

2026-01-09
Source: Vincent T. Metzger

Unsupervised Latent Pattern Analysis for Estimating Type 2 Diabetes Risk in Undiagnosed Populations

2025-10-12 | Conference paper
Contributors: Praveen Kumar; Vincent T. Metzger; Scott Alexander Malec

Contributors

Praveen Kumar (Author)
Vincent T. Metzger (Author)
Scott Alexander Malec (Author)

External identifiers

Added

2025-12-10

Last modified

2026-01-09
Source: Validated source Crossref

The Data Distillery: A Graph Framework for Semantic Integration and Querying of Biomedical Data

2025-08-15 | Preprint
Contributors: Taha Mohseni Ahooyi; Benjamin Stear; J. Alan Simmons; Vincent T. Metzger; Praveen Kumar (and 39 more)

Contributors

Taha Mohseni Ahooyi (Author)
Benjamin Stear (Author)
J. Alan Simmons (Author)
Vincent T. Metzger (Author)
Praveen Kumar (Author)
John Erol Evangelista (Author)
Daniel J. B. Clarke (Author)
Zhuorui Xie (Author)
Heesu Kim (Author)
Sherry L. Jenkins (Author)
Mano R. Maurya (Author)
Srinivasan Ramachandran (Author)
Eoin Fahy (Author)
Thomas H. Gillespie (Author)
Fahim T. Imam (Author)
Natallia Kokash (Author)
Matthew E. Roth (Author)
Robert Fullem (Author)
Dubravka Jevtic (Author)
Aleks Mihajlovic (Author)
Michael Tiemeyer (Author)
Clara Bakker (Author)
Andrew J. Schroeder (Author)
Julia Markowski (Author)
Jared Nedzel (Author)
Dave D. Hill (Author)
James Terry (Author)
Christopher Nemarich (Author)
Jyl Boline (Author)
Peter J. Park (Author) [ORCID: 0000-0001-9378-960X]
Kristin G. Ardlie (Author)
Jeet Vora (Author)
Raja Mazumder (Author)
Rene Ranzinger (Author)
Bernard de Bono (Author)
Shankar Subramaniam (Author)
Jeffrey S. Grethe (Author)
Jeremy J. Yang (Author)
Christophe G. Lambert (Author)
Adam Resnick (Author)
Aleks Milosavljevic (Author)
Avi Ma’ayan (Author)
Jonathan C. Silverstein (Author)
Deanne M. Taylor (Author) [ORCID: 0000-0002-3302-4610]

External identifiers

Abstract

The Data Distillery Knowledge Graph (DDKG) is a framework for semantic integration and querying of biomedical data across domains. Built for the NIH Common Fund Data Ecosystem, it supports translational research by linking clinical and experimental datasets in a unified graph model. Clinical standards such as ICD-10, SNOMED, and DrugBank are integrated through UMLS, while genomics and basic science data are structured using ontologies and standards such as HPO, GENCODE, Ensembl, STRING, and ClinVar. The DDKG uses a property graph architecture based on the UBKG infrastructure and supports ontology-based ingestion, identifier normalization, and graph-native querying. The system is modular and can be extended with new datasets or schema modules. We demonstrate its utility for informatics queries across eight use cases, including regulatory variant analysis, tissue-specific expression, biomarker discovery, and cross-species variant prioritization. The DDKG is accessible via a public interface, a programmatic API, and downloadable builds for local use.

Added

2026-01-08

Last modified

2026-01-09
Source: Vincent T. Metzger

TIN-X version 3: update with expanded dataset and modernized architecture for enhanced illumination of understudied targets.

PeerJ
2024-06-25 | Journal article
Contributors: Metzger VT; Cannon DC; Yang JJ; Mathias SL; Bologa CG (and 9 more)

Contributors

Metzger VT (Author)
Cannon DC (Author)
Yang JJ (Author) [ORCID: 0000-0002-1476-6192]
Mathias SL (Author)
Bologa CG (Author) [ORCID: 0000-0003-2232-4244]
Waller A (Author) [ORCID: 0000-0003-3676-2501]
Schürer SC (Author) [ORCID: 0000-0001-7180-0978]
Vidović D (Author) [ORCID: 0000-0001-9798-2108]
Kelleher KJ (Author) [ORCID: 0000-0002-8878-1539]
Sheils TK (Author)
Jensen LJ (Author)
Lambert CG (Author) [ORCID: 0000-0003-1994-2893]
Oprea TI (Author)
Edwards JS (Author)

External identifiers

Abstract

TIN-X (Target Importance and Novelty eXplorer) is an interactive visualization tool for illuminating associations between diseases and potential drug targets and is publicly available at newdrugtargets.org. TIN-X uses natural language processing to identify disease and protein mentions within PubMed content using previously published tools for named entity recognition (NER) of gene/protein and disease names. Target data is obtained from the Target Central Resource Database (TCRD). Two important metrics, novelty and importance, are computed from this data and, when plotted as log(importance) vs. log(novelty), aid the user in visually exploring the novelty of drug targets and their associated importance to diseases. TIN-X Version 3.0 has been significantly improved with an expanded dataset, modernized architecture including a REST API, and an improved user interface (UI). The dataset has been expanded to include not only PubMed publication titles and abstracts, but also full-text articles when available. This results in approximately 9-fold more target/disease associations compared to previous versions of TIN-X. Additionally, the TIN-X database containing this expanded dataset is now hosted in the cloud via Amazon RDS. Recent enhancements to the UI focus on making it more intuitive for users to find diseases or drug targets of interest while providing a new, sortable table-view mode to accompany the existing plot-view mode. UI improvements also help the user browse the associated PubMed publications to explore and understand the basis of TIN-X's predicted association between a specific disease and a target of interest. While implementing these upgrades, computational resources are balanced between the webserver and the user's web browser to achieve adequate performance while accommodating the expanded dataset. Together, these advances aim to extend the duration that users can benefit from TIN-X while providing both an expanded dataset and new features that researchers can use to better illuminate understudied proteins.

Added

2026-01-08

Last modified

2026-01-09
Source: Vincent T. Metzger

Overview of the Knowledge Management Center for Illuminating the Druggable Genome

Drug Discovery Today
2024-03 | Journal article
Contributors: Tudor I. Oprea; Cristian Bologa; Jayme Holmes; Stephen Mathias; Vincent T. Metzger (and 9 more)

Contributors

Tudor I. Oprea (Author)
Cristian Bologa (Author)
Jayme Holmes (Author)
Stephen Mathias (Author)
Vincent T. Metzger (Author)
Anna Waller (Author)
Jeremy J. Yang (Author)
Andrew R. Leach (Author)
Lars Juhl Jensen (Author)
Keith J. Kelleher (Author)
Timothy K. Sheils (Author)
Ewy Mathé (Author)
Sorin Avram (Author)
Jeremy S. Edwards (Author) [ORCID: 0000-0003-3694-3716]

External identifiers

Added

2026-01-08

Last modified

2026-01-08
Source: Vincent T. Metzger

Pharos 2023: an integrated resource for the understudied human proteome

Nucleic Acids Research
2023-01-06 | Journal article
Contributors: Keith J Kelleher; Timothy K Sheils; Stephen L Mathias; Jeremy J Yang; Vincent T Metzger (and 12 more)

Contributors

Keith J Kelleher (Author)
Timothy K Sheils (Author)
Stephen L Mathias (Author)
Jeremy J Yang (Author)
Vincent T Metzger (Author)
Vishal B Siramshetty (Author)
Dac-Trung Nguyen (Author)
Lars Juhl Jensen (Author)
Dušica Vidović (Author)
Stephan C Schürer (Author)
Jayme Holmes (Author)
Karlie R Sharma (Author)
Ajay Pillai (Author)
Cristian G Bologa (Author)
Jeremy S Edwards (Author)
Ewy A Mathé (Author)
Tudor I Oprea (Author)

External identifiers

Added

2022-11-29

Last modified

2023-01-08
Source: Validated source Crossref

Getting Started with the IDG KMC Datasets and Tools

Current Protocols
2022-01 | Journal article
Contributors: Vincent Metzger

Contributors

Vincent Metzger (Author) [ORCID: 0000-0002-8041-0370]

External identifiers

Added

2022-09-12

Last modified

2022-09-12
Source: Vincent T. Metzger

Electrostatic Channeling in P. falciparum DHFR-TS: Brownian Dynamics and Smoluchowski Modeling

Biophysical Journal
2014-11 | Journal article
Contributors: Vincent Metzger

Contributors

Vincent Metzger (Author) [ORCID: 0000-0002-8041-0370]

External identifiers

Added

2022-09-12

Last modified

2022-09-12
Source: Vincent T. Metzger

A model study of sequential enzyme reactions and electrostatic channeling

The Journal of Chemical Physics
2014-03-14 | Journal article
Contributors: Vincent Metzger

Contributors

Vincent Metzger (Author) [ORCID: 0000-0002-8041-0370]

External identifiers

Added

2022-09-12

Last modified

2022-09-12
Source: Vincent T. Metzger

Activation and dynamic network of the M2 muscarinic receptor

Proceedings of the National Academy of Sciences
2013-07-02 | Journal article
Contributors: Vincent Metzger

Contributors

Vincent Metzger (Author) [ORCID: 0000-0002-8041-0370]

External identifiers

Abstract


G-protein-coupled receptors (GPCRs) mediate cellular responses to various hormones and neurotransmitters and are important targets for treating a wide spectrum of diseases. Although significant advances have been made in structural studies of GPCRs, details of their activation mechanism remain unclear. The X-ray crystal structure of the M2 muscarinic receptor, a key GPCR that regulates human heart rate and contractile forces of cardiomyocytes, was determined recently in an inactive antagonist-bound state. Here, activation of the M2 receptor is directly observed via accelerated molecular dynamics simulation, in contrast to previous microsecond-timescale conventional molecular dynamics simulations in which the receptor remained inactive. Receptor activation is characterized by formation of a Tyr206^5.58–Tyr440^7.53 hydrogen bond and ∼6-Å outward tilting of the cytoplasmic end of transmembrane α-helix 6, preceded by relocation of Trp400^6.48 toward Phe195^5.47 and Val199^5.51 and flipping of Tyr430^7.43 away from the ligand-binding cavity. Network analysis reveals that communication in the intracellular domains is greatly weakened during activation of the receptor. Together with the finding that residue motions in the ligand-binding and G-protein-coupling sites of the apo receptor are correlated, this result highlights a dynamic network for allosteric regulation of the M2 receptor activation.

Added

2022-09-12

Last modified

2022-09-12
Source: Vincent T. Metzger

Calcium binding and allosteric signaling mechanisms for the sarcoplasmic reticulum Ca²⁺ ATPase

Protein Science
2012-10 | Journal article
Contributors: Vincent Metzger

Contributors

Vincent Metzger (Author) [ORCID: 0000-0002-8041-0370]

External identifiers

Added

2022-09-12

Last modified

2022-09-12
Source: Vincent T. Metzger

Autonomous Targeting of Infectious Superspreaders Using Engineered Transmissible Therapies

PLoS Computational Biology
2011-03-17 | Journal article
Contributors: Vincent Metzger

Contributors

Vincent Metzger (Author) [ORCID: 0000-0002-8041-0370]

External identifiers

Added

2022-09-12

Last modified

2026-01-09
Source: Vincent T. Metzger