Funding (4)
Using the literature to build causal models of retrospective observational data ✓ NIH
United States National Library of Medicine (Bethesda, Maryland, US)
Homepage URL: https://reporter.nih.gov/search/agkD0LqiWk-OD1wThSJt3Q/project-details/10879451
GRANT_NUMBER: R00LM013367
📄 Project Abstract (from NIH)
records (EHRs), allow for the identification of links between health events, such as drug exposures and side-
effects. Some of these links indicate stable dependencies deemed as causes. Causal insight allows reverse-
engineering of disease. If confounding is not addressed, it will be difficult to distinguish causative from correlative
links. Our approach is to identify confounders explicitly. Graphical causal models (GCMs) can discover
causal links from data and prior knowledge. GCMs summarize causal links between variables. Automated
selection of variables would allow GCMs to scale and yield more insight from data. Literature-based discovery
(LBD) methods were developed to identify links between concepts in the literature. Advanced methods permit
the search for concepts linked to each other through specific verbs, e.g., “causes”, “treats”. Our hypothesis is
that we can exploit structured knowledge extracted from the literature to inform GCMs. In prior work, we found
that LBD + GCM was better at identifying side-effects in EHR data than traditional methods. Compared to
methods which use solely data, we hypothesize that our method will increase the ability to detect causal
relationships from EHR data. The first aim is to determine the extent to which LBD-informed GCM improves the
identification of causal links for drug safety. We will build LBD-informed GCMs using publicly available
reference datasets for drug safety. These reference datasets contain drug/side-effect pairs for performance
benchmarking. (A) Test the ability of GCM algorithms to identify known causal links solely using data. We will
systematically evaluate GCM algorithms based on their ability to re-discover causal links in a reference
standard. Results will guide our studies on how GCM can be tuned. (B) Determine the effect of adding different
subsets of LBD-derived information to GCMs on the identification of drug side-effects. We will build causal models
using increasing numbers of confounders. The second aim is to test the ability of LBD built with disease-specific
literature to improve the relevance of LBD-derived confounders for Alzheimer's Disease (AD). We chose AD for
its high prevalence and relative lack of effective pharmacologic treatment. (A) Compare LBD strategies in a
disease-specific setting. We will test LBD variants using disease-specific literature or with LBD lacking subject-
matter restrictions. (B) Define the ability of robust LBD-informed GCM to validate drug repurposing candidates
for treating AD symptoms. We will test the ability of advanced methods to iteratively resolve hidden latent
confounding, when detected, to improve effect estimates. The fulfillment of these aims will yield new methods
to combine insights from the literature with causal modeling to uncover causal relationships of drug exposures
on adverse events and on beneficial outcomes.
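The abstract's idea of searching the literature for concepts linked through specific verbs, then using them as explicit confounders, can be illustrated with a minimal sketch. It assumes the extracted knowledge is available as (subject, predicate, object) triples and treats any concept asserted to cause both the exposure and the outcome as a candidate confounder. The function name `candidate_confounders`, the `CAUSES` predicate label, and the toy triples are hypothetical illustrations, not the project's actual pipeline or data.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# Assumed representation: one literature-derived assertion per triple.
Triple = Tuple[str, str, str]  # (subject, predicate, object)


def candidate_confounders(
    triples: Iterable[Triple],
    exposure: str,
    outcome: str,
    predicates: FrozenSet[str] = frozenset({"CAUSES"}),
) -> Set[str]:
    """Return concepts asserted to cause both the exposure and the outcome."""
    triples = list(triples)  # allow generators, since we scan twice
    causes_of_exposure = {s for s, p, o in triples if p in predicates and o == exposure}
    causes_of_outcome = {s for s, p, o in triples if p in predicates and o == outcome}
    # A common cause of exposure and outcome is a candidate confounder.
    return (causes_of_exposure & causes_of_outcome) - {exposure, outcome}


# Hypothetical toy triples, invented for illustration only.
toy_triples = [
    ("depression", "CAUSES", "antidepressant use"),
    ("depression", "CAUSES", "insomnia"),
    ("antidepressant use", "CAUSES", "insomnia"),
    ("caffeine", "CAUSES", "insomnia"),
]

print(sorted(candidate_confounders(toy_triples, "antidepressant use", "insomnia")))
```

In this toy setting only "depression" is linked to both the exposure and the outcome, so it alone would be passed on as a variable for the causal model; "caffeine" is linked to the outcome only and is excluded.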
📅 Project Dates (from NIH)
End: 2026-07-31
Using the literature to build causal models of retrospective observational data ✓ NIH
United States National Library of Medicine (Bethesda, US)
Homepage URL: https://app.dimensions.ai/details/grant/grant.9844339
GRANT_NUMBER: K99LM013367
📅 Project Dates (from NIH)
End: 2023-07-31
Using Biomedical Knowledge to Identify Plausible Signals for Pharmacovigilance
United States National Library of Medicine (n/a, US)
GRANT_NUMBER: grant.R01LM011563
NLM Training Program in Biomedical Informatics & Data Science for Predoctoral and Postdoctoral Fellows
United States National Library of Medicine (n/a, US)
GRANT_NUMBER: grant.T15LM007093