I received my PhD in Computer Science from Duke University in 1997. In August 2014, following a faculty appointment at Montana State University Department of Computer Science, and nearly 15 years as CEO of a bioinformatics software company, Golden Helix, I became a faculty member in the University of New Mexico Center for Global Health, Division of Translational Informatics, and Department of Internal Medicine.
My research areas include clinical research informatics, bioinformatics, and systems thinking. I develop and apply methods for the analysis of longitudinal healthcare data for predictive and preventative medicine. I have collaborated with other members of the Observational Health Data Sciences and Informatics (OHDSI) collaborative since its inception. The OHDSI/OMOP common data model has been adopted to represent over 500M patients' electronic health and/or administrative claims records worldwide, enabling the development of a broad set of tools for the analysis of human health on these massive datasets. I am currently developing statistical and computational tools to compare treatment options and obtain better estimates of expected health outcomes despite large biases and confounding in the data, with a focus on mental illness (bipolar disorder, major depression, PTSD, suicidality) and pilot projects in human aging.
In July 2016, I received an NIH NLM R21 award to research methods for observational comparative effectiveness research, and a PCORI award to compare bipolar disorder treatments and outcomes in large-scale administrative claims data. In 2020, I received an R56 award from the NIH NIMH to investigate undiagnosed and/or unrecorded PTSD, TBI, and self-harm through machine learning, to determine the degree to which this phenomenon exists, and to examine disparities in diagnosis/recording/outcomes by patient sociodemographic factors. Recent results of this work were presented at the 2021 OHDSI Global Symposium.
In addition, I perform bioinformatics analyses of genomics datasets, with current projects in pediatric malaria and COVID-19 in collaboration with Dr. DJ Perkins in the Center for Global Health. I serve as the UNM Clinical and Translational Sciences Center (CTSC) Informatics Core Lead. I hold secondary appointments in the UNM Department of Psychiatry and Behavioral Sciences and the UNM Department of Computer Science.
I inform all of my efforts through a palette of multiple systems disciplines including Theory of Constraints, System Dynamics, Requisite Organization, TRIZ, Cybernetics, and the Scientific Method.
Funding (9)
Deriving high-quality evidence from national healthcare databases to improve suicidality detection and treatment outcomes in PTSD ✓ NIH
National Institute of Mental Health (Bethesda, US)
Homepage URL: https://app.dimensions.ai/details/grant/grant.13057758
GRANT_NUMBER: R01MH129764
📄 Project Abstract (from NIH)
Post-traumatic stress disorder (PTSD) often has complex profiles of co-occurring medical conditions and is
associated with high risk of self-harm, including suicidality, which is a leading cause of death, particularly
among Veterans. There is a critical lack of advancement in PTSD pharmacotherapy, as illustrated by increased
use of off-label medications and polypharmacy (multiple drugs used simultaneously) with limited evidence on
their relative risks and benefits. Moreover, PTSD and suicidal and nonsuicidal self-harm often remain
undocumented in electronic health records (EHR). There is also poor predictability of disease outcomes since
there are frequent changes in pharmacological treatment and multiple modifying co-occurring conditions
including depression, bipolar disorder, schizophrenia, substance use disorders, traumatic brain injury, and
sleep disorders. Our long-term goal is to improve diagnostics, secondary/tertiary prevention, and treatment
outcomes of PTSD and its co-occurring conditions via enhanced EHR utilization. To achieve our objectives, we
will analyze EHR and administrative claims data from Veterans Health Administration (VHA) and non-VHA
databases, collectively covering >1.8M patients with PTSD. Specifically, we aim to: (1) Identify undetected and
uncoded co-occurring mental health phenotypes that impact PTSD outcomes using machine learning and
characterize disparities in their documentation; (2) Create robust models, accounting for biases and
co-occurring conditions, to identify clinical trajectories of PTSD decompensation/recovery in response to
time-varying treatments; and (3) Compare risk of self-harm and hospitalization among PTSD treatments using
coded and imputed phenotypes through an international network study. We will compare the effectiveness of
PTSD psychotropic monotherapies, polypharmacy, and psychotherapy to guide the choice of treatment for
improved patient outcomes. By enhancing and validating a positive-unlabeled machine learning approach
developed by our team, we will impute unrecorded/undetected mental health conditions co-occurring with
PTSD in both VHA and non-VHA databases, and characterize factors associated with documentation
disparities. We will model disease trajectories with enhanced latent class / latent trajectory analysis, focusing
on self-harm, substance use disorders, and psychiatric hospitalization in PTSD. Finally, we will perform the
largest comparative effectiveness studies to date of PTSD treatments on >100 monotherapy and
polypharmacy regimens, in addition to psychotherapy interventions, using causal models and methods for
addressing biases. These studies will provide high-quality evidence on the risk of hospitalizations and suicidal
acts/self-harm. Successful completion of these investigations will improve the quality of clinical psychiatric
decision-making, and guide improved service delivery to the Veteran and non-Veteran populations with
PTSD/TBI, and/or high risk of self-harm/suicidality.
📅 Project Dates (from NIH)
End: 2026-11-30
Unsupervised and semi-supervised ML/AI with iterative experimentation for rapid identification of targeted alphaviral small molecules
Defense Threat Reduction Agency (VA, VA, US)
Homepage URL: https://www.usaspending.gov/award/ASST_NON_HDTRA12310005_097
GRANT_NUMBER: HDTRA12310005
Illuminating the Druggable Genome Data Coordinating Center - Engagement Plan with the CFDE ✓ NIH
Office of the Director (Bethesda, US)
Homepage URL: https://app.dimensions.ai/details/grant/grant.9411836
GRANT_NUMBER: OT2OD030546
📄 Project Abstract (from NIH)
protein- and disease-centric data types from multiple sources, integrate and harmonize them, then make them
readily available to the public; Second, adapt and scale existing technologies to unveil the function of selected
understudied members of the G-protein coupled receptor, ion channel and protein kinase families. Within the
IDG, the Knowledge Management Center (IDG-KMC) integrates data from a wide range of chemical, biological
and clinical resources, and has developed platforms that can be used to navigate understudied proteins (the
“dark genome”), and their potential contribution to specific pathologies. Specifically, the IDG KMC is creating
automated workflows to capture relevant public data for the entire proteome including manual annotations for
the IDG list, covering five major areas: genotype, phenotype, expression, structure & function, and interactions
& pathways. The IDG KMC designs, develops, implements, and updates the Target Central Repository Database
(TCRD), a protein knowledgebase. The IDG KMC also expands, improves, and maintains Pharos, the TCRD
portal, with support for automated data summaries, and active community feedback. Both TCRD and Pharos
already integrate data from three Common Fund projects: GTEx, IMPC/KOMP and LINCS. The IDG KMC
consolidates all the data generated by the Data and Resource Generation Centers (DRGCs), improving the
findability, accessibility, interoperability, and reusability (FAIRness) of these data and serving them on the Pharos portal.
The IDG program interface with the CFDE will enable hypothesis generation about novel drug targets for complex
diseases. Many other Common Fund (CF) programs produce data about genetic variants and differentially
expressed genes and proteins in the context of many complex human diseases. These genes in many cases do
not have much information about them. For example, the CF program Undiagnosed Disease Network (UDN)
identifies mutations in genes associated with undiagnosed diseases. The IDG-KMC has information from
empirical evidence and from computational predictions about the function of these genes, which are commonly
under-studied. Hence, data from the IDG-KMC can enrich the CFDE users who examine datasets that list genes
and proteins. Several IDG resources provide gene landing pages that provide unique information about genes.
These landing pages can be improved regarding FAIRness and can become a resource for the CFDE. In
addition, data collected by the DRGCs and by the R03 IDG awardees can enrich the content of the CFDE portal. In
particular, results from the R03 projects (Fig. 1) are currently not evaluated or stored in one place and are at risk
of becoming lost. The CFDE engagement will ensure that data from this investment remains available long term.
📅 Project Dates (from NIH)
End: 2027-03-22
Deriving high-quality evidence from national healthcare databases to improve suicidality detection and treatment outcomes in PTSD and TBI ✓ NIH
National Institute of Mental Health (Bethesda, US)
Homepage URL: https://app.dimensions.ai/details/grant/grant.9293321
GRANT_NUMBER: R56MH120826
📄 Project Abstract (from NIH)
Post-traumatic stress disorder (PTSD) has complex profiles of co-occurring medical conditions (comorbidities)
and is associated with high risk of suicide, particularly among Veterans, in which it is a leading cause of death.
There is a critical lack of advancement in PTSD pharmacotherapy, as illustrated by increased use of off-label
medications and polypharmacy (multiple drugs used simultaneously). The consequent limited evidence on the
relative risks and benefits of treatments creates a crisis in PTSD management. Moreover, PTSD and its major
comorbidities [traumatic brain injury (TBI) and suicidality] often remain undocumented in electronic health
records (EHR). There is also poor predictability of disease outcomes since there are frequent changes in
pharmacological treatment and multiple modifying comorbidities. Our long-term goal is to improve diagnostics,
secondary/tertiary prevention, and treatment outcomes of PTSD and its comorbidities via enhanced EHR
utilization. To achieve our objectives, we will analyze EHR and administrative claims data from Veterans
Administration (VA) and non-VA databases, collectively covering >2M PTSD and >2M TBI patients.
Specifically, we aim to: (1) Identify undetected PTSD, TBI, and self-harm from EHRs (using machine learning
with and without natural language processing) to guide health service improvements; (2) Predict
PTSD clinical course in the VA population through novel modeling of disease trajectories that account for
time-varying treatments and biases; and (3) Compare the effectiveness of PTSD psychotropic monotherapies,
polypharmacy, and psychotherapy to guide the choice of treatment for improved patient outcomes. By
enhancing and validating a machine learning approach developed by our team, we will impute unrecorded
PTSD, TBI, and self-harm from both datasets, and characterize factors associated with documentation
disparities. We will model disease trajectories with enhanced latent class analysis, focusing on self-harm,
substance misuse, and psychiatric hospitalization in PTSD. With Local Control methodology innovations, we
will compare the risk of PTSD in veterans with and without comorbid TBI. Finally, we will perform the largest
comparative effectiveness studies (to date) of PTSD treatments on >100 monotherapy and polypharmacy
regimens plus psychotherapy interventions. These studies will provide high-quality evidence on the risk of
hospitalizations, substance misuse, and suicidal acts/self-harm. Successful completion of these investigations
will improve the quality of decision making for providers and patients, and guide improved service delivery to
the population of veterans and non-veterans with PTSD/TBI, and/or high risk of suicide.
📅 Project Dates (from NIH)
End: 2022-05-31
A microaggregation framework for reproducible research with observational data: addressing biases while protecting personal identities ✓ NIH
National Library of Medicine (Bethesda, US)
GRANT_NUMBER: R21LM012389
📄 Project Abstract (from NIH)
The primary objective of the current proposal is to foster efforts towards transparent and
reproducible knowledge repositories for evidence-based medicine. The wealth of healthcare
data already available in electronic health records could be better utilized to help guide
treatment choices and complement findings from randomized controlled trials. This proposal
addresses two major obstacles. The first is the challenge of deriving high-quality evidence from
observational data in the presence of biases and confounders, particularly with temporal data.
The second is that patient privacy and other concerns prevent disclosure of source data, which
hinders reproducible research -- currently there is a vast body of medical literature whose
findings guide clinical practice, yet cannot be independently scrutinized. We will address these
challenges through an innovative methodology, local control, which both corrects biases and
enables disclosure of question-specific microaggregated data to reproduce research findings
without disclosure of individual information. The key idea behind local control is to form many
homogeneous patient clusters within which one can compare alternate treatments, statistically
correcting for measured biases and confounders, analogous to a randomized block design. Our
methodology provides a unified framework for enabling open, high quality, comparative
effectiveness research by combining novel feature selection approaches, based on fractional
factorial experimental design, with advances in survival analysis, including competing risks. We
will create a public R package containing a family of methods for nonparametric bias correction
and statistical disclosure control in cross-sectional, case-control, and survival analysis settings.
Success of this research will also enable a novel model, which we term “parcelled data sharing,” to
facilitate open selective release of proprietary data sources for specific questions --
simultaneously protecting patient privacy, proprietary interests, and the public good. Our
research will contribute to the goal of evidence-based medicine being supported by national and
global knowledge bases on thousands of comparative effectiveness questions from hundreds of
millions of patients’ health records. This application supports the NLM mission by assisting in
the advancement of medical and related sciences through the dissemination and exchange of
important information to the progress of medicine and health. The specific aims are to (1)
Develop and evaluate a survival-based local control methodology for bias-corrected treatment
comparisons in time-to-event observational data; and (2) Develop and evaluate local control-based
microaggregation for reproducible research.
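The clustering idea described in the abstract above can be illustrated with a minimal Python sketch. This is not the project's R package or its actual algorithm: the function name `local_control_estimate`, the `bin_width` parameter, and the toy records are all hypothetical, and simple covariate binning stands in for the clustering step. It only shows the general pattern of comparing treatments inside homogeneous clusters and then pooling the within-cluster differences.

```python
from collections import defaultdict
from statistics import mean


def local_control_estimate(records, bin_width=10.0):
    """Pool within-cluster treatment differences (illustrative only).

    records: iterable of (covariates, treated, outcome) tuples.
    Clusters are formed by binning each covariate (an assumption made
    for brevity; any clustering of covariate space could be used).
    Returns (size-weighted mean difference, per-cluster differences).
    """
    clusters = defaultdict(lambda: {True: [], False: []})
    for covariates, treated, outcome in records:
        key = tuple(int(x // bin_width) for x in covariates)
        clusters[key][bool(treated)].append(outcome)

    local_diffs = {}
    weighted_sum = 0.0
    total_weight = 0
    for key, arms in clusters.items():
        if not arms[True] or not arms[False]:
            continue  # cluster holds only one treatment: no comparison possible
        diff = mean(arms[True]) - mean(arms[False])
        size = len(arms[True]) + len(arms[False])
        local_diffs[key] = diff
        weighted_sum += diff * size
        total_weight += size
    if total_weight == 0:
        raise ValueError("no cluster contains both treatments")
    return weighted_sum / total_weight, local_diffs


# Toy data: (age,) is the only covariate. Older patients are treated more
# often and have worse outcomes, so the naive comparison is confounded.
records = [
    ((35.0,), False, 1.0), ((36.0,), False, 1.0),
    ((38.0,), False, 1.0), ((37.0,), True, 2.0),
    ((72.0,), True, 6.0), ((75.0,), True, 6.0),
    ((78.0,), True, 6.0), ((74.0,), False, 5.0),
]

naive = (mean(o for _, t, o in records if t)
         - mean(o for _, t, o in records if not t))
estimate, local_diffs = local_control_estimate(records)
print(naive, estimate)  # naive difference is 3.0; within-cluster estimate is 1.0
```

In this toy example the naive treated-versus-untreated difference is inflated by age, while the difference within each age cluster is the same, which is the analogy to a randomized block design that the abstract draws.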
📅 Project Dates (from NIH)
End: 2019-06-30
Longitudinal Comparative Effectiveness of Bipolar Disorder Therapies
Patient-Centered Outcomes Research Institute (Washington, US)
Homepage URL: https://grants.uberresearch.com/100006093/6bbd239d/Longitudinal-Comparative-Effectiveness-of-Bipolar-Disorder-Therapies
GRANT_NUMBER: 6bbd239d
Data Driven Prognostics
Department of Defense, Small Business Innovation Research (Washington, US)
Homepage URL: https://grants.uberresearch.com/100000005/177031/Data-Driven-Prognostics
GRANT_NUMBER: F33615-03-M-4122
Software Relating Genes to Disease and Clinical Outcomes ✓ NIH
National Institute of General Medical Sciences (Bethesda, US)
Homepage URL: https://grants.uberresearch.com/100000057/R44GM062081/Software-Relating-Genes-to-Disease-and-Clinical-Outcomes
GRANT_NUMBER: R44GM062081
📄 Project Abstract (from NIH)
DESCRIPTION (provided by applicant): The development of a software system is proposed that will combine statistical theory, computer science algorithms, and genetics expertise to take advantage of the great influx of data generated by the study of the human genome, clinical trials data and the creation of inexpensive genotyping techniques. This software will elucidate the complex relationship between drug efficacy and side effects, multiple interacting genes and environmental factors.
Our Phase I results show it is feasible to link phenotype to genotype for a list of "candidate" genes. A novel haplotype trend test has been developed to aid in finding associations across large SNP maps. Commercialization of this technique is essential for companies that intend to use large public or private SNP maps to locate genes that are associated with disease and drug safety and efficacy. Our statistical methods are expected to be successful even if the disease mechanism can differ from one person to another.
By analyzing and interpreting clinical trial data, the software will match drugs to target populations according to their specific genotype. This will enable pharmaceutical companies to create novel drugs that render maximum effectiveness and have minimum side effects, i.e. the right drug for the right person.
📅 Project Dates (from NIH)
End: 2005-12-31
Software Relating Genes to Disease and Clinical Outcomes ✓ NIH
National Institute of General Medical Sciences (Bethesda, US)
Homepage URL: https://grants.uberresearch.com/100000057/R43GM062081/Software-Relating-Genes-to-Disease-and-Clinical-Outcomes
GRANT_NUMBER: R43GM062081
📄 Project Abstract (from NIH)
DESCRIPTION (Applicant's abstract): The development of a software system is
proposed that will combine statistical theory, computer science algorithms, and
genetics expertise to take advantage of the great influx of data generated by
both the study of the human genome and the creation of inexpensive genotyping
techniques. This software will elucidate the complex relationship between drug
efficacy and side effects, and multiple interacting genes and environmental
factors. Preliminary results, obtained by using simulated data, indicate that
it might be feasible to link phenotype to genotype for a list of "candidate
genes." The statistical methods are expected to be successful even if the
disease mechanism can differ from one person to another. By analyzing and
interpreting clinical trial data, the software will match drugs to target
populations according to their specific genotype. This will enable
pharmaceutical companies to create novel drugs that render maximum
effectiveness and have minimum side effects, i.e. the right drug for the right
person.
PROPOSED COMMERCIAL APPLICATION:
The target markets for the research include pharmaceutical companies, CROs,
universities, and government agencies. It has good potential for commercialization
because it is expected to help create novel drugs, boost the safety of drug treatments,
save substantial resources, and make sense of complex genotype/phenotype relationships
in a clinical trials context.
📅 Project Dates (from NIH)
End: 2001-09-30
Education and qualifications (4)
Duke University: Durham, NC, US
Duke University: Durham, North Carolina, US
Montana State University: Bozeman, MT, US
University of Calgary: Calgary, AB, CA
Badapple 2.0: An Empirical Predictor of Compound Promiscuity, Updated, Modernized, and Enhanced for Explainability
KG2ML: Integrating Knowledge Graphs and Positive Unlabeled Learning for Identifying Disease-Associated Genes
Detecting Opioid Use Disorder in Health Claims Data With Positive Unlabeled Learning.
Environment scan of generative AI infrastructure for clinical and translational science.
Positive Unlabeled Learning Selected Not At Random (PULSNAR): class proportion estimation without the selected completely at random assumption
Transcriptomic and Proteomic Insights into Host Immune Responses in Pediatric Severe Malarial Anemia: Dysregulation in HSP60-70-TLR2/4 Signaling and Altered Glutamine Metabolism
Human NCR3 gene variants rs2736191 and rs11575837 alter longitudinal risk for development of pediatric malaria episodes and severe malarial anemia.
Disproportionate impact of COVID-19 severity and mortality on hospitalized American Indian/Alaska Native patients
Entire Expressed Peripheral Blood Transcriptome in Pediatric Severe Malarial Anemia
Toxicology knowledge graph for structural birth defects.
Positive Unlabeled Learning Selected Not At Random (PULSNAR): class proportion estimation when the SCAR assumption does not hold
Genetic variation in CSF2 (5q31.1) is associated with longitudinal susceptibility to pediatric malaria, severe malarial anemia, and all-cause mortality in a high-burden malaria and HIV region of Kenya.
Elevated SARS-CoV-2 in peripheral blood and increased COVID-19 severity in American Indians/Alaska Natives.
A Comprehensive COVID-19 Daily News and Medical Literature Briefing to Inform Health Care and Policy in New Mexico: Implementation Study.
Ingestion of hemozoin by peripheral blood mononuclear cells alters temporal gene expression of ubiquitination processes.
Complement component 3 mutations alter the longitudinal risk of pediatric malaria and severe malarial anemia.
TIGA: target illumination GWAS analytics.
Using Machine Learning Imputed Outcomes to Assess Drug-Dependent Risk of Self-Harm in Patients with Bipolar Disorder: A Comparative Effectiveness Study
Abstract
Background:
Incomplete suicidality coding in administrative claims data is a known obstacle for observational studies. With most of the negative outcomes missing from the data, it is challenging to assess the evidence on treatment strategies for the prevention of self-harm in bipolar disorder (BD), including pharmacotherapy and psychotherapy. There are conflicting data from studies on the drug-dependent risk of self-harm, and there is major uncertainty regarding the preventive effect of monotherapy and drug combinations.
Objective:
The aim of this study was to compare all commonly used BD pharmacotherapies, as well as psychotherapy for the risk of self-harm, in a large population of commercially insured individuals, using self-harm imputation to overcome the known limitations of this outcome being underrecorded within US electronic health care records.
Methods:
The IBM MarketScan administrative claims database was used to compare self-harm risk in patients with BD following 65 drug regimens and drug-free periods. Probable but uncoded self-harm events were imputed via machine learning, with different probability thresholds examined in a sensitivity analysis. Comparators included lithium, mood-stabilizing anticonvulsants (MSAs), second-generation antipsychotics (SGAs), first-generation antipsychotics (FGAs), and five classes of antidepressants. Cox regression models with time-varying covariates were built for individual treatment regimens and for any pharmacotherapy with or without psychosocial interventions (“psychotherapy”).
Results:
Among 529,359 patients, 1.66% (n=8813 events) had imputed and/or coded self-harm following the exposure of interest. A higher self-harm risk was observed during adolescence. After multiple testing adjustment (P≤.012), the following six regimens had higher risk of self-harm than lithium: tri/tetracyclic antidepressants + SGA, FGA + MSA, FGA, serotonin-norepinephrine reuptake inhibitor (SNRI) + SGA, lithium + MSA, and lithium + SGA (hazard ratios [HRs] 1.44-2.29), and the following nine had lower risk: lamotrigine, valproate, risperidone, aripiprazole, SNRI, selective serotonin reuptake inhibitor (SSRI), “no drug,” bupropion, and bupropion + SSRI (HRs 0.28-0.74). Psychotherapy alone (without medication) had a lower self-harm risk than no treatment (HR 0.56, 95% CI 0.52-0.60; P=8.76×10^-58). The sensitivity analysis showed that the direction of drug-outcome associations did not change as a function of the self-harm probability threshold.
Conclusions:
Our data support evidence on the effectiveness of antidepressants, MSAs, and psychotherapy for self-harm prevention in BD.
Trial Registration:
ClinicalTrials.gov NCT02893371; https://clinicaltrials.gov/ct2/show/NCT02893371
JMIR Ment Health 2021;8(4):e24522
