I received my PhD in Computer Science from Duke University in 1997. In August 2014, following a faculty appointment at Montana State University Department of Computer Science, and nearly 15 years as CEO of a bioinformatics software company, Golden Helix, I became a faculty member in the University of New Mexico Center for Global Health, Division of Translational Informatics, and Department of Internal Medicine.
My research areas include clinical research informatics, bioinformatics, cheminformatics, and systems thinking. I develop and apply methods for the analysis of longitudinal healthcare data for predictive and preventative medicine. I have collaborated with other members of the Observational Health Data Sciences and Informatics (OHDSI) collaborative since its inception. The OHDSI/OMOP common data model has been adopted to represent the electronic health and/or administrative claims records of over 10% of the global population, enabling the development of a broad set of tools for the analysis of human health on these massive datasets. I am currently developing statistical and computational tools to compare treatment options and obtain better estimates of expected health outcomes despite large biases and confounding in the data, with a focus on mental illness (bipolar disorder, major depression, PTSD, suicidality).
In 2016, I received an NIH NLM R21 award to research methods for observational comparative effectiveness research, and a PCORI award to compare bipolar disorder treatments and outcomes in large-scale administrative claims data. In 2020, I received an R56 award from the NIH NIMH, followed in 2022 by an NIH NIMH R01, to investigate undiagnosed and/or unrecorded PTSD, TBI, and self-harm through machine learning, to determine the degree to which this phenomenon exists, and to examine differences in diagnosis, recording, and outcomes among patient subgroups. At the interface of bioinformatics and cheminformatics, I have co-led with Dr. Jeremy Yang the NIH-funded Common Fund Data Ecosystem (CFDE) Illuminating the Druggable Genome Data Coordinating Center since 2022.
I serve as the UNM Southwest Center for Advancing Clinical & Translational Innovation (SW CACTI) Informatics Core Lead. I hold secondary appointments in the UNM Department of Psychiatry and Behavioral Sciences and the UNM Department of Computer Science.
I inform all of my efforts through a palette of multiple systems disciplines including Theory of Constraints, System Dynamics, Requisite Organization, TRIZ, Cybernetics, and the Scientific Method.
Funding (9)
Deriving high-quality evidence from national healthcare databases to improve suicidality detection and treatment outcomes in PTSD ✓ NIH
National Institute of Mental Health (Bethesda, US)
Homepage URL: https://app.dimensions.ai/details/grant/grant.13057758
GRANT_NUMBER: R01MH129764
📄 Project Abstract (from NIH)
Post-traumatic stress disorder (PTSD) often has complex profiles of co-occurring medical conditions and is
associated with high risk of self-harm, including suicidality, which is a leading cause of death, particularly
among Veterans. There is a critical lack of advancement in PTSD pharmacotherapy, as illustrated by increased
use of off-label medications and polypharmacy (multiple drugs used simultaneously) with limited evidence on
their relative risks and benefits. Moreover, PTSD and suicidal and nonsuicidal self-harm often remain
undocumented in electronic health records (EHR). There is also poor predictability of disease outcomes since
there are frequent changes in pharmacological treatment and multiple modifying co-occurring conditions
including depression, bipolar disorder, schizophrenia, substance use disorders, traumatic brain injury, and
sleep disorders. Our long-term goal is to improve diagnostics, secondary/tertiary prevention, and treatment
outcomes of PTSD and its co-occurring conditions via enhanced EHR utilization. To achieve our objectives, we
will analyze EHR and administrative claims data from Veterans Health Administration (VHA) and non-VHA
databases, collectively covering >1.8M patients with PTSD. Specifically, we aim to: (1) Identify undetected and
uncoded co-occurring mental health phenotypes that impact PTSD outcomes using machine learning and
characterize disparities in their documentation; (2) Create robust models, accounting for biases and
co-occurring conditions, to identify clinical trajectories of PTSD decompensation/recovery in response to
time-varying treatments; and (3) Compare risk of self-harm and hospitalization among PTSD treatments using
coded and imputed phenotypes through an international network study. We will compare the effectiveness of
PTSD psychotropic monotherapies, polypharmacy, and psychotherapy to guide the choice of treatment for
improved patient outcomes. By enhancing and validating a positive-unlabeled machine learning approach
developed by our team, we will impute unrecorded/undetected mental health conditions co-occurring with
PTSD in both VHA and non-VHA databases, and characterize factors associated with documentation
disparities. We will model disease trajectories with enhanced latent class / latent trajectory analysis, focusing
on self-harm, substance use disorders, and psychiatric hospitalization in PTSD. Finally, we will perform the
largest comparative effectiveness studies to date of PTSD treatments on >100 monotherapy and
polypharmacy regimens, in addition to psychotherapy interventions, using causal models and methods for
addressing biases. These studies will provide high-quality evidence on the risk of hospitalizations and suicidal
acts/self-harm. Successful completion of these investigations will improve the quality of clinical psychiatric
decision-making, and guide improved service delivery to the Veteran and non-Veteran populations with
PTSD/TBI, and/or high risk of self-harm/suicidality.
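As a rough illustration of what positive-unlabeled (PU) learning involves, the sketch below applies one widely used generic correction (Elkan and Noto, 2008). It is not the approach developed by the project team, whose details are not given in this abstract. It assumes that a classifier has already been trained to separate recorded cases from unlabeled patients, and that recorded cases are a random sample of all true cases. The function names and the scores are hypothetical.

```python
from statistics import mean


def estimate_label_frequency(scores_of_recorded_cases):
    """Estimate c = P(recorded | truly positive).

    Under the "selected completely at random" assumption, c can be
    estimated as the average classifier score over held-out patients
    who do have the condition recorded.
    """
    return mean(scores_of_recorded_cases)


def corrected_probability(score, c):
    """Convert P(recorded | x) into an estimate of P(truly positive | x)."""
    return min(1.0, score / c)


# Hypothetical held-out scores for patients with a recorded diagnosis.
c = estimate_label_frequency([0.4, 0.5, 0.6])

# Hypothetical scores for two patients with no recorded diagnosis.
for score in (0.3, 0.7):
    print(score, corrected_probability(score, c))
```

The correction rescales the raw score because a model trained on recorded versus unrecorded labels underestimates the true probability whenever some true cases are never recorded.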
📅 Project Dates (from NIH)
End: 2026-11-30T00:00:00
Unsupervised and semi-supervised ML/AI with iterative experimentation for rapid identification of targeted alphaviral small molecules
Defense Threat Reduction Agency (VA, VA, US)
Homepage URL: https://www.usaspending.gov/award/ASST_NON_HDTRA12310005_097
GRANT_NUMBER: HDTRA12310005
Illuminating the Druggable Genome Data Coordinating Center - Engagement Plan with the CFDE ✓ NIH
Office of the Director (Bethesda, US)
Homepage URL: https://app.dimensions.ai/details/grant/grant.9411836
GRANT_NUMBER: OT2OD030546
📄 Project Abstract (from NIH)
protein- and disease-centric data types from multiple sources, integrate and harmonize them, then make them
readily available to the public; Second, adapt and scale existing technologies to unveil the function of selected
understudied members of the G-protein coupled receptor, ion channel and protein kinase families. Within the
IDG, the Knowledge Management Center (IDG-KMC) integrates data from a wide range of chemical, biological
and clinical resources, and has developed platforms that can be used to navigate understudied proteins (the
“dark genome”), and their potential contribution to specific pathologies. Specifically, the IDG KMC is creating
automated workflows to capture relevant public data for the entire proteome including manual annotations for
the IDG list, covering five major areas: genotype, phenotype, expression, structure & function, and interactions
& pathways. The IDG KMC designs, develops, implements, and updates the Target Central Repository Database
(TCRD), a protein knowledgebase. The IDG KMC also expands, improves, and maintains Pharos, the TCRD
portal, with support for automated data summaries, and active community feedback. Both TCRD and Pharos
already integrate data from three Common Fund projects: GTEx, IMPC/KOMP and LINCS. The IDG KMC
consolidates all the data generated by the Data and Resource Generation Centers (DRGCs), improving the
findability, accessibility, interoperability, and reusability (FAIRness) of these data and serving them on the Pharos portal.
The IDG program interface with the CFDE will enable hypothesis generation about novel drug targets for complex
diseases. Many other Common Fund (CF) programs produce data about genetic variants and differentially
expressed genes and proteins in the context of many complex human diseases. These genes in many cases do
not have much information about them. For example, the CF program Undiagnosed Disease Network (UDN)
identifies mutations in genes associated with undiagnosed diseases. The IDG-KMC has information from
empirical evidence and from computational predictions about the function of these genes, which are commonly
under-studied. Hence, data from the IDG-KMC can enrich the CFDE users who examine datasets that list genes
and proteins. Several IDG resources provide gene landing pages that provide unique information about genes.
These landing pages can be improved regarding FAIRness and can become a resource for the CFDE. In
addition, data collected by the DRGCs and by the R03 IDG awardees can enrich the content of the CFDE portal. In
particular, results from the R03 projects (Fig. 1) are currently not evaluated or stored in one place and are at risk
of becoming lost. The CFDE engagement will ensure that data from this investment remains available long term.
📅 Project Dates (from NIH)
End: 2027-03-22T00:00:00
Deriving high-quality evidence from national healthcare databases to improve suicidality detection and treatment outcomes in PTSD and TBI ✓ NIH
National Institute of Mental Health (Bethesda, US)
Homepage URL: https://app.dimensions.ai/details/grant/grant.9293321
GRANT_NUMBER: R56MH120826
📄 Project Abstract (from NIH)
Post-traumatic stress disorder (PTSD) has complex profiles of co-occurring medical conditions (comorbidities)
and is associated with high risk of suicide, particularly among Veterans, in which it is a leading cause of death.
There is a critical lack of advancement in PTSD pharmacotherapy, as illustrated by increased use of off-label
medications and polypharmacy (multiple drugs used simultaneously). The consequent limited evidence on the
relative risks and benefits of treatments creates a crisis in PTSD management. Moreover, PTSD and its major
comorbidities [traumatic brain injury (TBI) and suicidality] often remain undocumented in electronic health
records (EHR). There is also poor predictability of disease outcomes since there are frequent changes in
pharmacological treatment and multiple modifying comorbidities. Our long-term goal is to improve diagnostics,
secondary/tertiary prevention, and treatment outcomes of PTSD and its comorbidities via enhanced EHR
utilization. To achieve our objectives, we will analyze EHR and administrative claims data from Veterans
Administration (VA) and non-VA databases, collectively covering >2M PTSD and >2M TBI patients.
Specifically, we aim to: (1) Identify undetected PTSD, TBI, and self-harm from EHRs (using machine learning
with and without natural language processing) to guide health service improvements. (2) Predict
PTSD clinical course in the VA population through novel modeling of disease trajectories that account for
time-varying treatments and biases. (3) Compare the effectiveness of PTSD psychotropic monotherapies,
polypharmacy, and psychotherapy to guide the choice of treatment for improved patient outcomes. By
enhancing and validating a machine learning approach developed by our team, we will impute unrecorded
PTSD, TBI, and self-harm from both datasets, and characterize factors associated with documentation
disparities. We will model disease trajectories with enhanced latent class analysis, focusing on self-harm,
substance misuse, and psychiatric hospitalization in PTSD. With Local Control methodology innovations, we
will compare the risk of PTSD in veterans with and without comorbid TBI. Finally, we will perform the largest
comparative effectiveness studies (to date) of PTSD treatments on >100 monotherapy and polypharmacy
regimens plus psychotherapy interventions. These studies will provide high-quality evidence on the risk of
hospitalizations, substance misuse, and suicidal acts/self-harm. Successful completion of these investigations
will improve the quality of decision making for providers and patients, and guide improved service delivery to
the population of veterans and non-veterans with PTSD/TBI, and/or high risk of suicide.
📅 Project Dates (from NIH)
End: 2022-05-31T00:00:00
A microaggregation framework for reproducible research with observational data: addressing biases while protecting personal identities ✓ NIH
National Library of Medicine (Bethesda, US)
GRANT_NUMBER: R21LM012389
📄 Project Abstract (from NIH)
The primary objective of the current proposal is to foster efforts towards transparent and
reproducible knowledge repositories for evidence-based medicine. The wealth of healthcare
data already available in electronic health records could be better utilized to help guide
treatment choices and complement findings from randomized controlled trials. This proposal
addresses two major obstacles. The first is the challenge of deriving high-quality evidence from
observational data in the presence of biases and confounders, particularly with temporal data.
The second is that patient privacy and other concerns prevent disclosure of source data, which
hinders reproducible research -- currently there is a vast body of medical literature whose
findings guide clinical practice, yet cannot be independently scrutinized. We will address these
challenges through an innovative methodology, local control, which both corrects biases and
enables disclosure of question-specific microaggregated data to reproduce research findings
without disclosure of individual information. The key idea behind local control is to form many
homogeneous patient clusters within which one can compare alternate treatments, statistically
correcting for measured biases and confounders, analogous to a randomized block design. Our
methodology provides a unified framework for enabling open, high quality, comparative
effectiveness research by combining novel feature selection approaches, based on fractional
factorial experimental design, with advances in survival analysis, including competing risks. We
will create a public R package containing a family of methods for nonparametric bias correction
and statistical disclosure control in cross-sectional, case-control, and survival analysis settings.
Success of this research will also enable a novel model, which we term “parcelled data sharing,” to
facilitate open selective release of proprietary data sources for specific questions --
simultaneously protecting patient privacy, proprietary interests, and the public good. Our
research will contribute to the goal of evidence-based medicine being supported by national and
global knowledge bases on thousands of comparative effectiveness questions from hundreds of
millions of patients’ health records. This application supports the NLM mission by assisting in
the advancement of medical and related sciences through the dissemination and exchange of
important information to the progress of medicine and health. The specific aims are to (1)
Develop and evaluate a survival-based local control methodology for bias-corrected treatment
comparisons in time-to-event observational data; and (2) Develop and evaluate local control-
based microaggregation for reproducible research.
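The local control idea described in this abstract (many homogeneous patient clusters, with treatments compared inside each cluster) can be illustrated with a toy sketch. The Python below is not the project's R package or its clustering procedure. It assumes a single covariate (age), forms clusters by coarse binning, and uses made-up records; the names `local_treatment_differences`, `pooled_effect`, and `bin_width` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def local_treatment_differences(records, bin_width=10.0):
    """Compare treated and untreated outcomes inside each cluster.

    records: iterable of (covariate, treated, outcome) tuples.
    Returns {cluster_key: (treated_mean - untreated_mean, cluster_size)}
    for clusters that contain both treatment arms.
    """
    clusters = defaultdict(lambda: {True: [], False: []})
    for covariate, treated, outcome in records:
        key = int(covariate // bin_width)  # crude stand-in for clustering
        clusters[key][bool(treated)].append(outcome)

    results = {}
    for key, arms in clusters.items():
        if arms[True] and arms[False]:  # skip clusters with only one arm
            diff = mean(arms[True]) - mean(arms[False])
            results[key] = (diff, len(arms[True]) + len(arms[False]))
    return results


def pooled_effect(results):
    """Average the within-cluster differences, weighted by cluster size."""
    total = sum(size for _, size in results.values())
    return sum(diff * size for diff, size in results.values()) / total


# Made-up data: older patients are treated more often and fare worse overall.
records = [
    (31, True, 4.0), (33, False, 5.0), (35, False, 5.0), (38, False, 5.0),
    (62, True, 9.0), (64, True, 9.0), (66, True, 9.0), (69, False, 10.0),
]

naive = (mean(o for _, t, o in records if t)
         - mean(o for _, t, o in records if not t))
local = pooled_effect(local_treatment_differences(records))
print(naive)  # 1.5: treatment looks harmful when age is ignored
print(local)  # -1.0: treatment looks beneficial within each age cluster
```

Only the per-cluster differences and sizes are needed to reproduce the pooled estimate, which hints at how cluster-level aggregates might be shared without releasing individual records.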
📅 Project Dates (from NIH)
End: 2019-06-30T00:00:00
Longitudinal Comparative Effectiveness of Bipolar Disorder Therapies
Patient-Centered Outcomes Research Institute (Washington, US)
Homepage URL: https://grants.uberresearch.com/100006093/6bbd239d/Longitudinal-Comparative-Effectiveness-of-Bipolar-Disorder-Therapies
GRANT_NUMBER: 6bbd239d
Data Driven Prognostics
Department of Defense, Small Business Innovation Research (Washington, US)
Homepage URL: https://grants.uberresearch.com/100000005/177031/Data-Driven-Prognostics
GRANT_NUMBER: F33615-03-M-4122
Software Relating Genes to Disease and Clinical Outcomes ✓ NIH
National Institute of General Medical Sciences (Bethesda, US)
Homepage URL: https://grants.uberresearch.com/100000057/R44GM062081/Software-Relating-Genes-to-Disease-and-Clinical-Outcomes
GRANT_NUMBER: R44GM062081
📄 Project Abstract (from NIH)
DESCRIPTION (provided by applicant): The development of a software system is proposed that will combine statistical theory, computer science algorithms, and genetics expertise to take advantage of the great influx of data generated by the study of the human genome, clinical trials data and the creation of inexpensive genotyping techniques. This software will elucidate the complex relationship between drug efficacy and side effects, multiple interacting genes and environmental factors.
Our Phase I results show it is feasible to link phenotype to genotype for a list of "candidate" genes. A novel haplotype trend test has been developed to aid in finding associations across large SNP maps. Commercialization of this technique is essential for companies that intend to use large public or private SNP maps to locate genes that are associated with disease and drug safety and efficacy. Our statistical methods are expected to be successful even if the disease mechanism can differ from one person to another.
By analyzing and interpreting clinical trial data, the software will match drugs to target populations according to their specific genotype. This will enable pharmaceutical companies to create novel drugs that render maximum effectiveness and have minimum side effects, i.e. the right drug for the right person.
📅 Project Dates (from NIH)
End: 2005-12-31T00:00:00
Software Relating Genes to Disease and Clinical Outcomes ✓ NIH
National Institute of General Medical Sciences (Bethesda, US)
Homepage URL: https://grants.uberresearch.com/100000057/R43GM062081/Software-Relating-Genes-to-Disease-and-Clinical-Outcomes
GRANT_NUMBER: R43GM062081
📄 Project Abstract (from NIH)
DESCRIPTION (Applicant's abstract): The development of a software system is
proposed that will combine statistical theory, computer science algorithms, and
genetics expertise to take advantage of the great influx of data generated by
both the study of the human genome and the creation of inexpensive genotyping
techniques. This software will elucidate the complex relationship between drug
efficacy and side effects, and multiple interacting genes and environmental
factors. Preliminary results, obtained by using simulated data, indicate that
it might be feasible to link phenotype to genotype for a list of "candidate
genes." The statistical methods are expected to be successful even if the
disease mechanism can differ from one person to another. By analyzing and
interpreting clinical trial data, the software will match drugs to target
populations according to their specific genotype. This will enable
pharmaceutical companies to create novel drugs that render maximum
effectiveness and have minimum side effects, i.e. the right drug for the right
person.
PROPOSED COMMERCIAL APPLICATION:
The target markets for the research include pharmaceutical companies, CROs,
universities, and government agencies. It has good potential for commercialization
because it is expected to help create novel drugs, boost the safety of drug treatments,
save substantial resources, and make sense of complex genotype/phenotype relationships
in a clinical trials context.
📅 Project Dates (from NIH)
End: 2001-09-30T00:00:00
