Digital Assets – Translational Informatics Division

FACILITIES AND OTHER RESOURCES
UNIVERSITY OF NEW MEXICO

Translational Informatics Division

The Translational Informatics Division (TID) is housed in the Innovation Discovery & Training Complex (IDTC) together with the UNM Center for Molecular Discovery (UNMCMD). TID is part of the UNM Health Sciences Center, which includes the School of Medicine, UNM Hospital, and the New Mexico Cancer Center, an NCI-designated Comprehensive Cancer Center.
Founded in 2012 by Tudor Oprea (as Division Chief) and Pope Moseley, then Chair of the Department of Internal Medicine (DoIM), TID aims to integrate translational informatics services in support of clinical and basic research within DoIM. TID members are formally trained across multiple disciplines including medicine, chemistry, biochemistry, genetics, informatics, computer science and engineering. As the largest Department in the University (over 260 Faculty, across 17 Divisions and Centers), DoIM provides direct access to clinician scientists with a variety of medical specialties and research interests. TID also maintains collaborations with UNM’s departments of Computer Science, Mathematics and Statistics, and Chemistry and Chemical Biology, and with the two National Laboratories based in New Mexico, namely Sandia National Laboratories and Los Alamos National Laboratory. This setting combines world-class biomedical research, clinical care, education and community service, and is ideally suited for translational research.

TID maintains a specially equipped 190 sq ft server room hosting the clusters and enterprise servers of the Division, and a 300 sq ft conference room. Our locally maintained and operated major cluster (Pinon) has 792 Intel Xeon compute cores and 33 nodes with 64 GB RAM each. Members of TID have access to the UNM Center for Advanced Research Computing (CARC), which hosts a TID-dedicated computing system (Synergy) and provides access to campus-wide computing clusters. CARC currently has over 3000 CPU cores and 92k NVIDIA Tesla K40M CUDA cores spanning a variety of distributed and shared-memory architectures. Online working NAS and nearline storage are provided by the Research Storage Consortium (RSC) HP x9000/7400 system (~1.5 PB raw), configured as RAID6, with an integrated tape library for data archiving. TID’s LAN is connected via 1 Gbps+ routers, with typical Internet bandwidth of ~200 Mbps. Network security features include a custom dual-DMZ architecture and industry-standard VPN access for a variety of high-performance privacy models. Each TID member is computationally well-equipped, typically with a 64-bit, 4-core workstation with 8 GB RAM, a high-end laptop, and access to local, UNM-CARC, and cloud-based servers and clusters.

TID’s software cyberinfrastructure includes a variety of free and commercial tools addressing a wide range of computational tasks. For any such tool, practical usability requires suitable hardware, prerequisite configurations, and expertise for effective use. Readily available enterprise server components include PostgreSQL, MySQL, Tomcat, and Jena. For statistics, data analysis, visualization and machine learning: R, TIBCO Spotfire, Weka, Tableau, MKS-Simca, and Mesa Analytics. For cheminformatics: ChemAxon, OpenEye, RDKit, Open Babel, Leadscope. For web development: Django, R Shiny, Lift. Operating systems in current use include CentOS, Ubuntu, SuSE, Mac OS X and Windows. Programming languages include Perl, Python, R, Java, Scala, JavaScript, PHP and C++. Other (licensed, not open-access) resources include Truven MarketScan (health informatics database), Cerner HealthFacts (de-identified electronic medical records database) and Statista.com (statistics from a variety of areas, including consumer reports and pharmaceuticals).

TID has access to a large variety of scientific enterprise hardware and software, as listed above. Highly relevant to this RFA are the digital assets we maintain, listed in Table 1 below. A variety of custom client tools provide access to online resources via programmable-web APIs (e.g. REST). The Department of Internal Medicine provides administrative and secretarial support.
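As an illustration of the programmable-web (REST) access pattern mentioned above, the following minimal Python sketch shows how a client tool might build a request URL and decode a JSON response. The base URL, resource names, and query parameters are hypothetical placeholders chosen for this example; they do not describe the actual interface of any TID resource.

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen

# Hypothetical base URL, for illustration only; not an actual TID endpoint.
BASE_URL = "https://example.org/api/v1/"


def build_url(base_url, resource, **params):
    """Join a resource path onto the base URL and append sorted query parameters."""
    url = urljoin(base_url, resource)
    if params:
        # Sorting keeps the generated URL deterministic (useful for caching and logs).
        url += "?" + urlencode(sorted(params.items()))
    return url


def fetch_json(url, opener=urlopen, timeout=30):
    """Send a GET request and decode the JSON response body.

    The opener argument is injectable so the function can be exercised
    without network access.
    """
    with opener(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `build_url(BASE_URL, "drugs", name="aspirin")` would produce the request URL, which `fetch_json` would then retrieve; real resources will differ in their paths, parameters, and response formats.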

Table 1. Digital Assets developed and maintained by TID

Resource	Category / Name	Description	Reference
DrugCentral	Database: Online drug compendium.	DrugCentral provides information on active ingredients, pharmaceutical products, drug mode of action, indications, and pharmacologic action.	Sorin Avram, Cristian G Bologa, Jayme Holmes, Giovanni Bocci, Thomas B Wilson, Dac-Trung Nguyen, Ramona Curpan, Liliana Halip, Alina Bora, Jeremy J Yang, Jeffrey Knockel, Suman Sirimulla, Oleg Ursu, Tudor I Oprea. DrugCentral 2021 supports drug discovery and repositioning. Nucleic Acids Res. 2021 Jan 8;49(D1). http://drugcentral.org
CARLSBAD	Database: Chemical bioactivity database.	CARLSBAD contains selected high-confidence bioactivities with associated protein targets, compounds, and chemical patterns, scaffolds and maximum common substructures (MCSes).	Mathias et al., The CARLSBAD database: a confederated database of chemical bioactivities. Database (Oxford). 2013; 2013:bat044. PMID: 23794735. https://datascience.unm.edu/carlsbad/
TCRD/Pharos	Database: Target Central Resource Database, exposed via Pharos, the user interface portal.	TCRD is the central resource behind the Illuminating the Druggable Genome Knowledge Management Center (IDG-KMC).	Timothy K Sheils, Stephen L Mathias, Keith J Kelleher, Vishal B Siramshetty, Dac-Trung Nguyen, Cristian G Bologa, Lars Juhl Jensen, Dušica Vidović, Amar Koleti, Stephan C Schürer, Anna Waller, Jeremy J Yang, Jayme Holmes, Giovanni Bocci, Noel Southall, Poorva Dharkar, Ewy Mathé, Anton Simeonov, Tudor I Oprea. TCRD and Pharos 2021: mining the human proteome for disease biology. Nucleic Acids Res. 2021 Jan 8. PMID: 33156327. http://pharos.nih.gov
Illuminating the Druggable Genome	Website: IDG Consortium.	The IDG website, druggablegenome.net, provides timely access to all activities and information of the IDG pilot phase.	http://druggablegenome.net
Badapple	Webapp: Scaffold promiscuity detection.	Badapple is a method for rapidly identifying likely promiscuous compounds via associated scaffolds.	Yang et al., Badapple: promiscuity patterns from noisy evidence. J Cheminform. 2016;8(29):1-14. PMID: 27239230. https://datascience.unm.edu/badapple/
iPHACE	Webapp: Integrative navigation in pharmacological space.	iPHACE explores the polypharmacology of drugs and cross-pharmacology of targets.	Garcia-Serna et al., iPHACE: integrative navigation in pharmacological space. Bioinformatics. 2010;26(7):985-6. PMID: 20156991.
Drug-Likeness	Webapp: Assess drug-likeness.	Drug-likeness filter based on molecular fragments.	Ursu O, Oprea TI. Model-Free Drug-Likeness from Fragments. J Chem Inf Model. 2010;50(8):1387-94. PMID: 20726597. http://pasilla.health.unm.edu/tomcat/drug-likeness/index.jsp
TIN-X	Webapp: Target Importance and Novelty eXplorer.	Interactive visualization tool for discovering interesting associations between diseases and potential drug targets.	http://newdrugtargets.org/ or http://ec2-54-148-116-22.us-west-2.compute.amazonaws.com/