New “big omics” supercomputer will speed up solutions; insights will lead to advances in a wide range of complex diseases
Mount Sinai Health System, one of New York’s largest integrated health systems, announced that it has been awarded a grant from the U.S. Department of Health and Human Services valued at $2 million to build a second “big omics data engine” (BODE 2) that will enable Mount Sinai researchers to explore more complex scientific questions more quickly.
“Supercomputers have become essential in biomedical scientific discovery, and Mount Sinai has been a leader on this front, making investments in computational and data science that are advancing our understanding of and ability to treat complex diseases,” said Dennis S. Charney, MD, Anne and Joel Ehrenkranz Dean, Icahn School of Medicine at Mount Sinai, and President for Academic Affairs, Mount Sinai Health System. “With BODE 2, we are renewing our commitment to push the boundaries of scientific research, tackle questions that we did not previously have the computational power to take on, and achieve breakthroughs that transform clinical care worldwide.”
BODE 2 will launch at the end of this year and will replace BODE, a previous supercomputer that was used by 61 basic and translational researchers at Mount Sinai representing more than $100 million in NIH funding, along with their collaborators at 75 external institutions. BODE enabled scientific findings that appeared in more than 167 publications, including papers in Nature and Science, with a total of 2,427 citations in three years.
“BODE has proven to be a vital tool for groundbreaking research across a broad range of fields, but as Mount Sinai’s faculty continues to grow, there is concurrent growth in research initiatives, necessitating investment in a new supercomputer that has sufficient computational throughput and storage space to support this activity,” said Patricia Kovatch, Senior Associate Dean for Scientific Computing and Data Science at the Icahn School of Medicine at Mount Sinai, member of the Icahn Institute for Data Science and Genomic Technology, and Associate Professor of Genetics and Genomic Sciences, and Pharmacological Sciences.
The new BODE 2 supercomputer is a Lenovo ThinkSystem SR360 that consists of 3,840 Intel Cascade Lake cores running at 2.6 GHz, with 15 terabytes of memory, 14 petabytes of raw storage, and 11 petabytes of usable storage. It will produce approximately 28 million core compute hours per year and will have a peak speed of 220 teraflops, approximately double that of BODE. Researchers will have broad, user-friendly, integrated access to more diverse data sources with robust, secure, bidirectional information flow between research and point-of-care programs. BODE 2 will also enable innovative application of translational bioinformatics research and data-driven medicine.
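For readers who want to relate the core count to the annual compute figure, the arithmetic can be sketched as follows. This is only an illustrative calculation: the core count and the 28 million figure come from the announcement, while the 8,760-hour year (non-leap, round-the-clock operation) and the function names are assumptions made here, not details supplied by Mount Sinai.

```python
# Illustrative arithmetic only; not an official capacity model.
CORES = 3_840                    # core count stated in the announcement
QUOTED_CORE_HOURS = 28_000_000   # "approximately 28 million" per year, as stated
HOURS_PER_YEAR = 365 * 24        # assumption: non-leap year, machine never idle


def max_core_hours(cores: int, hours: int = HOURS_PER_YEAR) -> int:
    """Upper bound on core-hours if every core ran for every hour of the year."""
    return cores * hours


def implied_utilization(quoted_core_hours: float, cores: int) -> float:
    """Fraction of the theoretical ceiling that the quoted figure represents."""
    return quoted_core_hours / max_core_hours(cores)


if __name__ == "__main__":
    ceiling = max_core_hours(CORES)
    share = implied_utilization(QUOTED_CORE_HOURS, CORES)
    print(f"Theoretical ceiling: {ceiling:,} core-hours per year")
    print(f"Quoted figure as a share of the ceiling: {share:.1%}")
```

Under these assumptions the ceiling is about 33.6 million core-hours, so the quoted 28 million corresponds to roughly 83 percent of it. That percentage is an inference from the published numbers, plausibly reflecting maintenance windows and scheduling gaps, and is not a figure given in the announcement.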
“Based on our experiences with BODE, BODE 2 is designed to provide our researchers and clinicians, and their external partners in Mount Sinai-led national research projects, with the necessary infrastructure to achieve faster results for greater scientific throughput, increased fidelity in their simulations and analysis, and seamless migration of research applications to the software environment for enhanced scientific productivity,” Ms. Kovatch said. “Computing capability of this size and speed is not available widely, and Mount Sinai’s investment in building this infrastructure will translate into more robust genetics and population analysis, gene expression, machine learning, and structural and chemical biology investigations, and result in new insights and advances in a wide range of diseases including Alzheimer’s, autism, influenza, prostate cancer, schizophrenia, and substance use disorders.”
Research projects that will be facilitated by BODE 2 include:
Understanding the Mechanism of SPI1-Dependent Alzheimer Disease Risk: BODE 2 will provide both the necessary storage for whole-genome-sequencing data sets from more than 10,000 study subjects and the processing power (approximately 12 million compute hours) to analyze the data using machine learning techniques. The analyses will be used to enhance current treatments or explore new therapies.
The Trans-Omics for Precision Medicine (TOPMed) Program: BODE 2 will provide the 1.75 petabytes of storage necessary for the whole-genome-sequencing data, other omics, and molecular, behavioral, imaging, environmental, and clinical data for this unprecedented exploration of the biological causes underlying heart, blood, lung, and sleep disorders. It will also provide the hundreds of terabytes required for intermediate results storage, and the approximately 7 million compute hours necessary for the highest-powered analysis of the TOPMed data. These processes can be greatly accelerated on BODE 2 versus running them on standalone machines.
“This new supercomputer will enable us to mine deep databases of genomic and clinical information using machine-learning approaches to propel the personalized medicine of today into better medicine tomorrow,” said Eimear Kenny, PhD, Associate Professor of Medicine (General Internal Medicine), and Genetics and Genomic Sciences, at the Icahn School of Medicine at Mount Sinai, Director of the Center for Genomic Health, and a Principal Investigator of the TOPMed Program. “The technology will help fuel innovative research programs to further our understanding of disease progression and management.”