How We Can Use Federated Data Sharing, Natural Language Processing, and Deep Phenotyping to Advance Precision Medicine
Our ability to deeply investigate the cancer genome is outpacing our ability to relate these changes to the phenotypes that they produce. Transformational change is possible, but we will need to address several fundamental challenges, including: (1) accurate phenotyping across entire populations of cancer patients, (2) sharing of clinical, imaging, and sequencing data associated with cancer biospecimens, and (3) processing of complex, high-dimensional data in combination with clinical data. In this CBIIT talk, I will share our experiences with two open-source, NCI-funded projects that are developing technology to help address these challenges:
The TIES Cancer Research Network (TCRN) is a federated network of Cancer Centers that enables collaborative access to deidentified and NLP-processed data, images, and biospecimens across all institutions. A network “trust” agreement among all TCRN institutions, together with policies for managing the network, makes it possible for investigators to easily access this large data set. TCRN is based on a scalable model that could support a national clinical data and resource sharing network for Precision Medicine.
The Cancer Deep Phenotyping project (DeepPhe) is a new collaboration with the Boston Children’s Hospital cTAKES team that focuses on developing advanced methods for phenotype extraction and representation. Expected outcomes of this project include software pipelines that process clinical documents to extract summarizations of key cancer phenotype variables over time, including stage, tumor extent, recurrence, and outcome.
Rebecca Crowley Jacobson, M.D., M.S., is a Professor of Biomedical Informatics at the University of Pittsburgh School of Medicine, with secondary faculty appointments in the University of Pittsburgh Cancer Institute, Intelligent Systems Program, and Department of Pathology. She is the Chief Information Officer for the Institute for Personalized Medicine and also Director of the National Library of Medicine-funded Graduate Training Program in Biomedical Informatics. Her research interests include federated data sharing, the development and evaluation of natural language processing systems to enable translational research, and the development and use of ontologies and knowledge-based systems. She was elected a Fellow of the American College of Medical Informatics in 2010 and is the author of over 75 manuscripts. Her work has been funded by the National Cancer Institute, National Library of Medicine, Fogarty International Center, National Center for Research Resources, and the Agency for Healthcare Research and Quality.