Deep Learning Methods for Scalable Information Extraction From Path Reports: An Update from the NCI-DOE Pilot for Cancer Surveillance
Pathology reports are a primary source of information for cancer registries, which process high volumes of free-text reports annually. Extracting and coding information from these reports is a manual, labor-intensive process. In this talk we will present an update on the NCI-DOE pilot for cancer surveillance, discussing the deep learning technology that has been developed and highlighting theoretical and practical perspectives relevant to natural language processing of clinical reports. We will present benchmark studies of different deep learning architectures on various information extraction tasks and discuss their importance in supporting a comprehensive and scalable national cancer surveillance program.
Dr. Gina Tourassi is the founding Director of the Health Data Sciences Institute and Group Leader of Biomedical Sciences, Engineering and Computing at the Oak Ridge National Laboratory (ORNL). Concurrently, she holds appointments as an adjunct Professor of Radiology at Duke University and the University of Tennessee and as a joint UT-ORNL Professor of Mechanical, Aerospace, and Biomedical Engineering at the University of Tennessee at Knoxville. Her research interests include medical imaging, biomedical informatics, clinical decision support systems and data-driven biomedical discovery. Her scholarly work has led to nine U.S. patents and innovation disclosures and more than 230 peer-reviewed journal articles, conference proceedings articles, and book chapters. Her research in medical imaging has been featured in numerous high-profile publications such as the MIT Science and Technology Review, Oncology Times and the Economist. Dr. Tourassi has served as Associate Editor of the scientific journals Radiology and Neurocomputing, and as a Guest Associate Editor of Medical Physics. She is an elected Fellow of the American Institute for Medical and Biological Engineering (AIMBE), the American Association of Physicists in Medicine (AAPM) and the International Society for Optics and Photonics (SPIE). For her leadership in the Joint Design of Advanced Computing Solutions for Cancer initiative, she received the DOE Secretary’s Appreciation Award in 2016. In 2017, she received the ORNL Distinguished Researcher award and the Director’s Award for Outstanding Individual Accomplishment in Science and Technology. Dr. Tourassi holds a B.S. degree in Physics from Aristotle University of Thessaloniki, Greece, and a Ph.D. in Biomedical Engineering from Duke University.
Dr. Paul Fearn is Chief of the Surveillance Informatics Branch for the National Cancer Institute (NCI) Surveillance Research (SEER) Program, advancing applications of natural language processing, machine learning, and other informatics tools and methods to support cancer registries and cancer surveillance. Previously, he was Director of Biomedical Informatics at Fred Hutchinson Cancer Research Center and instigator of the Hutch Integrated Data Repository and Archive (HIDRA). He has been the Informatics Manager for the Department of Surgery and the Office of Strategic Planning and Innovation at Memorial Sloan-Kettering Cancer Center (MSKCC), where he initiated and led the Caisis project, an open-source system that is currently used at multiple centers. Paul has a B.A. in Spanish from the University of Houston, biostatistics training from the University of Texas School of Public Health in Houston, an M.B.A. from the New York University Stern School of Business, and a Ph.D. in Biomedical and Health Informatics from the University of Washington School of Medicine. He has more than 20 years of experience in cancer research informatics at Baylor College of Medicine, MSKCC, Fred Hutch and with the NCI SEER program.