Cancer Data Science Pulse

Data Standards

The world of clinical research standards, including the BRIDG Model that bridges research and healthcare, would truly not be what it is today without the significant and selfless contributions of Dr. Edward Helton.

Biomedical knowledge is typically organized around a variety of biological entity types, such as genes, genetic variants, drugs, and diseases. Collectively, we refer to them as "BioThings." The volume of biomedical data has grown explosively, thanks to the efforts of many different researchers and consortia. This growth spans many types of data in many different formats and standards, making it difficult to unify the disparate sources.

In an era of unprecedented growth in the size and variety of data sets and the number of software tools, there is an ever-increasing need for frameworks that connect and integrate data and tools within a secure and easy-to-use research ecosystem.

Broad and equitable data sharing can be interpreted in many ways. For NCI's Office of Data Sharing, it means balancing support for exciting science and innovation, and for the needs of research and participant communities, with privacy and realistic expectations. This balance is possible when the policies we create acknowledge the benefits and challenges that the public, research, and participant communities experience as they share their information to advance disease knowledge and improve healthcare.

Dr. Jaime M. Guidry Auvil serves as the director of the newly launched NCI Office of Data Sharing (ODS). Headquartered at the Center for Biomedical Informatics and Information Technology, ODS is creating a comprehensive data sharing vision and strategy for NCI and the cancer research community.

I recently joined NCI to help support strategic data sharing and informatics projects within the Center for Biomedical Informatics and Information Technology (CBIIT). Having worked on information management at another Institute for five years and on the trans-NIH Big Data to Knowledge (BD2K) initiative since its inception, I see this as an exciting opportunity to continue enhancing data science across the biomedical community.

The NCI Data Catalog is a consolidated listing of the publicly available data collections produced by NCI initiatives.