Cancer Data Science Pulse
Your Guide to NCI Data Science Resources for Breast Cancer Research
Find Data to Support Your Breast Cancer Research
- Genomic Data Commons
  - Offers more than 9,000 breast cancer cases with more than 80,000 files.
- Proteomic Data Commons
  - Includes 10 breast cancer studies covering more than 300 cases.
- Imaging Data Commons
  - Contains nearly 15,000 public breast cancer cases.
- Human Tumor Atlas Network
  - Has three different breast cancer atlases in the network.
Get Tools You Can Use
- Search biomarkers, data sets, and collections with the Early Detection Research Network.
- Request access to controlled data through the database of Genotypes and Phenotypes, which has 82 breast cancer studies with 112 data sets.
- Download medical images of breast cancer through The Cancer Imaging Archive.
- Keep an eye out for the Confluence Project, which is developing a large research resource to uncover breast cancer genetics through genome-wide association studies. Data from the project is set to be available for request in 2023.
Keep Up with NCI’s Data Science Contributions to Breast Cancer Research
Subscribe to receive weekly NCI Data Science updates in your inbox, including upcoming events, the latest blog posts, and recent news releases.
- Read how researchers studied resistance to chemotherapy in triple-negative breast cancer using NIH’s database of Genotypes and Phenotypes and the Proteomic Data Commons.
- Listen to the recording of Dr. Maryellen L. Giger at the April 2022 NCI Imaging and Informatics Community Webinar. She spoke on the development, validation, database needs, and future implementation of artificial intelligence in the clinical radiology workflow, including case studies of breast cancer and COVID-19.
- Explore two breast cancer imaging data collections that The Cancer Imaging Archive has released and made publicly available. The collections are affiliated with the I-SPY Trial, an ongoing clinical trial.
- Download 1,400 files of open-access proteomic data related to breast, ovarian, and pediatric cancer studies.
Read a CBIIT Spotlight
Daoud Meerzaman, Ph.D., the Computational Genomics and Bioinformatics Branch chief for CBIIT, contributed to a study focused on bioinformatic benchmarks for genetic variants in two distinct cell lines linked to breast cancer. The study used multiple next-generation sequencing data sets to detect and confirm germline and somatic variants in those cell lines.
We’ve reached out to Dr. Meerzaman since the 2021 publication, and he tells us the paper has been cited more than 100 times.
“Like other cancer types, breast cancer is a disease that involves gene mutations. Robust and accurate mutation caller algorithms are of critical importance. Benchmarking of newer algorithms is required to identify and implement the best practices in the identification of mutations. This is not only crucial for computational scientists to identify mutation callers that are accurate and trustworthy, but it is more critical for clinicians to trust these mutations because, eventually, physicians base their treatment decision using these mutations.”
“Previous research projects focused on using a single variable such as DNA, RNA, or protein. Fortunately, the paradigm is shifting, and scientists are using artificial intelligence and machine learning to carry out integrative research approaches utilizing radiological and pathological images with genomic variation to improve cancer diagnosis and treatment,” he adds.
Dr. Meerzaman also works with the Applied Proteogenomics OrganizationaL Learning and Outcomes (APOLLO) network, a collaboration between NCI, the Department of Defense, and the Department of Veterans Affairs. APOLLO’s first phase is focused on the full proteogenomic profiling of cancers of the lung, ovary, endometrium, prostate, and breast.
The APOLLO network is featured in NCI’s Data Science Time Capsule, along with several other themes showcasing the status of cancer research and data.