Predictive Modeling: The Basics
What is Predictive Modeling?
Predictive modeling builds a mathematical description of a process to make accurate, data-driven predictions about future outcomes. This contributes to:
- improved patient treatment and care,
- better clinical decision-making,
- risk models that assist with cancer prevention, and
- expanded fundamental understanding of cancer etiology.
Why is Predictive Modeling Important for Cancer Research?
Predictive modeling uses advanced numerical methods, mathematics, and computer science to help researchers like you anticipate what might happen within the realm of cancer care. Predictive modeling is essential in oncology because many early-stage cancers don’t show symptoms. Consequently, doctors often rely on predictions to decide if a patient should undergo treatment. Fortunately, this prediction process has evolved over time to the point that a model can consider an individual’s genomics without compromising personal privacy. (Try your own data in this tool that NCI staff members helped develop.)
Examples of predictive modeling used in the cancer research field include, but are not limited to, the following:
- Drug discovery and development models identify potential anticancer compounds and drug candidates.
- Risk prediction models incorporate factors such as age, genetics, lifestyle, and medical history to estimate an individual’s risk of developing a particular cancer.
- Radiation therapy planning uses predictive modeling to refine treatment plans.
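To make the risk prediction example above concrete, here is a minimal sketch of how such a model might combine several factors into one probability. It assumes a logistic model; the factor names, the intercept, and every weight are hypothetical values chosen for illustration, not numbers from any published or validated risk model.

```python
import math

# Hypothetical coefficients for illustration only; they are NOT taken
# from any published or validated cancer risk model.
INTERCEPT = -6.0
WEIGHTS = {
    "age_decades": 0.5,     # age expressed in decades
    "family_history": 0.8,  # 1 if a close relative was affected, else 0
    "smoker": 0.9,          # 1 if a current smoker, else 0
}


def estimate_risk(factors):
    """Combine the factors into a probability between 0 and 1."""
    score = INTERCEPT + sum(WEIGHTS[name] * value for name, value in factors.items())
    # The logistic function maps any score onto the range (0, 1).
    return 1.0 / (1.0 + math.exp(-score))


low = estimate_risk({"age_decades": 3.0, "family_history": 0, "smoker": 0})
high = estimate_risk({"age_decades": 7.0, "family_history": 1, "smoker": 1})
print(f"lower-risk profile:  {low:.3f}")
print(f"higher-risk profile: {high:.3f}")
```

In a real project, the weights would be estimated from patient data and validated before anyone relied on the output.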
These predictive models are like powerful calculators that help us better understand a patient by considering factors such as patient information, genetics, and treatment history. There are two types of models: mechanistic and non-mechanistic. Mechanistic models rely on mathematical descriptions of the disease process, which are tested by how accurately they predict outcomes. Non-mechanistic models include a wide variety of techniques, ranging from training artificial intelligence engines to describe the relationship between variables to forecasting based entirely on past occurrences.
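The difference between the two model types can be sketched in a few lines of Python. This is a toy comparison under stated assumptions, not a clinical method: the tumor volume measurements are made up, the mechanistic model assumes simple exponential growth, and the non-mechanistic model is a straight trend line fitted to the past observations.

```python
import math

# Hypothetical measurements: tumor volume (mm^3) on days 0, 10, 20, and 30.
days = [0.0, 10.0, 20.0, 30.0]
volumes = [100.0, 150.0, 220.0, 330.0]


def mechanistic_forecast(v0, growth_rate, t):
    """Mechanistic: assume dV/dt = r * V, which gives V(t) = V0 * exp(r * t)."""
    return v0 * math.exp(growth_rate * t)


def fit_line(xs, ys):
    """Non-mechanistic: ordinary least-squares line through past observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Growth rate implied by the first and last measurements.
rate = math.log(volumes[-1] / volumes[0]) / (days[-1] - days[0])
slope, intercept = fit_line(days, volumes)

t_future = 40.0
print("mechanistic forecast:", round(mechanistic_forecast(volumes[0], rate, t_future), 1))
print("trend-line forecast: ", round(slope * t_future + intercept, 1))
```

The two forecasts diverge as they look further ahead. Comparing each one with a later measurement is how a model's assumptions are put to the test.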
What Do I Need to Know?
Fundamental Tips for Practicing Predictive Modeling
- Have an interest in data analysis.
  - Predictive modeling involves working with large and complex data sets. You’ll enjoy this stage of the data science lifecycle if you appreciate digging into data, finding patterns, and drawing insights from numbers.
- Develop a programming habit.
  - Coding doesn’t have to be intimidating, and it’s worth learning because predictive modeling often involves writing and implementing algorithms. Proficiency in JavaScript, Python, or R can be highly beneficial.
- Don’t be afraid to learn new mathematical methods and devise novel statistical procedures.
  - Understanding algorithms and interpreting their results takes mathematics. Galileo Galilei, the astronomer, physicist, and engineer, is often quoted as saying, “Mathematics is the language of nature.” It really is!
- Be ready for a team-driven, hands-on challenge.
  - Predicting real-world outcomes often requires creativity in adapting and combining methods to suit the specific problem. Approaching the work as a collaborative effort is key.
- Find an open-source repository/community.
  - Share what you do and discover what others are attempting through data science communities. For example, GitHub lets you host your live applications, serve packages without loss of attribution, manage your projects efficiently, collaborate effectively, showcase your skills, and be a part of a lively community of data scientists. NCI has multiple GitHub webpages with hundreds of repositories to search through.
- Become nimble and ever-present.
  - Predictive modeling is constantly evolving with new algorithms, tools, and data sources. Find a computational environment where you are comfortable; computing moves you from being a consumer of data to creating opportunities for secondary data analysis.
NCI Predictive Modeling Resources and Initiatives
Now that you have a sense of the basics, use the following resources to discover more about the topic and understand NCI’s investment in this stage of the data science lifecycle.
Resources and Tools
- Genomic Data Commons (GDC): The NCI GDC uses predictive modeling techniques to analyze vast amounts of genomic data from various cancers. This helps identify genetic mutations and alterations contributing to cancer development, leading to insights for targeted therapies and precision medicine.
- Imaging Data Commons (IDC): NCI’s IDC makes predictive modeling tools available for analyzing and interpreting medical images.
- SEER Cancer Statistics: NCI’s Surveillance, Epidemiology, and End Results (SEER) program uses predictive modeling to estimate cancer incidence, mortality, and survival rates. These data help researchers and policymakers understand cancer trends, allocate resources, and develop effective prevention and treatment strategies.
- Predictive Oncology Model and Data Clearinghouse (MoDaC): MoDaC is a data repository and model clearinghouse that contains predictive oncology data sets and mathematical models (such as machine learning and deep learning models) developed within NCI and in collaborative programs.
Blogs
- Blending Weather Forecasting with Team Science Leads to Advances in Cancer Immunotherapy: Read more about a mathematician who’s using data and computational methods (a combination more regularly exercised in the field of meteorology) to better understand how cancer will respond to treatments.
- Meteorology, Aerospace Engineering, and Cancer Research—The Future of Predictive Modeling: A January 2021 workshop panel explored how computing and predictive modeling can positively influence how to target and apply radiation to treat cancerous tumors.
- Next Generation Artificial Intelligence (AI): New Models Help Unleash the Power of AI: Read about the paradigm shifts of today that will influence the research discoveries of tomorrow.
- Ready to start your project? Get an overview of the data science lifecycle and what you should do in each stage.
- Want to learn the basic skills for cancer data science? Check out our basic skills video course.
- Need answers to data science questions? Visit our Training Guide Library.