Cancer Data Science Pulse

Career Confessions from Cancer Data Scientists

We asked eight data scientists from across the National Cancer Institute to share their advice and career journeys to help you answer the question: what should I know to start my career in cancer data science?

Our participants:

1. Yuri Kotliarov, Ph.D., Computational Biologist, NCI DCTD
2. Jay Ronquillo, M.D., DATA Scholar, NCI CBIIT
3. Dana Wolff-Hughes, Ph.D., Program Director, NCI DCCPS
4. Stephanie Harmon, Ph.D., Staff Scientist, NCI CCR
5. Yu Fan, M.S., Bioinformatician, NCI CBIIT
6. Roxanne Jensen, Ph.D., Program Director, NCI DCCPS
7. Shashi Ratnayake, M.S., Bioinformatician, NCI CBIIT
8. Peng Jiang, Ph.D., Investigator, NCI CCR

Interview Question: What do you wish you knew at the beginning?

“I wish I spent more time on statistics. It will help you choose the right algorithm or model for your research.” – Yu

“Personally, I would’ve considered pursuing a computer science degree. I like coding, and it also would help to understand many tools and packages used in data science.” – Shashi

“One of my early mistakes was thinking data was just data, and if you knew the structure of the data, you could analyze it. I’ve since learned that you must know both how the data was generated and the biology behind it.” – Yuri

“I didn’t think of myself as a data scientist at first, though I was working with many of the tools. Whether you start in this career or not, data science is a tool you can learn to complement whatever fields you are in.” – Dana

“Data science is not for one individual. To do some meaningful work, you cannot do it alone, but must find a group of collaborators from other fields (e.g., cancer biologists for bioinformatics) to work together and help each other.” – Peng

“The knowledge transfer we have with clinicians is so important – providing context for the deployment, and therefore development, of our algorithms. That was something that as a graduate student I didn’t ask enough questions about, or I didn’t ask them soon enough. I thought, ‘we’ll get to it later.’ Turns out that was really important.” – Stephanie

“Biomedical informatics and data science can be applied across so many levels of healthcare, from specific conditions like cancer affecting patients, to healthcare policies reaching populations, to broad initiatives in public health and precision medicine.” – Jay

“I wish I knew more about how best to work with the different expectations between both technology and scientific research communities. Technology moves very fast – think of all the progress in the last five years! Research is slower. So, working with both can be a challenge to meet the expectations of the always developing technology sector and the steady and precise research world.” – Roxanne

Interview Question: Any advice for those starting out?

“Don’t let the learning stay in the lecture. Get hands-on experience with real world data through research or an internship as early as possible.” – Yu

“Get hands-on experience. It is critical at the beginning of your career!” – Shashi

“Jump in! There’s so many publicly available data sets and challenges that I would just encourage people to get going right away.” – Stephanie

“Data science is a team science, so don’t try to do it all yourself. There’s too much for any one person to know and there are others in the data science community who can help you.” – Dana

“Build a strong scientific and technical foundation, find good mentors who can guide you along the way, and seek out worthwhile opportunities.” – Jay

“When I started learning data science, I studied in Japan, and all textbooks around me were in Japanese. Now, there are a lot of opportunities to learn any skill you want. Take a good course in bioinformatics. Find community online and offline. Ask questions and help others.” – Yuri

“Start planning ahead for the speed at which tech develops. When the iPad came out, I thought, ‘with a silly name like that – no one’s going to use it.’ But that piece of technology changed the game for how we capture information from patients, both in research and in care delivery.” – Roxanne

“A thorough understanding of biology is very important. For people from mathematical and computer science backgrounds, take some time to read basic textbooks and keep an eye on new articles from biological journals. Otherwise, you might work on some problems that are not useful.” – Peng

Interview Question: What skills does a cancer data scientist need?

Programming, stats and math, science background, technology and trends, and collaboration.

Interview Question: What programming languages do they work with?

R (5 of our interviewees), Python (5), C or C++ (3), MATLAB (2), and SPSS (2).
Yu Fan, M.S.
Bioinformatician, NCI Center for Biomedical Informatics and Information Technology (CBIIT)
Stephanie Harmon, Ph.D.
Staff Scientist, NCI Center for Cancer Research (CCR) Artificial Intelligence Resource
Roxanne Jensen, Ph.D.
Program Director, NCI Division of Cancer Control and Population Sciences (DCCPS) Healthcare Delivery Research Program
Peng Jiang, Ph.D.
Investigator, NCI Center for Cancer Research (CCR) Cancer Data Science Laboratory
Yuri Kotliarov, Ph.D.
Computational Biologist, NCI Division of Cancer Treatment and Diagnosis (DCTD) Biometric Research Program
Shashikala "Shashi" Ratnayake, M.S.
Bioinformatician, NCI Center for Biomedical Informatics and Information Technology (CBIIT)
Jay Ronquillo, M.D.
DATA Scholar, NCI Center for Biomedical Informatics and Information Technology (CBIIT)
Dana Wolff-Hughes, Ph.D.
Program Director, NCI Division of Cancer Control and Population Sciences (DCCPS) Epidemiology and Genomics Research Program