Career Confessions From a Cancer Data Scientist Alternative Text
Career Confessions from Cancer Data Scientists
We asked 8 of our data scientists across the National Cancer Institute to share their advice and career journeys to help you answer the question – what should I know to start my career in cancer data science?
Our participants:
1. Yuri Kotliarov, Ph.D., Computational Biologist, NCI DCTD
2. Jay Ronquillo, M.D., DATA Scholar, NCI CBIIT
3. Dana Wolff-Hughes, Ph.D., Program Director, NCI DCCPS
4. Stephanie Harmon, Ph.D., Staff Scientist, NCI CCR
5. Yu Fan, M.S., Bioinformatician, NCI CBIIT
6. Roxanne Jensen, Ph.D., Program Director, NCI DCCPS
7. Shashi Ratnayake, M.S., Bioinformatician, NCI CBIIT
8. Peng Jiang, Ph.D., Investigator, NCI CCR
Interview Question: What do you wish you knew at the beginning?
“I wish I spent more time on statistics. It will help you choose the right algorithm or model for your research.” – Yu
“Personally, I would’ve considered pursuing a computer science degree. I like coding and it also would help to understand many tools and packages used in data science.” – Shashi
“One of my early mistakes was thinking data was just data, and if you knew the structure of the data, you could analyze it. I’ve since learned that you must know both how the data was generated and the biology behind it.” – Yuri
“I didn’t think of myself as a data scientist at first, though I was working with many of the tools. Whether you start in this career or not, data science is a tool you can learn to complement whatever field you are in.” – Dana
“Data science is not for one individual. To do some meaningful work, you cannot do it alone, but must find a group of collaborators from other fields (e.g., cancer biologists for bioinformatics) to work together and help each other.” – Peng
“The knowledge transfer we have with clinicians is so important – providing context for the deployment, and therefore development, of our algorithms. That was something that as a graduate student I didn’t ask enough questions about, or I didn’t ask them soon enough. I thought, ‘we’ll get to it later.’ Turns out that was really important.” – Stephanie
“Biomedical informatics and data science can be applied across so many levels of healthcare, from specific conditions like cancer affecting patients, to healthcare policies reaching populations, to broad initiatives in public health and precision medicine.” – Jay
“I wish I knew more about how best to work with the different expectations of the technology and scientific research communities. Technology moves very fast – think of all the progress in the last five years! Research is slower. So, working with both can be a challenge to meet the expectations of the always-developing technology sector and the steady and precise research world.” – Roxanne
Interview Question: Any advice for those starting out?
“Don’t let the learning stay in the lecture. Get hands-on experience with real world data through research or an internship as early as possible.” – Yu
“Get hands-on experience. It is critical at the beginning of your career!” – Shashi
“Jump in! There are so many publicly available data sets and challenges that I would just encourage people to get going right away.” – Stephanie
“Data science is a team science, so don’t try to do it all yourself. There’s too much for any one person to know and there are others in the data science community who can help you.” – Dana
“Build a strong scientific and technical foundation, find good mentors who can guide you along the way, and seek out worthwhile opportunities.” – Jay
“When I started learning data science, I studied in Japan, and all textbooks around me were in Japanese. Now, there are a lot of opportunities to learn any skill you want. Take a good course in bioinformatics. Find community online and offline. Ask questions and help others.” – Yuri
“Start planning ahead for the speed at which tech develops. When the iPad came out, I thought, ‘with a silly name like that – no one’s going to use it.’ But that piece of technology changed the game for how we capture information from patients for both research and care delivery.” – Roxanne
“A thorough understanding of biology is very important. For people from mathematical and computer science backgrounds, take some time to read basic textbooks and keep an eye on new articles from biological journals. Otherwise, you might work on some problems that are not useful.” – Peng
Interview Question: What skills does a cancer data scientist need?
Programming, stats and math, science background, technology and trends, and collaboration.
Interview Question: What programming languages do they work with?
R (5 of our interviewees use this), Python (5 of our interviewees use this), C or C++ (3 of our interviewees use this), MATLAB (2 of our interviewees use this), SPSS (2 of our interviewees use this).