Cancer Data Science Pulse

Whose Data Is It, Anyway?

February 10–14, 2025 is “Love Data Week,” which is why we’re talking to Drs. Joe Flores-Toro and Mousumi Ghosh from NCI’s Office of Data Sharing (ODS) about this year’s theme: “Whose data is it, anyway?” By the end of this blog, we hope you’ll better understand the different ways of perceiving “ownership,” how NCI is addressing researcher concerns, and how readers like you can get involved in the conversation. Thank you, Drs. Flores-Toro and Ghosh, for joining us!

Let’s begin with the thematic question, “Whose data is it?” Is the answer a simple or complex one, and why?

Dr. Ghosh: You’ll hear a wide range of responses to the question, and a lot of it depends on who you ask and what the context is. So, I’d say it’s a complex answer. 

Institutions are responsible for ensuring the accuracy and protection of data, making them custodians or stewards rather than owners. Researchers who generate data may have certain rights regarding its use (such as through intellectual property agreements) but they don’t own the data. When patients are involved—such as in clinical trials—the data originates from them, and they have rights related to privacy, consent, and access. While patients do not typically ‘own’ the data, they maintain control over how others use, share, and protect their personal information. Ownership is nuanced, and we always consider the policies, regulations, and ethical considerations that govern it. 

Dr. Flores-Toro: A lot comes down to how you define it: Are we talking about ownership in the context of who the data belong to, or are we talking about ownership in terms of who is responsible for it? In my mind, the data always belong to the patients in some capacity. They’re the people we protect, and they’re the ones who consent to have their data used or shared in a particular way. When we start considering responsibility, though, ownership falls under the purview of whoever is holding or collecting the data.

Why does defining data ownership matter?

Dr. Flores-Toro: It’s important to establish who’s responsible for maintaining, distributing, and stewarding the data. When we’re looking at data ownership in terms of responsibility, whoever owns the data needs to ensure they’re protecting those data and the patients’ privacy.

Dr. Ghosh: Agreed. Data ownership is often a first step in developing a data governance framework. When we assign ownership, it helps ensure security and quality of the data. 

When we talk about data ownership, what are some of the common points raised?

Dr. Ghosh: There are often concerns about data authority, misrepresentation or misuse of shared data, and the fear that someone else might publish similar findings first. Researchers who generate data may struggle with a desire to exclusively control their data while, at the same time, feeling it’s their responsibility to share it. Part of the struggle arises from the uncertainty about how data ownership and sharing will affect their work and careers.

Dr. Flores-Toro: There are also conversations happening around who profits or gains from the use of data, and I think we’ll see those discussions continuing in the future.

How can NCI help address the concerns of cancer researchers and data scientists who are hesitant to share their data?

Dr. Ghosh: There are real concerns from researchers in the publishing-centric academic culture. Their careers and ability to secure grants are closely connected to their publications, so there’s understandable worry about others potentially using their data first. That’s why it’s very important to encourage efforts to track data provenance and robustly reward those who generate the data. NIH and NCI are playing a role in these efforts, for example, by working towards including a section for data sets on people’s biographies or annual progress reports, and by putting mechanisms in place to really track data provenance.

One concrete example is the Index of NCI Studies (INS), which ODS developed. INS catalogues not only publications but also the data sets associated with research funding, recognizing data as a bona fide research product.

Dr. Flores-Toro: Adding to what Mousumi said, I think it would also be helpful to have a section in grant funding applications for people to note the data sets they’ve shared, the citations or publications their data sets are in, you know…showing the reach of their data. And then something that NCI’s already doing is hosting the annual ODS symposium with panels and breakout sessions to help the data community connect, voice concerns, and get a better understanding of the current culture of data sharing.

You can also access recordings of the 2023 and 2024 symposia.

At the upcoming ODS symposium this fall, we plan to host discussions on data management and sharing practices to gather insights from the community and understand their perspectives.

What can someone do to better understand the issue of data ownership?

Dr. Flores-Toro: From a clinician or investigator’s point of view, make sure you have a good understanding of the requirements of your Data Management and Sharing plan, because that ties into the responsibility side of data ownership.

Dr. Ghosh: I second what Joe said. Engage in discussions through conferences, research forums, or advocacy groups to gain insight into evolving policies and ethical considerations. Understanding data ownership also involves exploring frameworks like the Health Insurance Portability and Accountability Act (HIPAA) and Indigenous Data Sovereignty, which define rights and responsibilities around data use.

Dr. Flores-Toro: I’d also suggest monitoring for and responding to requests for information that ODS, NCI, or NIH releases. Anyone (across the spectrum of educational background and experience) can respond to those and get involved by sharing their input.

Another great way to get and stay involved is by signing up for the ODS newsletter so you’ll be aware of new information or events you can attend. 

Are there things we could do to better ensure we’re using and sharing patient data in responsible, appropriate ways?

Dr. Flores-Toro: There are certainly a lot of policies and guidelines in place regarding patient consent, and one thing that we always stress in ODS when we consult on patient consent language is to make sure that there’s language about consenting for secondary use of the data. Sometimes, patient consent is only for collection and then primary research use, and we need to make sure that we have the proper consents to share data so we can maximize their utility, while still protecting the patient. 

Something we’re working on is continuing to improve how we communicate the requirements around things like data sharing and consent to the people who want, or are required, to share data. With clearer requirements, you’ll also be able to better use and share data. 

Is there anything you hope people will do after they read this?

Dr. Ghosh: I’m always on the lookout for examples from people that demonstrate the value of data reuse and data sharing, so I’d love to see researchers elevating conversations around the value of data reuse.

Dr. Flores-Toro: Like I mentioned earlier, I hope readers will get engaged with our office, ODS, by joining our mailing list so they can get important updates. I want to hear feedback from the community, whether it’s good or bad. We really do understand and commiserate with pain points as much as the successes within the research community. 

If you’re a researcher or clinician, you can check to see what resources your institution has, because most likely their library systems will have a lot of resources to help prepare for, and adhere to, the requisite data sharing policies. They can also help with legal counsel or getting access to data management software.

Dr. Ghosh: Absolutely. I’d ask our readers to start with understanding what tools for managing and sharing data are available to them, and their home institutions are a great place to start. They can really help you make sure your data are useful when shared.

Dr. Flores-Toro: One last suggestion: if you’re a researcher working with patients, consider how you can help them join or contribute to a cancer advocacy group. These groups are so important for expressing patient population interests to entities like the U.S. Congress.

Do you have a question or want to contribute to the conversation about data ownership? We invite you to leave a comment! Please be aware that we will review and publish comments in accordance with the NCI Comment Policy.