News
New Platform for Prioritizing Genetic Variants Underlying Cancer Risk
Are you having trouble prioritizing which genetic variants to study in your cancer research? There’s a new platform available, called FORGEdb, designed to address this challenge. This online tool draws on multiple data sets (from sources such as ENCODE, Roadmap Epigenomics, and BLUEPRINT), offering information on more than 59 million variants.
Researchers from NCI’s Division of Cancer Epidemiology and Genetics, led by Dr. Charles Breeze, spearheaded a worldwide effort to develop FORGEdb, a web-based tool that ranks genetic variants according to their likely relevance to disease.
CBIIT staff, including Brian Park, Kailing Chen, Madhu Kanigicherla, and Ben Chen, were instrumental in refining FORGEdb. Ms. Madhu Kanigicherla said, “We faced an unprecedented challenge with FORGEdb. We needed to transfer petabytes of data to the cloud and make that data easy to access and use.”
Their solution: a modular application programming interface (API), featuring billions of annotations for more than 59 million variants. The API lets the team quickly import new data sets with zero downtime. Moreover, the API is fully open access, so anyone can integrate FORGEdb data into their own websites (with proper attribution to the original source).
With this API, a submitted request returns a response in milliseconds, so you can quickly look up the scores for any given variant. A high FORGEdb score indicates the variant is likely to have a significant impact on disease. To achieve a high FORGEdb score, a candidate variant must be:
- associated with a specific regulatory element class (e.g., elements that enhance gene expression),
- supported by high-quality data from multiple biological samples, and
- represented in several different lines of experimental evidence.
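The three criteria above can be read as a simple checklist. The Python sketch below is illustrative only and is not FORGEdb's actual scoring method: the `VariantEvidence` record, its field names, the variant label, and the thresholds of two samples and two evidence types are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class VariantEvidence:
    """Hypothetical summary of the annotations linked to one variant."""
    variant_id: str
    # Regulatory element classes the variant falls in, e.g. {"enhancer"}.
    regulatory_classes: Set[str] = field(default_factory=set)
    # Number of biological samples with high-quality supporting data.
    supporting_samples: int = 0
    # Distinct lines of experimental evidence covering the variant.
    evidence_types: Set[str] = field(default_factory=set)


def meets_high_score_criteria(ev: VariantEvidence,
                              min_samples: int = 2,
                              min_evidence_types: int = 2) -> bool:
    """Return True if all three checklist criteria hold.

    The default thresholds are assumptions standing in for the article's
    "multiple" samples and "several" lines of evidence.
    """
    has_regulatory_class = len(ev.regulatory_classes) > 0
    has_multiple_samples = ev.supporting_samples >= min_samples
    has_several_evidence_lines = len(ev.evidence_types) >= min_evidence_types
    return (has_regulatory_class
            and has_multiple_samples
            and has_several_evidence_lines)


# Made-up records, not real FORGEdb data.
strong = VariantEvidence(
    variant_id="example_variant_1",
    regulatory_classes={"enhancer"},
    supporting_samples=5,
    evidence_types={"chromatin_accessibility", "histone_marks", "eqtl"},
)
weak = VariantEvidence(variant_id="example_variant_2", supporting_samples=1)

print(meets_high_score_criteria(strong))
print(meets_high_score_criteria(weak))
```

A checklist like this only flags whether a variant satisfies every criterion. FORGEdb itself reports scores, so treat the sketch as a way to reason about the criteria rather than a substitute for the tool.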
As noted by Dr. Charles Breeze and his coauthors, “Gathering relevant information from many different data sources and linking the data to individual genetic variants can be challenging in terms of computational resources, data processing, quality control, and reproducibility.”
He added, “We developed FORGEdb to help researchers prioritize candidate regulatory variants, gene targets, and mechanisms underlying cancer risk that can be targeted for future therapies.”