Cancer Data Science Pulse

How the Mitelman Database Can Help You Explore Genomic Abnormalities

A previous blog post discussed how one of NCI’s Cloud Resources, the Institute for Systems Biology Cancer Gateway in the Cloud (ISB-CGC), provides researchers with shortcuts to data analysis by making important clinical, genomic, and proteomic data available in Google BigQuery tables.

ISB-CGC is also home to some standalone databases of genomic importance. Today, we highlight one of these—the Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer and the recent addition of genomic coordinates to its user interface. 

Alt Text: Screenshot of the ISB-CGC Cancer Gateway in the Cloud homepage, showing the Mitelman Database as the third panel in the Data Browser Section, “Chromosomal Aberrations & Gene Fusions DB”
The Mitelman Database can be accessed through the ISB-CGC homepage.

The Mitelman Database: A Goldmine of Cytogenetic Data Linked to Cancer

Cytogenetic analysis is the process of examining chromosomes, especially to look for abnormalities such as missing, extra, broken, or rearranged chromosomes. When applied to tumors, cytogenetic analysis can provide crucial information about the genetic mechanisms of cancer.  

The Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer catalogs over 70,000 such acquired chromosome aberrations for multiple types of cancer. The database relates cytogenetic aberrations and their genomic consequences, in particular gene fusions, to tumor characteristics. The information has served the research community for decades, but it has been available online only since 2000:

  • 1983: Began as a book
  • 2000: NCI made the information available online
  • 2019: ISB-CGC began hosting and supporting the database on its cloud platform 

Dr. Felix Mitelman, in collaboration with Drs. Bertil Johansson and Fredrik Mertens, manually culled all the data from the literature. NCI, the Swedish Cancer Society, and the Swedish Childhood Cancer Foundation support the Mitelman Database. These organizations update the database quarterly in January, April, July, and October.

You can query the database by parameters such as topography, morphology, gene characteristics, cytogenetic aberrations, and journal references. 

Screenshot of the database’s cytogenetics searcher tool which has filter options for abnormalities, topography, or morphology.
The Mitelman Database Cases Cytogenetics Searcher

Adding Genomic Coordinates Increases Data Collaboration Opportunities

Until recently, the database displayed genetic location information only as karyotypes; a karyotype tells where a gene is by referencing its physical location on a band of the arm of a human chromosome.

Nowadays, much genomic research data describes gene locations in molecular terms instead of physical location; that is, by using precise nucleotide start and stop positions on the chromosome. Though the karyotype data in the Mitelman Database has been very useful, mapping it to molecular genomic coordinates increases the range of research data that it can be combined with to make significant scientific discoveries. 

As of June 2022, the Mitelman Database also displays genomic coordinates on the user interface. Thanks to procedures incorporated from the web-based tool CytoConverter, you can view the genomic coordinates translated from the chromosomal imbalances identified in the karyotype nomenclature.

You can view the genomic coordinate information for either an individual karyotype or multiple karyotypes in a search result. The view provides the following information:

  • Corresponding chromosome
  • Start and end position
  • Type of imbalance (i.e., gain or loss)

For multiple chromosomes, net imbalances across the selected group are displayed in chart, ideogram, or tabular format.

Screenshot of the overall chromosomal imbalances for chromosome 1. The view is broken into three tabs showing charts, ideograms, and data. Two charts on the left side plot the frequency of 1 and >1 extra copies at different positions. The chart on the right plots the frequency of Loss of 1 copy and Homozygous deletions at different positions.
An example of the Mitelman Database View Overall Chromosomal Imbalances screen. The abnormalities of the chromosomes and their genomic coordinates have been calculated by CytoConverter.
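
To make the idea of net imbalances concrete, here is a minimal Python sketch. It assumes each converted imbalance can be represented by a chromosome, a start and end position, and a type of "Gain" or "Loss". The class name, field names, bin size, and example records are all hypothetical and made up for illustration; they are not taken from the Mitelman Database or from CytoConverter.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Imbalance:
    """One converted imbalance: a chromosome region that is gained or lost."""
    chromosome: str  # e.g. "chr1"
    start: int       # start position in base pairs
    end: int         # end position in base pairs
    kind: str        # "Gain" or "Loss" (hypothetical labels)


# Each gain adds one to a bin's net count; each loss subtracts one.
STEP = {"Gain": 1, "Loss": -1}


def net_imbalance_by_bin(records, chromosome, bin_size=1_000_000):
    """Return {bin index: gains minus losses} for one chromosome."""
    net = Counter()
    for rec in records:
        if rec.chromosome != chromosome:
            continue
        first_bin = rec.start // bin_size
        last_bin = rec.end // bin_size
        # Every bin the region touches is counted once for this record.
        for bin_index in range(first_bin, last_bin + 1):
            net[bin_index] += STEP[rec.kind]
    return dict(sorted(net.items()))


# Illustrative, made-up records; not values from the Mitelman Database.
example_records = [
    Imbalance("chr1", 0, 2_500_000, "Gain"),
    Imbalance("chr1", 1_000_000, 3_200_000, "Gain"),
    Imbalance("chr1", 2_000_000, 2_900_000, "Loss"),
    Imbalance("chr2", 0, 1_000_000, "Loss"),
]

if __name__ == "__main__":
    print(net_imbalance_by_bin(example_records, "chr1"))
```

With these made-up records, the third 1-megabase bin on chr1 has two gains and one loss, so its net count is 1. The database's own charts separate gains and losses by copy number, which this simplified sketch does not attempt.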

Mitelman Data on the Cloud

Like the rest of the data that ISB-CGC hosts, the Mitelman data, including the CytoConverter-generated genomic coordinates, are publicly available in Google BigQuery, a cloud-based data warehouse that stores data in tables. This allows researchers to analyze the data using Structured Query Language (SQL) and tools such as Python and R, and to combine the data with other data sets. The ISB-CGC team has provided examples on GitHub, which researchers can use as templates for their own data exploration.
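
As a rough idea of what querying such a table from Python might look like, here is a minimal sketch using the google-cloud-bigquery client library. The table path you pass in and the column names (chromosome, imbalance_type) are placeholders assumed for illustration, not the actual Mitelman schema; check the ISB-CGC documentation and GitHub examples for the real table and column names.

```python
def build_imbalance_query(table):
    """Return SQL that counts records per imbalance type for one chromosome.

    `table` is a fully qualified BigQuery table path. The column names
    below are hypothetical placeholders, not the real Mitelman schema.
    """
    return f"""
        SELECT chromosome, imbalance_type, COUNT(*) AS n_records
        FROM `{table}`
        WHERE chromosome = @chromosome
        GROUP BY chromosome, imbalance_type
        ORDER BY imbalance_type
    """


def run_imbalance_query(table, chromosome):
    """Run the query in BigQuery and return the rows as a list of dicts."""
    # Imported here so the query builder works without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses your default Google Cloud credentials
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            # Passing the value as a parameter avoids pasting it into the SQL.
            bigquery.ScalarQueryParameter("chromosome", "STRING", chromosome)
        ]
    )
    rows = client.query(build_imbalance_query(table), job_config=job_config).result()
    return [dict(row.items()) for row in rows]
```

Running run_imbalance_query requires a Google Cloud project with billing or sandbox access and the google-cloud-bigquery package installed; build_imbalance_query on its own only produces the SQL text.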

Deena Bleich
Bioinformatician, Institute for Systems Biology Cancer Gateway in the Cloud