The Fourth Paradigm: How Big Data is Changing Science

February 18, 2015 10:00 a.m. - 11:00 a.m. ET

This talk will describe how science is changing as a result of the vast amounts of data we are collecting, from gene sequencers to telescopes and supercomputers. This “Fourth Paradigm of Science,” predicted by Jim Gray, is moving at full speed and is transforming one scientific area after another. The talk will present examples of the similarities among the emerging challenges and of how Jim Gray’s vision is being realized by the scientific community. Scientists are increasingly limited by their ability to analyze the large amounts of complex data available. These datasets are generated not only by instruments but also by computational experiments; the largest numerical simulations are now on par with data collected by instruments, crossing the petabyte threshold this year. Large synthetic datasets are also increasingly important, as scientists compare their experiments to reference simulations. All disciplines need a new “instrument for data” that can deal not only with large datasets but with the cross product of large and diverse datasets. This conversion raises several multi-faceted challenges, e.g., how to move, visualize, analyze, and in general interact with petabytes of data.

Dr. Alexander Szalay
Alexander Szalay is a Bloomberg Distinguished Professor of Astronomy and Computer Science at the Johns Hopkins University and the Director of the Institute for Data Intensive Science. He is a cosmologist working on statistical measures of the spatial distribution of galaxies and on galaxy formation. He is a Corresponding Member of the Hungarian Academy of Sciences and a Fellow of the American Academy of Arts and Sciences. In 2004 Dr. Szalay received an Alexander von Humboldt Award in Physical Sciences and, in 2007, the Microsoft Jim Gray Award. In 2008 he became Doctor Honoris Causa of the Eötvös University, Budapest. He enjoys playing with Big Data.

Presentation

View Dr. Szalay's presentation.
