Newswise — Since the shutdown of the Tevatron particle collider at the Department of Energy's Fermilab in 2011, there has been a concerted effort to preserve the data and rich physics legacy from the two Tevatron experiments, CDF and DZero. The Run II Data Preservation project, completed in December, enables scientists to perform publishable scientific analysis of 10 years of Tevatron Run II particle collision data through at least 2020.

Kenneth Herner and Bo Jayatilaka, co-leaders of the Run II Data Preservation Project, point out that the project enables scientists to revisit a measurement or to test new theoretical calculations long after the original Tevatron experiments ended.

“These data sets can potentially verify discoveries made at the Large Hadron Collider,” Jayatilaka said, referring to the particle collider at the European research center CERN.

“The Tevatron's unique proton-antiproton collision data set enables physics studies that are complementary to those at the LHC,” Herner added.

In the world of digital science, "data preservation" means preserving not only the data set itself, but also the software that enables future access to that data. The Run II Data Preservation project also addressed documentation and adoption of the sustainable infrastructure needed to ensure that scientists will be able to analyze Run II data in future computing environments.

The need for sustainable data preservation will continue to increase as science advances, experiments become less replicable and data sets become increasingly specialized. Projects such as the Data and Software Preservation for Open Science and the Study Group for Data Preservation in high-energy physics are also working to expand and improve data preservation technology.

Through the Run II Data Preservation project, both CDF and DZero have adapted their data analysis techniques to the long-term computing infrastructure that is expected to be the backbone of Fermilab's physics program for years to come. Herner and Willis Sakumoto, co-leader of the effort at CDF, both emphasize that their users are now able to run their analyses on the long-term supported infrastructure without having to learn new tools.

“The project has accomplished its goal of transitioning CDF analysis infrastructure support so that we can access the data and run the software into 2020 with minimal additional cost to the base program,” Sakumoto said.

DZero users, too, are able to run their analysis using their familiar tools, Herner said.

The two-year-long preservation project was a collaborative effort of experts from the international CDF and DZero collaborations as well as computing experts at Fermilab.