CI Compass leads archiving, long-term data preservation conversation at 2024 NSF Research Infrastructure Workshop
As scientific data at U.S. National Science Foundation (NSF)-funded mid-scale and major facilities continues to grow exponentially with the help of advanced instrumentation and increased computing power, the challenge of preserving, archiving, and keeping that data accessible grows with it. In March, the U.S. National Science Foundation’s (NSF) CI Compass continued the NSF Major Facilities (MF) cyberinfrastructure conversation on efforts and best practices concerning archiving and long-term preservation of scientific data at the 2024 NSF Research Infrastructure Workshop (RIW). The focus topic was developed in direct response to feedback that CI Compass received during its 2024 Cyberinfrastructure for NSF Major Facilities (CI4MF 2024) workshop.
During CI4MF, the FAIR Data (Findable, Accessible, Interoperable, and Reusable) Topical Working Group organized a panel titled “Major Facilities’ Approach to Open Science,” where questions about the greater need to focus on archiving and preservation were posed.
CI Compass participated in the NSF Research Infrastructure Workshop in 2022 and 2023, as well. CI Compass leadership welcomes the continued opportunity to work with the NSF Research Infrastructure Office, the NSF Office of Advanced Cyberinfrastructure, and the research facilities on-site, as well as the NSF center dedicated to cybersecurity, Trusted CI, to continue facilitating progress.
“Continuing to connect with cyberinfrastructure practitioners across the wide spectrum of research infrastructures in the NSF’s science and engineering ecosystem is core to CI Compass’s mission,” said Ewa Deelman, Director of CI Compass, research professor of computer science and principal scientist at the University of Southern California Information Sciences Institute. “The data lifecycle in each research facility continues to change and challenge technologies and cyberinfrastructure professionals alike. At RIW, we focused particularly on data archiving. We want to continue seeking best community practices and collaborate on solutions to these challenges.”
Archiving and Long-term Preservation of Scientific Data Panel
Since the release of “Ensuring Free, Immediate, and Equitable Access to Federally Funded Research” by Alondra Nelson, Deputy Assistant to the President and Deputy Director for Science and Society Performing the Duties of Director, from the White House Office of Science and Technology Policy (OSTP), commonly known as the “Nelson Memo,” in August 2022, cyberinfrastructure practitioners and data managers have been working on plans to comply with the memo’s December 31, 2025 deadline.
Don Brower, CI Compass FAIR data expert, research assistant professor, and computational scientist at the University of Notre Dame, moderated a panel as a part of the NSF RIW. The panel, “Archiving and Long Term Preservation of Scientific Data: Considerations, Approaches, Challenges, and Best Practices,” was hosted on the first day of the NSF RIW. Angela Murillo, CI Compass co-principal investigator and director of the student fellowship program, worked on putting the panel and questions together.
“From CI Compass’s past work and the discussions we have at the FAIR Data Topical Working Group each month, we know these complex topics are on the mind of researchers at major facilities,” said Murillo. “By fostering a discussion specifically on long-term data preservation, we hope to inspire facilities to begin addressing this big challenge by sharing experiences, resources, and potential solutions with each other.”
Joining the panel was a group of cyberinfrastructure practitioners concerned with data preservation, including Bruce Berriman, senior astronomer and data scientist, Infrared Processing and Analysis Center (IPAC), California Institute of Technology; Adam Bolton, astrophysicist and data scientist, director, Community Science and Data Center (CSDC), NSF NOIRLab; and Jeannette Jackson, managing director, Research Data Ecosystem, University of Michigan.
Jackson spoke about the importance of data preservation, which allows scientists to keep making discoveries in existing datasets, and the importance of keeping abreast of what is needed to keep data available for the long term, especially as instruments and systems continually evolve.

“The data being produced in the infrastructure world is going to be important for the ongoing ability to connect it to different data types. Scientists can and will come up with new questions making possible new scientific discoveries in ways that were not possible before,” Jackson said.
“Make no small plans” was one of Berriman’s leading thoughts on data and archive management, especially in responding to continual changes in science requirements from agencies like NASA and NSF while continuing to support and advance the technology that keeps data safe and accessible.
“The NSF Rubin Observatory will report around 10 million transient alerts per night, and those alerts require rapid follow-ups,” Berriman said.
Bolton brought another astronomy-focused perspective to the panel. As human knowledge, instrumentation, and software continue to advance, new revelations appear in old datasets.
“Scientists can make big discoveries within our data archives,” Bolton said.
Bolton referenced the “killer asteroids hiding in large catalogs” discoveries announced in May 2022 and publicized in the New York Times. The B612 Foundation, a nonprofit research group, developed a computational program and applied it to the multi-petabyte NOIRLab archive to discover previously unknown celestial bodies and asteroids hiding in existing catalogs and images.
After the formal panel discussion, audience questions addressed data usage models and evolving trends, data consent and usage in the health and social science spaces, and the current limitations of existing data management software.
The discussion for the panel continued past the official end of the session, prompting more connections to be made between centers, facilities, and educators.
“Managing metadata is crucial to a successful archive for self-documentation and data discovery, though it is often difficult and time consuming,” said Berriman.
“Facilities are facing challenges with legacy systems and their unique instrumentation made to collect and store very specific types of data,” said Brower. “We want to ensure that best practices are considered as the next versions of those systems are created. The next versions need to support continued scientific research, comply with federal data mandates, and help to continue making discoveries and pushing innovations.”
Cyberinfrastructure Workshop
As the NSF Research Infrastructure Workshop’s Cyberinfrastructure Track kicked off, three more sessions were hosted by other organizers in addition to the Archiving and Long-term Preservation of Scientific Data panel.

Two sessions focused on cyberinfrastructure and ecosystems that exist outside of the NSF research infrastructure. These sessions presented approaches from agencies other than NSF in an effort to challenge existing processes and highlight similar situations across cyberinfrastructure facilities.
Debbie Bard, data department head at the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory, presented a talk titled “Department of Energy (DOE) Integrated Research Infrastructure (IRI) and the Advanced Scientific Computing Research.”
The second talk was titled “Overview of Canada Foundation for Innovation Research Infrastructure and Some of the Shared Data Challenges,” with both Mark Legace, director of programs, and Claudia Fall, associate director for research facilities, representing the Canada Foundation for Innovation.
The final session of the Cyberinfrastructure Track was titled “Digital Backbone: Navigating the Research Infrastructure Guide (RIG) Revisions to Cyberinfrastructure and Cybersecurity.” The session was led by Bill Miller, senior advisor for cyberinfrastructure, Office of Advanced Cyberinfrastructure, NSF; Michael Corn, cybersecurity advisor for research, Office of the Director, NSF; and Alison Rockwell, research infrastructure advisor, Research Infrastructure Office, NSF.
Poster Sessions

After the Cyberinfrastructure Track sessions, CI Compass took part in an in-person poster session where CI Compass leadership was on-site to discuss CI Compass’s objectives, the FAIR Data Topical Working Group, and the CI Compass Fellowship Program (CICF).
CI Compass had further discussions and meetings during the poster session to bring new NSF mid-scale and MF partners into collaborations with the center, and to understand their needs and concerns about future workforce development and data lifecycle management.
The 2025 NSF Research Infrastructure Workshop has not yet been announced. More information and announcements about events like this workshop can be found at researchinfrastructureoutreach.org.
Update on June 24, 2024: The videos from the 2024 NSF Research Infrastructure Workshop are now available to review on the NSF Research Infrastructure Knowledge Sharing Gateway.
About CI Compass
CI Compass is funded by the NSF Office of Advanced Cyberinfrastructure in the Directorate for Computer and Information Science and Engineering under grant number 2127548. Its participating research institutions include the University of Southern California, Indiana University, Texas Tech University, the University of North Carolina at Chapel Hill, the University of Notre Dame, and the University of Utah.
To learn more about CI Compass, please visit ci-compass.org.
Contact: Christina Clark, Research Communications Specialist
CI Compass / Notre Dame Research / University of Notre Dame cclark26@nd.edu / 574.631.2665
ci-compass.org / @cicompass
Originally published by ci-compass.org on June 21, 2024.