Endangered Data Week started in early 2017, as a wave of concerned librarians, archivists, scientists, and citizens rushed into action to preserve federal data and shed light on the many threats to crucial information. The most visible effort (and an Endangered Data Week partner) focused on environmental data: DataRefuge, a program spearheaded by the Penn Program in the Environmental Humanities Lab, University of Pennsylvania Libraries, and Project_ARCC. Contributors to this project scrambled to ensure that crucial datasets on climate change and related issues were preserved for researchers now and into the future. Datasets were added to ICPSR's DataLumos and DataRefuge as a supplement to federal agency servers. The Environmental Data & Governance Initiative (EDGI) quickly formed and began documenting changes made to websites and policies concerning federal environmental data.
Threats to open data aren't new, and archivists, librarians, and researchers have a long history of working to foster and preserve unfettered access to information. Events like Sunshine Week and Open Access Week highlight similar issues to the scholarly community and the press. The End of Term Web Archive project has functioned since 2008 to "capture and save US Government websites at the end of presidential administrations" as a collaboration between the Internet Archive, California Digital Library, University of North Texas Libraries, and Library of Congress, covering sites from all three branches of the federal government.
Our public data will not be saved through a one-time mass backup, nor by small, distributed, and uncoordinated acts of heroism. And, as the research data management community well knows, privately administered data is also under threat, often from benign neglect. Furthermore, we must be cognizant of data that should not be collected, retained, or made public, and critical of uses of data that may further harm marginalized communities. We see Endangered Data Week as a service to projects like those listed above, and to the broader community of people who care about justice and access to information.
Together, we must: work for strong federal, state, and local open data policies; increase critical data literacy skills and competencies among students and colleagues; and continue to shed light, year after year, on threats to data collections from all sources. At the same time, we hope to make our communities more aware of the governmental, corporate, and institutional collection of data, and the ways in which these data can harm marginalized communities. An annual series of events, coordinated across campuses, nonprofits, libraries, citizen science initiatives, community activists, and cultural heritage institutions, and spanning disciplines and types of datasets, can shed light on public information that is in danger of being deleted, repressed, mishandled, or lost, as well as on surveillance and data collection practices that can be harmful to communities. Through this project, we hope to: raise awareness of different types of threats to publicly available data; engage with the power dynamics involved in data creation, sharing, and retention; foster concrete skills and collaborative projects; and highlight work to make endangered data more secure and accessible.
EDW is organized by Brandon Locke, Jason A. Heppler, Sarah Melton, Rachel Mattson, and Joseph Koivisto, in collaboration with Bethany Nowviskie and Wayne Graham. Additional contributors include the Digital Library Federation's interest group on Government Records Transparency/Accountability, and (for DLF) Katherine Kim, Aliya Reich, Gayle Schechter, and Becca Quon. DataRefuge, Mozilla Science Lab, the NDSA, and CLIR join the DLF as project sponsors.