Jekyll
2022-11-29T16:14:06+00:00
http://endangereddataweek.org/feed.xml
Endangered Data Week
<strong>raising awareness</strong> of threats to publicly available data; <strong>exploring</strong> the power dynamics of data creation, sharing, and retention; and <strong>teaching</strong> ways to make endangered data more accessible and secure
Endangering Data Interview with Sarah Lamdan
2020-10-13T13:00:25+00:00
http://endangereddataweek.org/2020/10/13/endangering-data-interview-with-sarah-lamdan
<p><em><img data-attachment-id="21930" data-permalink="https://www.diglib.org/endangering-data-interview-with-sarah-lamdan/sarahlamdan-285x190/" data-orig-file="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/10/SarahLamdan-285x190-1.jpg?fit=285%2C190&ssl=1" data-orig-size="285,190" data-comments-opened="0" data-image-meta="{"aperture":"0","credit":"","camera":"","caption":"","created_timestamp":"0","copyright":"","focal_length":"0","iso":"0","shutter_speed":"0","title":"","orientation":"0"}" data-image-title="SarahLamdan-285×190" data-image-description="" data-medium-file="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/10/SarahLamdan-285x190-1.jpg?fit=285%2C190&ssl=1" data-large-file="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/10/SarahLamdan-285x190-1.jpg?fit=285%2C190&ssl=1" loading="lazy" class="size-full wp-image-21930 alignright" src="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/10/SarahLamdan-285x190-1.jpg?resize=285%2C190" alt="Sarah Lamdan headshot" width="285" height="190" data-recalc-dims="1" />Sarah Lamdan is a Professor of Law at CUNY School of Law in Long Island City, NY. She has a master’s degree in library science and legal information management. She also has a law certificate in environmental law. Her work focuses on information law and policy.</em></p>
<p><em>Professor Lamdan works on issues across the spectrum from open government to personal privacy. She is currently writing a book about data control and access called Data Cartels, which will be published by Stanford University Press. Sarah is a member of the <a href="https://envirodatagov.org/">Environmental Data & Governance Initiative</a> and <a href="https://theintercept.com/2019/11/14/ice-lexisnexis-thomson-reuters-database/">works with immigration groups</a> on government surveillance issues. Lamdan’s book, Environmental Information: Research, Access & Environmental Decisionmaking (Environmental Law Institute 2017) serves as a resource for journalists, scientists, and researchers who use government science information in their work.</em></p>
<hr />
<p><b>Tell us a bit about your projects and how you became interested in issues of data privacy, collection, and surveillance.</b></p>
<p>I became interested in the topic after seeing a news article in 2017 about ICE’s “extreme vetting” social media surveillance program, and noticing that Thomson Reuters and LexisNexis reps had attended an ICE event to learn about how to win gov’t contracts to participate in the invasive immigrant surveillance program. Thomson Reuters and LexisNexis (part of the data analytics giant RELX Group) are the main suppliers of legal research products for the legal profession. Their products, Westlaw and Lexis, are considered the “gold standard” legal research products, and together, the companies have a legal information duopoly. I was concerned about the ethical implications of immigration lawyers using products that may ultimately be participating in ICE surveillance programs that harm their clients.</p>
<p> </p>
<p><b>You’ve written several pieces[1] detailing how many vendor business models go far beyond licensing scholarly journals to academic researchers and law firms, and include selling mailing addresses, social media data, credit and criminal records, and much more to marketing firms, political consultants, and law enforcement. How did those companies develop?</b></p>
<p>So, as I started researching Thomson Reuters and LexisNexis’s relationships with ICE, it became clear that these companies weren’t the companies that I thought they were. To me as a librarian, these companies had been marketed as publishers. I knew Reed Elsevier (the RE of RELX) as a publisher of scholarly journals, and LexisNexis (the LX) as a publisher of legal resources and news. Thomson Reuters supplied financial and legal search platforms to business and law firm libraries that I’d worked in.</p>
<p>I learned that, over the past decade, these companies have morphed from being “publishers” to being “data analytics corporations.” Library markets are changing as more information becomes open access and freely available online, especially when it comes to legal resources. Government websites and nonprofit groups have pushed to make laws more accessible on the internet. At the same time, data analytics seems to be the future profit source – collecting huge amounts of data and using algorithms, AI, and machine learning to “slice and dice” data to build informational resources for clients. Since the ’90s, Thomson Reuters and RELX Group have acquired hundreds of companies and tons of data to position themselves as the premier data analytics firms.</p>
<p><b>Although vendors like Thomson Reuters and RELX are notoriously secretive about the library data they collect and how they use it, do members of the library community have any idea about how that data is used in their broader data broker ecosystem? How might data collected from users of LexisNexis, Scopus, Elsevier journals, etc. be of value to non-library audiences? How might it be aggregated with other data?</b></p>
<p>It seems that Thomson Reuters, RELX Group, and other online research platforms benefit from using library data to market their products, and create new products, for those same users. Sam Moore describes how these platforms use “seamless access” (<a href="https://hcommons.org/deposits/item/hc:32095/">Get Full Text Research</a>, for example[2]) to gather data about their users, which lets the companies monetize researchers’ searches and tailor services for those, and other, users. <a href="https://twitter.com/WolfieChristl/status/1295655040741445632">Wolfie Christl similarly noticed</a> that when you do research using Elsevier, ThreatMetrix, a RELX surveillance data product, stores a personal identifier in your browser to track your searching.</p>
<p>We can’t be sure what the companies are doing with this data (i.e., we don’t know whether they are using it internally or selling it/sharing it externally, etc.), but we do know that our research is being tracked by the companies whose platforms we, and our patrons, rely on to do our research.</p>
<p> </p>
<p><b>You’re working on a book manuscript about data cartels. Can you share a little bit more about that project, and what the larger ecosystem of data cartels looks like?</b></p>
<p>As I tried to figure out what these data analytics companies <i>do</i> and how their different products connected, I learned that there isn’t much research on these publishers-turned-data analytics corporations. Information science tends to focus more on communication technologies and platforms (algorithms, machine learning, social media, search engines) and not as much on the duller, less-dynamic data vendor side. It’s like focusing on modems, themselves, instead of the Internet – boooring. Because there isn’t much discussion of these companies beyond librarianship, we haven’t seen the full picture of these companies: they don’t just sell platforms to libraries, they also sell platforms to financial firms, cops, news orgs, and more. Several companies are simultaneously academic research oligarchies, legal research duopolies, and federal and state police surveillance monopolies. These companies have consolidated control over informational flows in libraries and beyond, restricting and stratifying informational access and data privacy in all of our communities.</p>
<p> </p>
<p><b>In </b><a href="http://www.inthelibrarywiththeleadpipe.org/2019/ice-surveillance"><b>Librarianship at the Crossroads of ICE Surveillance</b></a><b>, you write that we must not pass privacy protections on to patrons, or donate the labor of erasing our patrons’ data to vendors, but rather demand “privacy by design” from vendors. Have you seen any progress on this front?</b></p>
<p>“Privacy by design” is an idea described by Ann Cavoukian, the former Information and Privacy Commissioner for the Canadian province of Ontario. I bought into this idea in an article I wrote in 2015 (<a href="https://www.journals.uchicago.edu/doi/abs/10.1086/681610">Social Media Privacy: A Rallying Cry to Librarians</a>), and tried to incorporate it into librarians’ work with vendors and the resources we use in our research and reference work. I haven’t seen any data analytics corporations affirm privacy by design concepts lately, and in fact, it seems that, based on research like Moore’s and Christl’s, they are expanding the surveillance in their own products. </p>
<p> </p>
<p><b>In </b><a href="http://www.inthelibrarywiththeleadpipe.org/2019/ice-surveillance"><b>Librarianship at the Crossroads of ICE Surveillance</b></a><b>, you also wrote that librarians are information technology’s early adopters, and often information technology’s first critics. As information professionals, what do you think our role is outside of the library to advocate for data justice?</b></p>
<p>While I think that librarians have a lot of leverage as the gatekeepers for research platform products, the people who sign the contracts, teach the patrons how to use the products, etc., I am always cognizant of <a href="http://www.inthelibrarywiththeleadpipe.org/2018/vocational-awe/">Fobazi Ettarh’s work on “vocational awe</a>.” As librarians, we can harm ourselves as workers by assuming the huge societal burdens sometimes foisted on libraries and their employees. So, I think we have power, but I don’t think it’s our job, alone, to save the world. We can use our power as we choose, and there have been some really thoughtful and excellent library initiatives around data privacy, including the open access movement, ideas around baking privacy guarantees into contracts with data analytics companies, and other negotiations with these data platform giants. We’ve seen how libraries can even choose to walk away from “big deal” contracts, which is very empowering.</p>
<p> </p>
<p><b>Is there anything else you want to add, or any work or other projects you want readers to know about?</b></p>
<p>There is so much awesome librarian work going on right now. Information access and the products we use are changing all the time, and I think that there is no group more aware of how the changing data privacy and access universe impacts our lives than librarians. So, stay strong and keep going! </p>
<ol>
<li style="font-weight: 400"><a href="https://www.jurist.org/commentary/2020/06/sarah-lamdan-data-policing/">Defund the Police, and Defund Big Data Policing, Too (2020)</a>, <a href="http://www.inthelibrarywiththeleadpipe.org/2019/ice-surveillance/">Librarianship at the Crossroads of ICE Surveillance (2019)</a>, <a href="https://socialchangenyu.com/review/when-westlaw-fuels-ice-surveillance-legal-ethics-in-the-era-of-big-data-policing/">When Westlaw Fuels ICE Surveillance: Ethics in the Big Data Policing Era (2019)</a>.</li>
<li style="font-weight: 400"><a href="https://hcommons.org/deposits/item/hc:32095/">Individuation through infrastructure: Get Full Text Research, data extraction and the academic publishing oligopoly (2020)</a></li>
</ol>
<p> </p>
<p>The post <a rel="nofollow" href="https://www.diglib.org/endangering-data-interview-with-sarah-lamdan/">Endangering Data Interview with Sarah Lamdan</a> appeared first on <a rel="nofollow" href="https://www.diglib.org">DLF</a>.</p>
Endangering Data Interview with Terra Graziani
2020-10-05T13:30:16+00:00
http://endangereddataweek.org/2020/10/05/endangering-data-interview-with-terra-graziani
<p><em><img data-attachment-id="21896" data-permalink="https://www.diglib.org/endangering-data-interview-with-terra-graziani/terra/" data-orig-file="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/10/terra.jpg?fit=326%2C512&ssl=1" data-orig-size="326,512" data-comments-opened="0" data-image-meta="{"aperture":"0","credit":"","camera":"","caption":"","created_timestamp":"0","copyright":"","focal_length":"0","iso":"0","shutter_speed":"0","title":"","orientation":"0"}" data-image-title="terra" data-image-description="" data-medium-file="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/10/terra.jpg?fit=191%2C300&ssl=1" data-large-file="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/10/terra.jpg?fit=326%2C512&ssl=1" loading="lazy" class="size-medium wp-image-21896 alignleft" src="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/10/terra.jpg?resize=191%2C300" alt="Terra Graziani with her dog wearing a sign that reads "abolish racial capitalism"" width="191" height="300" srcset="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/10/terra.jpg?resize=191%2C300&ssl=1 191w, https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/10/terra.jpg?w=326&ssl=1 326w" sizes="(max-width: 191px) 100vw, 191px" data-recalc-dims="1" />Terra Graziani is a researcher and tenants’ rights organizer based in Los Angeles, CA.
She founded and co-directs the Los Angeles chapter of the <a href="https://www.antievictionmap.com/">Anti-Eviction Mapping Project</a> (AEMP), a digital storytelling collective documenting dispossession and resistance in solidarity with gentrifying communities through research, oral history, and data work. She is also a researcher with the UCLA Institute on Inequality and Democracy and The Center for Critical Internet Inquiry at UCLA. Before this, she organized with AEMP in the San Francisco Bay Area and worked for several tenants’ rights organizations including The Los Angeles Center for Community Law and Action, The Eviction Defense Collaborative and Tenants Together. She is currently Research Program Officer at <a href="https://educopia.org/">Educopia</a> where she works to cultivate community in the information field. Terra earned her Master’s in Urban and Regional Planning at UCLA and her Bachelor’s degree in Social and Cultural Geography at UC Berkeley.</em></p>
<hr />
<p><b>Tell us a bit about your projects and how you became interested in issues of data privacy, collection, and surveillance.</b></p>
<p>One of my first jobs in the tenant movement was Community Outreach Coordinator at The Eviction Defense Collaborative (EDC) in San Francisco, a legal clinic where anyone who has received an eviction notice goes first to get immediate help responding to their eviction. I was responsible for writing EDC’s annual eviction report, which analyzed the data EDC collected through its clinic to provide a picture of displacement in San Francisco. To make this report, we partnered with The Anti-Eviction Mapping Project (AEMP), and the rest is history! I joined AEMP shortly thereafter and began to learn — both through the research and visualization I was doing with AEMP, as well as through the experiences I was having working with tenants facing eviction at EDC — all about eviction data! Since then, I’ve worked in various roles in the tenant movement organizing against displacement and building power through knowledge production around issues of tenants’ rights, our speculative and racist housing system (market), police and property, policing technologies, and so much more. I spend a lot of time thinking about how we can mobilize data to build power from below, rather than to surveil and punish.</p>
<p> </p>
<p><b>Your work in AEMP and in tenant organizing more broadly has touched on areas where mass data collection can endanger tenants, as well as areas where civic data and grassroots data collection can help to counter anti-tenant narratives and build community power. </b></p>
<p><b>How does the concept of “endangering data” arise in your work?</b></p>
<p>Early on, I learned that the most commonly used data set on evictions comes from the courts, and that each county’s court system has a different data management system that has a huge impact on how accessible that data is. I also learned that, in California, there is a <a href="https://www.nolo.com/legal-update/california-tenants-protected-win-eviction-lawsuits.html">masking law</a> that restricts when court eviction records can be made public. As the article states, before this masking law was passed, “under longstanding California law, records in eviction lawsuits [were] kept sealed for 60 days after they [were] filed. On the 61st day, the court clerk look[ed] to see if the tenant prevailed. If not, the record [became] public – even if there [had] been no ruling in the case, and even if the landlord [had] abandoned the lawsuit.” After those 60 days, the names of tenants who had not prevailed would become public; third-party companies collected and republished this data, and landlords subscribed to it in order to blacklist tenants. A tenant’s record and credit score could be affected for up to 7 years, significantly impacting their access to housing, which is already so hard to secure. Under the new state law, which took effect in January 2017, the landlord has to win the case within those 60 days for the record to be made public. The law is still not perfect: for fear of losing and getting an eviction put on their record, tenants still avoid taking their cases to court, where they could get a better outcome if they have access to an attorney (also a huge problem). But the new law protects many more tenants from being blacklisted.</p>
<p>This masking law is a double-edged sword, though, because it also means those within the tenant movement who want access to court eviction data, including AEMP, have a very hard time getting it. There is an exemption to the law that grants access to “a person who has an order from a court, issued upon a showing of good cause.” This is the exemption AEMP used to get a court order to access the data in Alameda County. In San Francisco, the court has a practice of sharing address-level eviction data with the rent board, from whom we then request the data through a public records request. The format of the data we receive from these public bodies is all over the place, and it often takes months of work to make it usable. In sum, the data management and sharing practices of public institutions vary widely and have a huge impact on the work we can do.</p>
<p>One of my many jobs at EDC was to help revamp their intake form, meaning I was helping shape what data we collected on tenants facing eviction in San Francisco. I still think about this process often, because working with clinics and tenant organizations to mobilize their data is one way AEMP has been able to visualize a more holistic data set of evictions. If we only look at eviction data from courts, we’re only representing those tenants who made it to court. So many tenants are forced out informally, through intimidation, harassment, informal buyout offers, etc., none of which are captured by court data. Some of this is recorded in clinic records where tenants have gone for help. Clinic data is also helpful because many clinics collect demographic data. EDC’s data taught us, for example, that in 2015, compared to the city’s population, <a href="https://static1.squarespace.com/static/52b7d7a6e4b0b3e376ac8ea2/t/5b1276fc8a922da05029225d/1527936777268/EDC_2015.pdf">Black residents were overrepresented by 300%</a> in their eviction records. Race and other sociopolitical aspects of the displacement crisis are not at all captured in court records.</p>
<p>We are still very far away from having holistic, community-controlled, non-punitive data on evictions, but this is the work AEMP is always engaged in. We collect, mobilize, and preserve data for justice, by and for those who face or have faced displacement. And we’re always collectively evolving our practices on how best to do this. </p>
<p> </p>
<p><b>How have you used data to fight for greater justice?</b></p>
<p>Wahoo, you’ve given me some space to talk about making property ownership data public! </p>
<p>It is extremely difficult to track the real, or “beneficial,” owner of rental property in our communities, because property owners are able to hide their identities by purchasing and owning property behind LLCs. Currently, owners are not required to disclose their real identity when they register an LLC or purchase a property. This anonymity in the public record is not an oversight – it is by design, and property owners benefit enormously from it. When a tenant is facing eviction, and they only know their landlord to be an LLC, they have to do a lot of work to figure out who to put pressure on to drop their eviction. In order to unveil and hold accountable the increasingly corporate actors who control so much property in our neighborhoods, municipalities around the world should demand the disclosure of beneficial ownership in both companies and property ownership.</p>
<p>The Anti-Eviction Mapping Project has worked for years to unveil property ownership webs in the San Francisco Bay Area, Los Angeles, and New York so that tenants can use this information to fight displacement. We’re building a tool (which is almost ready!) called Evictorbook, which allows you to search a property, see who the real owner is, and see the property’s eviction history as well as whether or not it’s covered by rent control, and a few more details. So much data is collected on tenants, particularly by the real estate industry. Evictorbook aims to keep track of landlords and gives tenants a tool to fight back with.</p>
<p> </p>
<p><b>You and your colleagues on the AEMP team have written in multiple places[1][2][3] about the need for grassroots, community-based work. What steps can we, as academics and information professionals, take for more equitable and democratic data practices?</b></p>
<p>Several members of AEMP, myself included, straddle academia and organizing work. Mary Shi, another AEMP member, and I recently published an article in ACME: An International Journal for Critical Geographies entitled “Data for Justice: Tensions and Lessons from the Anti-Eviction Mapping Project’s Work Between Academia and Activism.” In it, we talk about what we’ve learned from straddling these two spaces, focusing on mutual aid, accountability, and embeddedness as guiding principles for producing knowledge outside of academia. I’ll share our conclusion here, which I think sums up the article nicely:</p>
<blockquote><p>“It is no accident that AEMP, as a project fighting displacement, finds itself straddling the space between academia and activism with its epistemologically critical perspective. Like traditional, objectivist knowledge, displacement is a strategy of violence through erasure. Resistance, therefore, requires strategies that fight this erasure at each point. Countermapping, story-telling, and deep collaborations with community organizers are all strategies AEMP has developed to fight such erasures at multiple levels. And as the guiding principles of mutual aid, accountability, and embeddedness illustrate, it is not only the critical nature of AEMP’s tools but also AEMP’s constant assessment of its work’s impacts as measured from the perspective of the communities, organizers, and activists it is meant to serve that allow AEMP to pursue its mission of producing data for justice.”</p></blockquote>
<blockquote><p>“Projects like AEMP are being offered new opportunities–often by sympathetic insiders– to take advantage of the centuries of resources accumulated by universities, research institutes, and other such organizations in pursuit of their critical objectives. AEMP recognizes these as redistributive opportunities that should be taken with eyes wide open. In this spirit, AEMP continues straddling the space between academia and activism despite the challenges this position entails. The path AEMP has discovered in navigating this terrain is not disavowal and exit but rather constant critique and strategic engagement. We offer our reflections not as an end-all-be-all guide for scholars seeking to do critical, community engaged work, but instead as a sharing of the surest signposts we have discovered along the way. As more scholars reevaluate the way they study changing urban landscapes in particular and the relationship between academia and activism more generally, we hope this piece can contribute to the forging of a more just and reparative relationship between academia and the publics it serves.”</p></blockquote>
<p>I’ll also share AEMP’s <a href="https://antievictionmap.com/data-use-agreement">Data Use Agreement</a>, which we’ve formulated over the years; we welcome questions, concerns, and feedback on it. </p>
<ol>
<li style="font-weight: 400"><a href="https://acme-journal.org/index.php/acme/article/view/1776">Graziani, Terra, and Mary Shi (2020) “Data for Justice”</a></li>
<li style="font-weight: 400"><a href="https://shelterforce.org/2018/08/22/eviction-lab-misses-the-mark/">Aiello, Daniela, Lisa Bates, Terra Graziani, Christopher Herring, Manissa Maharawal, Erin McElroy, Pamela Phan, and Gretchen Purser (2018) “Eviction Lab Misses the Mark”</a></li>
<li style="font-weight: 400"><a href="https://www.saje.net/wp-content/uploads/2020/09/The_Vacancy_Report_Final.pdf">Ferrer, Alex, Graziani, Terra et al., The Vacancy Report: How Los Angeles Leaves Homes Empty and People Unhoused</a></li>
</ol>
<p>The post <a rel="nofollow" href="https://www.diglib.org/endangering-data-interview-with-terra-graziani/">Endangering Data Interview with Terra Graziani</a> appeared first on <a rel="nofollow" href="https://www.diglib.org">DLF</a>.</p>
Endangering Data Interview with Thomas Padilla2020-09-25T13:00:29+00:002020-09-25T13:00:29+00:00http://endangereddataweek.org/2020/09/25/endangering-data-interview-with-thomas-padilla<p><img data-attachment-id="21883" data-permalink="https://www.diglib.org/endangering-data-interview-with-thomas-padilla/padilla/" data-orig-file="https://i1.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/09/padilla.jpg?fit=2048%2C1365&ssl=1" data-orig-size="2048,1365" data-comments-opened="0" data-image-meta="{"aperture":"0","credit":"","camera":"","caption":"","created_timestamp":"0","copyright":"","focal_length":"0","iso":"0","shutter_speed":"0","title":"","orientation":"0"}" data-image-title="padilla" data-image-description="" data-medium-file="https://i1.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/09/padilla.jpg?fit=300%2C200&ssl=1" data-large-file="https://i1.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/09/padilla.jpg?fit=1024%2C683&ssl=1" loading="lazy" class="size-medium wp-image-21883 alignright" src="https://i1.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/09/padilla.jpg?resize=300%2C200" alt="Thomas Padilla" width="300" height="200" srcset="https://i1.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/09/padilla.jpg?resize=300%2C200&ssl=1 300w, https://i1.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/09/padilla.jpg?resize=1024%2C683&ssl=1 1024w, https://i1.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/09/padilla.jpg?resize=768%2C512&ssl=1 768w, 
https://i1.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/09/padilla.jpg?resize=1536%2C1024&ssl=1 1536w, https://i1.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/09/padilla.jpg?w=2048&ssl=1 2048w" sizes="(max-width: 300px) 100vw, 300px" data-recalc-dims="1" /><em>Thomas Padilla is Interim Head, Knowledge Production at the University of Nevada, Las Vegas. He consults, publishes, presents, and teaches widely on digital strategy, cultural heritage collections, data literacy, digital scholarship, and data curation. He is Principal Investigator of the Andrew W. Mellon Foundation-supported <a href="https://collectionsasdata.github.io/part2whole/">Collections as Data: Part to Whole</a> and past Principal Investigator of the Institute of Museum and Library Services-supported <a href="https://collectionsasdata.github.io/">Always Already Computational: Collections as Data</a>. He is the author of the library community research agenda <a href="http://oc.lc/responsibleoperations">Responsible Operations: Data Science, Machine Learning, and AI in Libraries</a>.</em></p>
<hr />
<p><b>Tell us a bit about your projects and how you became interested in cultural heritage data and algorithmic and AI approaches to curation and research?</b></p>
<p>I am interested in cultivating GLAM community capacity around responsible, ethically grounded computational engagement with data. Some of that interest has to do with positionality – me being a mixed-race, first-generation college student from a working-class background. I’m constantly trying to find ways for my labor to address historic and contemporary marginalization. </p>
<p><i>Always Already Computational: Collections as Data</i> was an Institute of Museum and Library Services-supported effort that iteratively developed a range of <a href="https://collectionsasdata.github.io/resources/">deliverables</a> meant to spark capacity around principles-driven creation of computationally amenable collections. In that work I was very lucky to be joined by Laurie Allen, Stewart Varner, Hannah Frost, Elizabeth Russey Roke, and Sarah Potvin. With a better sense of community need, I later embarked on <a href="https://collectionsasdata.github.io/part2whole/"><i>Collections as Data: Part to Whole</i></a> – an effort supported by the Andrew W. Mellon Foundation. <i>Part to Whole</i> is essentially a regranting and cohort development program. Hannah Scates Kettler, Stewart Varner, Yasmeen Shorish, and I are currently <a href="https://collectionsasdata.github.io/part2whole/cohortone/">working with 12 institutions</a> (large R1s, historical societies, museums, State-based digital libraries, and more) to develop models that guide collections as data production and models that help organizations develop sustainable services around collections as data. </p>
<p>Over the course of 2019 I worked as a Practitioner Researcher in Residence at OCLC Research, interviewing and holding convenings for professionals within and outside of libraries in the United States. This work culminated in the community research agenda <a href="https://www.oclc.org/research/publications/2019/oclcresearch-responsible-operations-data-science-machine-learning-ai.html"><i>Responsible Operations: Data Science, Machine Learning, and AI in Libraries</i></a>. I felt a lot of pressure to get this work right. I did not want to write some breathless utopian endorsement of AI. Any success I have in that regard is due to the wisdom of the community; any failures are mine. The library community in the United States feels like it has reached a certain level of awareness regarding the pitfalls of AI, helped considerably by the work of <a href="https://nyupress.org/9781479837243/algorithms-of-oppression/">Safiya Noble</a>, practitioners like <a href="https://www.montana.edu/news/18360/msu-researchers-receive-grant-to-build-algorithmic-awareness-as-form-of-digital-literacy">Jason Clark</a>, and an understanding that library community practices have long <a href="https://www.fordhampress.com/9780823276363/cruising-the-library/">held the potential to systematically impact communities</a> in a discriminatory manner. </p>
<p><a href="http://www.rummanchowdhury.com/">Rumman Chowdhury</a> introduced me to the concept of responsible operations, which was a perfect way to encapsulate where it feels like we are as a community. A number of us want to use AI to strengthen library services but only if it doesn’t compromise commitments to cultivating a more equitable society. Of course, no community is uniform in its beliefs, and libraries are no exception. Some at junior and senior levels have quietly – and not so quietly – expressed the view that preoccupation with responsibility or ethics is orthogonal to progress and allows the library community to be beaten in some imagined race with the private sector. These are dangerous views and the stakes are real. We must act accordingly. </p>
<p> </p>
<p><b>For years, many in the library and cultural heritage world have critiqued digitization efforts as replicating (or even accelerating) long-standing biases that center on white, male, and US/Eurocentric collection patterns, viewpoints, and catalog descriptions. In both the </b><a href="https://collectionsasdata.github.io/statement/"><b>Santa Barbara Statement on Collections as Data</b></a><b> and the </b><a href="https://zenodo.org/record/3152935#.X2PPmdZlBTY"><b>Always Already Computational: Collections as Data</b></a><b> final report, you and your partners have pointed to a crucial need for critical engagement with biases and shortcomings and an intention to address the needs of vulnerable communities represented in the materials. What are some examples of these approaches that you’ve found to be successful?</b><br />
</p>
<p><i>Collections as Data: Part to Whole</i> requires that regrantees demonstrate capacity to serve underrepresented communities – a consideration that spans thematic coverage of the collection in question, community buy-in, and a demonstrated commitment to ethical principles that work against the potential for harm. Examples of <i>Part to Whole</i> work addressing your questions include, but are not limited to, Kim Pham’s effort at the University of Denver to develop a <a href="https://specialcollections.du.edu/cad/form/termsOfUse">terms of use</a> for collections as data and Amanda Henley and Maria Estorino’s effort at the University of North Carolina at Chapel Hill to <a href="https://onthebooks.lib.unc.edu/about/">discover and increase access to Jim Crow laws</a> and other racially based legislation in North Carolina between Reconstruction and the Civil Rights Movement. </p>
<p>More broadly, there is so much good work being done. I am super inspired by Dorothy Berry’s advocacy at Harvard, which resulted in a <a href="https://news.harvard.edu/gazette/story/2020/07/houghtons-2020-21-digitization-focus-black-american-history/">2020-2021 exclusive focus on the digitization of Black American History</a>. I am inspired by the Global Indigenous Data Alliance’s <a href="https://www.gida-global.org/care"><i>CARE Principles</i></a>, co-led by Stephanie Russo Carroll and Maui Hudson. A response to the <a href="https://www.go-fair.org/fair-principles/"><i>FAIR Principles</i></a>, <i>CARE</i> problematizes <i>FAIR</i>’s “focus on characteristics of data that will facilitate increased data sharing among entities while ignoring power differentials and historical contexts.” A <i>CARE</i> principle like indigenous “Authority to Control” presents a difficult and needed challenge to the cultural heritage community. What could it look like for more institutions to relinquish control of collections to their rightful owners? It is not often the case that capital – stolen or not – is returned, and I imagine even the most well-meaning libraries will struggle mightily within their own hierarchies to make this happen. I appreciate Eun Seo Jo and Timnit Gebru’s effort to <a href="https://dl.acm.org/doi/abs/10.1145/3351095.3372829">bridge the archives community and machine learning community</a>. Attempts to thread the needle on cross-domain work are always tough, but they are definitely needed. T-Kay Sangwand’s <a href="https://kula.uvic.ca/articles/10.5334/kula.36/"><i>Preservation is Political: Enacting Contributive Justice and Decolonizing Transnational Archival Collaborations</i></a> is a must-read. <a href="https://michellecaswell.org/research">Michelle Caswell’s work</a> – as a whole – is fundamental to improving efforts in these spaces. </p>
<p> </p>
<p><b>In your </b><a href="https://www.oclc.org/content/dam/research/publications/2019/oclcresearch-responsible-operations-data-science-machine-learning-ai.pdf"><b>Responsible Operations: Data Science, Machine Learning, and AI in Libraries</b></a><b> report, you cite Nicole Coleman’s suggestion that, in regard to machine learning, libraries might be better served to “manage bias” rather than attempt (or claim) to eliminate it. Can you talk a little bit more about that framing and why you feel it’s productive in the library world?</b></p>
<p>I think people heard enough from me about it in <i>Responsible Operations</i>. I encourage folks to read Nicole’s subsequently published article, <a href="https://journal.calaijol.org/index.php/ijol/article/view/162"><i>Managing Bias When Library Collections Become Data</i></a>.</p>
<p> </p>
<p><b>What do you think the role for library and information professionals is in larger conversations about “endangering data” and algorithmic and data justice?</b></p>
<p>I think there are many of us doing this work. While former Illinois University Librarian <a href="https://files.eric.ed.gov/fulltext/ED318475.pdf">Paula Kaufman’s testimony before Congress</a> (pg. 77) against a Federal surveillance program gives me chills every time I read it, I often end up thinking about what combination of colleagues, mentors, institutional culture, and personal and professional ethics were in place to make that act of bravery possible. That naturally leads to thinking about what it would take to cultivate similarly principled acts, large and small, among my colleagues. That seems like a promising road to head down. </p>
<p> </p>
<p><b>Is there anything else you want to add, or any work or other projects you want readers to know about?</b></p>
<p>I appreciate the opportunity to share thoughts during Endangered Data Week. In addition to the people and projects mentioned above, I encourage folks to check out the <a href="https://spectrum.library.concordia.ca/986506/"><i>Indigenous Protocol and Artificial Intelligence Position Paper</i></a>, Mozilla’s recent work on <a href="https://foundation.mozilla.org/en/initiatives/data-futures/"><i>Data for Empowerment</i></a>, and Ruha Benjamin’s incredibly powerful <a href="https://youtu.be/TrEiEjjt7v4?t=880">Data4BlackLives keynote</a>. </p>
<p>The post <a rel="nofollow" href="https://www.diglib.org/endangering-data-interview-with-thomas-padilla/">Endangering Data Interview with Thomas Padilla</a> appeared first on <a rel="nofollow" href="https://www.diglib.org">DLF</a>.</p>
Endangered Data Week Returns This Fall!2020-08-04T15:15:25+00:002020-08-04T15:15:25+00:00http://endangereddataweek.org/2020/08/04/endangered-data-week-returns-this-fall<p><img data-attachment-id="21728" data-permalink="https://www.diglib.org/endangered-data-week-returns-this-fall/edw-cloud/" data-orig-file="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/08/edw-cloud.png?fit=2011%2C1157&ssl=1" data-orig-size="2011,1157" data-comments-opened="0" data-image-meta="{"aperture":"0","credit":"","camera":"","caption":"","created_timestamp":"0","copyright":"","focal_length":"0","iso":"0","shutter_speed":"0","title":"","orientation":"0"}" data-image-title="edw-cloud" data-image-description="" data-medium-file="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/08/edw-cloud.png?fit=300%2C173&ssl=1" data-large-file="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/08/edw-cloud.png?fit=1024%2C589&ssl=1" loading="lazy" class="wp-image-21728 size-medium alignright" 
src="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/08/edw-cloud.png?resize=300%2C173" alt="Endangered Data Week cloud logo" width="300" height="173" srcset="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/08/edw-cloud.png?resize=300%2C173&ssl=1 300w, https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/08/edw-cloud.png?resize=1024%2C589&ssl=1 1024w, https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/08/edw-cloud.png?resize=768%2C442&ssl=1 768w, https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/08/edw-cloud.png?resize=1536%2C884&ssl=1 1536w, https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2020/08/edw-cloud.png?w=2011&ssl=1 2011w" sizes="(max-width: 300px) 100vw, 300px" data-recalc-dims="1" />We invite your participation in <a href="http://endangereddataweek.org/"><b>Endangered Data Week</b></a> (EDW), a distributed network of events running from September 21-25, 2020. This year’s Endangered Data Week will look different than previous years, as we focus on ways to respond to the present moment and have the biggest impact given the current circumstances.</p>
<p><b>Endangered Data</b></p>
<p>EDW is an annual, grassroots effort to foster an environment of data consciousness through distributed conversations, workshops, and political activism. Past events have included efforts to raise awareness of risks to public data, workshops teaching data literacy, and discussions of civic data, FOIA, online privacy, and many other topics.</p>
<p>We began EDW in the spring of 2017, on the heels of #DataRescue and concerns from many different sectors about the security and availability of federal data. Accessibility, preservation, and use of government data have been front and center through each of the last three Endangered Data Weeks. This has been for good reason: in those years, we have seen the availability, collection methods, and descriptive framing of federal datasets and records altered for political reasons. As the <a href="https://envirodatagov.org/">Environmental Data and Governance Initiative (EDGI)</a> has documented, the <a href="https://www.independent.co.uk/news/world/americas/us-politics/trump-climate-change-government-websites-global-warming-a9020461.html">language used on federal websites has changed to obscure topics such as climate change</a>, and environmental data, <a href="https://www.ecowatch.com/pollution-database-trump-toxmap-2641963037.html">most notably TOXMAP</a>, have been removed. The White House continues to try to manipulate the 2020 Census for its own benefit, first by <a href="https://www.npr.org/2019/06/13/731629018/as-legal-battle-persists-census-citizenship-question-is-put-to-the-test">attempting to drive down participation by including a question about citizenship</a>, and now by <a href="https://www.npr.org/2020/07/21/893880159/president-trump-wants-to-exclude-unauthorized-immigrants-from-census">attempting to find other ways to exclude undocumented people</a>. And we are currently seeing <a href="https://www.salon.com/2020/05/13/what-are-you-hiding-nebraska-stops-releasing-coronavirus-data-from-meat-plants-after-cases-spike/">battles over COVID-19 data at the state</a> and <a href="https://www.washingtonpost.com/health/2020/07/16/coronavirus-hospitalization-data-outcry/">federal level</a> in attempts to suppress inconvenient information about the mishandling of this pandemic.</p>
<p><b>Endangering Data</b></p>
<p>However, as organizers in this present moment, <b>we find ourselves focused less on threats to publicly available data and more on data collection practices that threaten the public</b>. As millions of people across the country have taken to the streets to protest police and vigilante brutality and structural racism in the wake of the murders of George Floyd, Breonna Taylor, Ahmaud Arbery, and countless others, they find themselves tracked by an increasingly prevalent, and increasingly intelligent, surveillance infrastructure. Local and federal police are using film of protests, <a href="https://theintercept.com/2020/07/16/face-masks-facial-recognition-dhs-blueleaks/">combined with facial recognition software</a>, to attempt to identify protestors. They are also using so-called <a href="https://www.wired.com/story/blueleaks-anonymous-law-enforcement-hack/">“open source intelligence”</a> (social media profiles and other publicly available information) to identify, monitor, and develop criminal charges against protestors. Last year, the <a href="https://www.youtube.com/watch?v=vRVHFsGnj80">DLF Government Records Transparency and Accountability expert panel</a> focused on governmental data, highlighting the ways in which data serves as a medium of state violence. In 2020, we see that conversations on data literacy, privacy, and activism are not only timely but also have immediate ramifications for the safety and political agency of marginalized communities.</p>
<p>While government collection of data represents the clearest threat to privacy and well-being, we would be remiss not to look further. Tech companies, including but not limited to social media platforms, track and aggregate our communication, browsing history, and purchases to better serve us ads, and brick-and-mortar businesses also rely on <a href="https://www.dailydot.com/debug/target-police-relationship-george-floyd/">increasingly complex and invasive surveillance technology</a> in the name of “<a href="https://www.dailydot.com/debug/target-police-relationship-george-floyd/">loss prevention</a>.” As a community composed mostly of academics and information professionals, we also hope participants will look more closely at the surveillance we are enacting upon patrons and students at our institutions, whether through logging access to digital resources, contact tracing, or learning management systems.</p>
<p>While we don’t plan to stop discussing threats to public data, we want to encourage more events that focus on raising awareness of this type of data collection, retention, and analysis, discussions of how to protect ourselves and our communities, and ways that our work can more directly contribute to a more just society.</p>
<p><b>How to Get Involved</b></p>
<p>In previous years, EDW has consisted primarily of small, in-person workshops and discussions, but obviously that will have to change this year. We are looking to have a scaled-down, all-virtual Endangered Data Week, including panel discussion broadcasts, asynchronous writing or resource collection activities, and virtual/asynchronous workshops.</p>
<p>We will be tweeting updates with the hashtag <a href="https://twitter.com/search?q=%23EndangeredData">#EndangeredData</a> and <a href="https://endangereddataweek.org/map/?sorts%5Bdate%5D=1">adding events to the website</a> as they are planned, so please keep an eye on the events list for ways to participate. If you are interested in putting together an event, want to suggest an idea, or have any questions or comments, please <a href="mailto:endangereddata@diglib.org">get in touch</a>! If you’re planning an event, you can add it to the website using <a href="https://docs.google.com/forms/d/e/1FAIpQLSeTZ30rbBS5axmn-QpWZML_nEqT_bmiz9V2TiQYYTTQSsKUxw/viewform">our form</a>.</p>
<p><em>-Endangered Data Week Organizing Team</em></p>
<p>The post <a rel="nofollow" href="https://www.diglib.org/endangered-data-week-returns-this-fall/">Endangered Data Week Returns This Fall!</a> appeared first on <a rel="nofollow" href="https://www.diglib.org">DLF</a>.</p>
Join Endangered Data Week at Mozilla Festival October 28, 20182018-10-24T13:28:21+00:002018-10-24T13:28:21+00:00http://endangereddataweek.org/2018/10/24/mozfest<p><em>This post comes from <a href="mailto:jheppler@unomaha.edu">Jason Heppler</a>, <a href="mailto:sarah.melton@bc.edu">Sarah Melton</a>, <a href="mailto:brandontlocke@gmail.com">Brandon Locke</a>, and <a href="mailto:rachmattson@gmail.com">Rachel Mattson</a>.</em></p>
<p><strong><span class="magee-dropcap dropcap" style="color:;">W</span>e're</strong> excited to announce that <a href="http://endangereddataweek.org/">Endangered Data Week</a> (EDW) is participating in the <a href="https://mozillafestival.org">Mozilla Festival</a> on October 28, 2018. If you're in London this week, we'd love to see you!</p>
<p>Mozilla Festival is a seven-day celebration for, by, and about people who love the internet, showcasing world-changing ideas and technology through workshops, talks, and interactive sessions.</p>
<p><a href="https://guidebook.com/guide/147793/event/21682616/">In our session</a>, we'll share a scenario and dataset (inspired by real endangered data) with participants and invite them to work out how they might save that data and steward it away from erasure or misuse. We'll use participants' responses to set up a closing discussion about how all of us can help our communities work for better data consciousness, civic literacy, and government transparency.</p>
<hr />
<p>Please feel free to share this announcement with anyone else in your communities and networks who might be interested in participating.</p>
Join Endangered Data Week during the Mozilla Global Sprint, May 10-112018-04-26T18:42:54+00:002018-04-26T18:42:54+00:00http://endangereddataweek.org/2018/04/26/join-endangered-data-week-during-the-mozilla-global-sprint-may-10-11<p><img loading="lazy" src="https://i1.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/02/EDW-Logo-300x185.png?resize=120%2C74&ssl=1" alt="EDW Logo" width="120" height="74" data-recalc-dims="1" /></p>
<p>This post comes from <a href="mailto:jheppler@unomaha.edu">Jason Heppler</a>, <a href="mailto:sarah.melton@bc.edu">Sarah Melton</a>, <a href="mailto:brandontlocke@gmail.com">Brandon Locke</a>, and <a href="mailto:rachmattson@gmail.com">Rachel Mattson</a>.</p>
<p><strong>We’re</strong> excited to announce that <a href="http://endangereddataweek.org/">Endangered Data Week</a> (EDW) is participating in the upcoming <a href="https://foundation.mozilla.org/opportunity/global-sprint/">Mozilla Global Sprint</a> on May 10th and 11th, 2018. We’re writing to invite you to join in and participate!</p>
<p>Mozilla Global Sprints are community events designed to support the development of open projects for a healthy internet. They are meant to be fast-paced, fun, and collaborative. During this year’s sprint, EDW plans to develop a user-friendly toolkit of endangered data-related tutorials and resources—we will collect materials created by organizers of past Endangered Data Week events and also generate new materials that event organizers can use in developing future Endangered Data Week events.</p>
<p>There’s an EDW Global Sprint <a href="https://github.com/endangereddataweek/resources/blob/master/global-sprint-2018.md">primer</a> with more info. Additional details, as well as a place to chat and ask questions during the Sprint, are in the <a href="https://public.etherpad-mozilla.org/p/endangereddataweek">EDW Etherpad</a>.</p>
<p>You are invited to join in and participate no matter what your skill set, time commitments, or experience! We are looking for librarians, archivists, journalists, scholars, artists, lawyers, data scientists, educators, and others; people with experience in teaching, writing, coding, and design; and students and learners of all kinds. In short, we welcome anyone who’s passionate about the preservation of digital information, data, and records. You can choose to help for 10 minutes, a few hours, or two full days; and you can participate onsite at one of our EDW sprint locations (in Omaha, East Lansing, and Boston) or remotely. Check out our evolving list of ways to contribute to the toolkit <a href="https://github.com/endangereddataweek/resources/issues">here</a>. And please feel free to reach out to us if you have any questions or additional ideas for materials we should add to the toolkit.</p>
<p>All that’s missing is you! Come join us!</p>
<hr />
<p>Please feel free to share this announcement with anyone else in your communities and networks who might be interested in participating.</p>
<p>The post <a rel="nofollow" href="https://www.diglib.org/join-endangered-data-week-during-the-mozilla-global-sprint-may-10-11/">Join Endangered Data Week during the Mozilla Global Sprint, May 10-11</a> appeared first on <a rel="nofollow" href="https://www.diglib.org">DLF</a>.</p>
Endangered Data Week: Charlottesville Open Data Portal2018-04-09T13:12:22+00:002018-04-09T13:12:22+00:00http://endangereddataweek.org/2018/04/09/endangered-data-week-charlottesville-open-data-portal<p><em>This post is one of several we are publishing about this year’s <a href="http://endangereddataweek.org/">Endangered Data Week</a>. This contribution comes from <a href="https://twitter.com/sheila_bl">Sheila Blackford</a>, librarian for the University of Virginia’s <a href="http://archive.millercenter.org/about">Miller Center</a>, managing editor of </em>American President<em>, and principal investigator for the <a href="http://presidentialcollections.org">Connecting Presidential Collections</a> project.</em></p>
<p>#Charlottesville</p>
<p>The year 2017 was a hard one for the city of Charlottesville, Virginia. The KKK and Neo-Nazis came to town, protesters and counter-protesters marched and clashed, and three people died. Suddenly Charlottesville was on the national and international news, and people began to wonder what had happened and how we got here. Many in the crowds were from out of town and had descended on the city for the events. But after the dust cleared, many in the community began asking questions about race relations, policing, poverty, and affluence. Trying to figure out what happened led to more general questions about the city’s history, its demographics, and its government.</p>
<p>Coincidentally, in the same month as the violent protests, the city of Charlottesville launched a website that might help answer some of those questions. The Charlottesville Open Data Portal includes 78 data sets organized in 10 categories. The data sets include details on the 2010 census, historically underutilized business zones, capital improvement projects, crime data, and real estate assessments. The data is machine readable, and the city encourages people to download the data and play with it. The portal has APIs, and the data can be downloaded as full sets or filtered sets.</p>
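<p>To make the difference between a full and a filtered download concrete, here is a minimal Python sketch. It assumes a query-style web API that accepts <code>where</code>, <code>outFields</code>, <code>resultRecordCount</code>, and <code>f</code> parameters; the endpoint address, the parameter names as applied to this portal, and the field names are all placeholders for illustration, not details confirmed by the city. The sketch only builds the request URLs and does not contact any server.</p>

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoint: a placeholder, not a real address on the portal.
BASE_URL = "https://example.org/api/datasets/parking-tickets/query"


def build_query_url(base_url, where="1=1", fields=("*",), limit=None, fmt="json"):
    """Return a URL asking for all rows (the default) or a filtered subset.

    The parameter names are assumptions modeled on common query-style APIs.
    """
    params = {"where": where, "outFields": ",".join(fields), "f": fmt}
    if limit is not None:
        params["resultRecordCount"] = str(limit)
    return f"{base_url}?{urlencode(params)}"


# Full set: every row, every field.
full_set_url = build_query_url(BASE_URL)

# Filtered set: two made-up fields, one made-up street, at most 100 rows.
filtered_url = build_query_url(
    BASE_URL,
    where="StreetName = 'MAIN ST'",
    fields=("DateIssued", "StreetName"),
    limit=100,
)

print(full_set_url)
print(filtered_url)
```

<p>Anyone trying this against the real portal would need to swap in the address, parameters, and field names given in its own API documentation.</p>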
<p>As part of <a href="http://endangereddataweek.org/">Endangered Data Week</a> in February 2018, the Scholars Lab at the University of Virginia hosted a presentation about Charlottesville Open Data. Steve Hawkes, Software Applications Manager for the city of Charlottesville, introduced the portal and gave an overview of its features. Then the presentation delved into one example of how people might use the data. By looking at the addresses related to parking tickets, it is possible to see where the police give parking tickets most frequently, the days of the week and the times of the day that cars are most likely to get a ticket, and the specific sections of city streets that are especially prone to ticketing. Not only does this information warn a driver where to be careful about overstaying their parking times near downtown, but it might also allow city planners to identify areas where parking is especially problematic and develop ideas on how to improve parking options.</p>
<p>The city encourages people to use the data in creative ways and to analyze the data to allow for better decision-making based on actual data. The website has a discussion group and a feedback form to encourage people to use the data, request other data, and provide details about how they have used the data. One limitation of the portal is that the data is relatively recent. For example, real estate assessments only go back about 20 years and parking ticket data only goes back to 2002. Still, the portal is a great place to start for citizens to learn more about their city and engage with data about its various aspects.</p>
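<p>The kind of tally described in the presentation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the analysis shown at the event: the sample rows are made up, and the column names <code>issued_at</code> and <code>street_block</code> are hypothetical stand-ins for whatever the portal's parking ticket data set actually uses.</p>

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Made-up sample rows; the column names are assumptions, not the portal's schema.
SAMPLE_CSV = """issued_at,street_block
2017-08-01 09:15,100 E MAIN ST
2017-08-01 14:40,100 E MAIN ST
2017-08-02 09:05,200 W WATER ST
2017-08-05 11:30,100 E MAIN ST
2017-08-08 09:50,200 W WATER ST
"""

# datetime.weekday() returns 0 for Monday through 6 for Sunday.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def summarize_tickets(csv_text):
    """Count tickets by street block, by day of the week, and by hour of day."""
    by_block, by_weekday, by_hour = Counter(), Counter(), Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        issued = datetime.strptime(row["issued_at"], "%Y-%m-%d %H:%M")
        by_block[row["street_block"]] += 1
        by_weekday[WEEKDAYS[issued.weekday()]] += 1
        by_hour[issued.hour] += 1
    return by_block, by_weekday, by_hour


by_block, by_weekday, by_hour = summarize_tickets(SAMPLE_CSV)
print("Most-ticketed block:", by_block.most_common(1))
print("Busiest weekday:", by_weekday.most_common(1))
print("Busiest hour:", by_hour.most_common(1))
```

<p>Plain counters keep the sketch dependency-free; with a real download, the same three tallies would show which blocks, days, and hours drivers and city planners should pay attention to.</p>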
<p>For more information, visit <a href="http://opendata.charlottesville.org/">http://opendata.charlottesville.org/</a></p>
<p>The post <a rel="nofollow" href="https://www.diglib.org/endangered-data-week-charlottesville-open-data-portal/">Endangered Data Week: Charlottesville Open Data Portal</a> appeared first on <a rel="nofollow" href="https://www.diglib.org">DLF</a>.</p>
Endangered Data Week at the University of Maryland2018-04-06T12:24:56+00:002018-04-06T12:24:56+00:00http://endangereddataweek.org/2018/04/06/endangered-data-week-at-the-university-of-maryland<p><em>This post is one of several we are publishing about this year’s <a href="http://endangereddataweek.org/">Endangered Data Week</a>. 
This contribution comes from <a href="https://twitter.com/joseph_koivisto?lang=en">Joseph Koivisto</a>, Systems Librarian at the McKeldin Library of the University of Maryland.<br />
</em></p>
<p>At the University of Maryland, <a href="http://endangereddataweek.org/">Endangered Data Week</a> (EDW) is less of an event and more of a triathlon. This year’s program featured three events held at the University Libraries and surrounding environs over the course of two weeks. Collectively, the events gave university staff, faculty, and students (graduate and undergraduate) a sprinting tour of the many issues that surround endangered data as a topic.</p>
<p>Working in a partnership between the University Libraries and the <a href="http://mith.umd.edu/">Maryland Institute for Technology in the Humanities</a>, <a href="https://twitter.com/Purdom_L">Purdom Lindblad</a> and I coordinated these events with the intention of keeping libraries central to the overall tone and tenor of our EDW activities. As the values of openness, accessibility, user-centered services, and activism are core to both libraries and EDW, we felt that the University Libraries would not only be an obvious physical home for our activities, but would also help to frame our conversations on the sometimes uncool topics of data preservation, resource description, and other activities that serve a maintenance role in the overall data landscape. Also central to our planning was the belief that EDW is necessarily an interdisciplinary event and therefore required that we prominently feature diverse disciplinary voices.</p>
<p>In short, we set ourselves a fairly high bar.</p>
<p>The first event of the week was our interdisciplinary panel, “What Counts as Data?” For this panel, our speakers included:</p>
<ul>
<li>Angus Murphy (UMD Department of Plant Science & Landscape Architecture)</li>
<li>Catherine Knight-Steele (UMD Department of Communications and director of the African American History, Culture, and Digital Humanities program)</li>
<li>Jen Serventi (Office of Digital Humanities, National Endowment for the Humanities)</li>
<li>Joanne Archer (UMD Libraries, Special Collections and University Archives)</li>
<li>Ricardo Punzalan (UMD iSchool, moderator)</li>
</ul>
<blockquote class="twitter-tweet" data-width="550" data-dnt="true">
<p lang="en" dir="ltr"><a href="https://twitter.com/hashtag/EndangeredData?src=hash&ref_src=twsrc%5Etfw">#EndangeredData</a> panel discussing the different sources and forms of data. Looking at the broader conceptions of data in both a sense of what has been data for someone and how that becomes data in and of itself <a href="https://t.co/ReGngFYi0b">https://t.co/ReGngFYi0b</a> <a href="https://t.co/tvghbmrZn6">pic.twitter.com/tvghbmrZn6</a></p>
<p>— Jordan S. Sly (@jordanssly) <a href="https://twitter.com/jordanssly/status/968190017213554689?ref_src=twsrc%5Etfw">February 26, 2018</a></p></blockquote>
<p><script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script></p>
<p>The panel discussion began with a consideration of the varied definitions of data from the different perspectives brought together for the day’s event. This highlighted the critical differences between traditional science, the humanities, and archival perspectives when considering the implications of and responsibilities towards data. Catherine Knight-Steele added yet another layer of complexity by considering the impact of ‘data-fication’ of humanities work and the aids and hindrances that accompany computational approaches to traditionally non-digital questions. The panel also considered what ethical approaches to data preservation and dissemination would look like from the panelists’ disciplinary perspectives, raising questions around scalability of open access approaches to immense data sets, costs associated with rigorous standards and practices, and responsibilities towards data originators. Ricardo Punzalan, our gracious moderator, further explored the notion of data responsibilities by posing questions on accountability, asking to whom we are accountable and what the expectation of accountability means to both individual stakeholders and the larger society.</p>
<p>Following on the heels of our panel, a short series of lightning talks featured practitioners giving insight into their interactions with government data sets, data management practices, and critical perspectives on EDW. Speakers included:</p>
<ul>
<li>Amy Wickner (UMD Libraries & iSchool)</li>
<li>Jessica Lu (MITH Postdoctoral associate)</li>
<li>Kelley O’Neal (UMD Libraries)</li>
<li>Matthew Miller (UMD Roshan Institute, moderator)</li>
</ul>
<p>These excellent speakers inspired a great conversation centered on data preservation practices that led to larger discussions on advocacy, infrastructure, and institutional roles in preservation of local, state, and federal data.</p>
<p>Next, we continued our EDW activities with a data preservation workshop that covered a basic introduction to data management planning and an overview of a variety of tools used for data management and preservation. The motivation for this course was our belief that awareness of and dedication to the values of data preservation begin with individual practice. By exposing staff, faculty, and students to the core concepts of data management, we hoped to raise the overall level of discourse on campus and, hopefully, improve attendee skills. David Durden (UMD Libraries), Adam Kriesberg (UMD iSchool), and I led discussion on data management practices and data preservation initiatives (EDGI, Data Refuge, etc.). We also gave demonstrations on the following tools:</p>
<ul>
<li>OpenRefine</li>
<li>Data Accessioner</li>
<li>Fixity</li>
</ul>
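<p>For readers unfamiliar with fixity checking, the idea behind tools like Fixity is to record a checksum for every file and re-check those checksums later to detect silent change or loss. The following is a minimal Python sketch of that idea, not a description of how any of the tools above are implemented; the function names and the choice of SHA-256 are assumptions made for illustration.</p>

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Report files that changed, went missing, or were added between two scans."""
    return {
        "changed": sorted(k for k in old if k in new and old[k] != new[k]),
        "missing": sorted(k for k in old if k not in new),
        "added": sorted(k for k in new if k not in old),
    }
```

<p>In practice, one would save the manifest returned by <code>build_manifest</code> alongside the collection, rebuild it on a schedule, and review anything <code>compare_manifests</code> reports as changed or missing.</p>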
<p>And last but not least, our EDW happy hour was held at MilkBoy ArtHouse, a College-Park-via-Philadelphia establishment that serves great drinks and a serviceable cheesesteak. Sadly, due to the intense windstorm of March 2<sup>nd</sup>, we had to postpone the event to March 7<sup>th</sup>. Not to be deterred, the University Libraries had a good turnout and interesting discussions were had by all in attendance.</p>
<p>To round out our Endangered Data Week events, I spoke at the University of Virginia’s <a href="http://scholarslab.org/">Scholars’ Lab</a> as part of their Regional Digital Humanities Symposium. I gave a quick overview of our efforts at UMD and laid out our plans for making EDW 2019 even better.</p>
<p>How, you may be asking, do we intend to do that?</p>
<ul>
<li>Fewer events. While we like the idea of having a bunch of events around EDW, we feel that the effort put into organizing multiple events – each with different needs, requirements, and obligations – winds up creating a black hole into which your time, energy, and motivation fall. To that end, we hope that over the next year we will be able to conduct multiple EDW-related events that raise awareness of the week itself and work towards the overall EDW mission of raising awareness and promoting advocacy.</li>
<li>ADVERTISING! For the past two years, Purdom and I have let our advertising and outreach efforts slide, leaving us with minimal campus awareness apart from the stalwart few who are already keyed into Endangered Data Week. Shifting more of our energies towards advertising and outreach should help us to get our ducks in order.</li>
<li>We plan to seek out partnerships with staff, faculty, and graduate students to help us further evolve the UMD EDW vision. Initially, we will be reaching out to the <a href="http://dsah.umd.edu/">Digital Studies in the Arts and Humanities</a> cohort to see what types of ideas they would like to bring to the table.</li>
<li>Advocacy. Inspired by the EDW advocacy initiatives Brandon Locke discussed in the <a href="http://endangereddataweek.org/events/2018-02-27-endangereddata-week-twitter-chat/">#EndangeredData Twitter conversation</a>, we hope to add an element of advocacy that will engage with local, Maryland, and national legislators to help raise consciousness of Endangered Data topics in our political arena. We also hope to partner with UMD scholars who work in the areas of political science and policy to help us create meaningful approaches to advocacy that will encourage material and sustainable changes to government policy.</li>
</ul>
<p>The post <a rel="nofollow" href="https://www.diglib.org/endangered-data-week-at-the-university-of-maryland/">Endangered Data Week at the University of Maryland</a> appeared first on <a rel="nofollow" href="https://www.diglib.org">DLF</a>.</p>
The Many Forms of Endangered Data: Notes from Michigan2018-04-04T15:28:39+00:002018-04-04T15:28:39+00:00http://endangereddataweek.org/2018/04/04/the-many-forms-of-endangered-data-notes-from-michigan<p><em>This post is one of several we are publishing about this year’s <a href="http://endangereddataweek.org/">Endangered Data Week</a>. This contribution comes from</em> <em><a href="https://twitter.com/612to651?lang=en">Justin Schell</a>, Director of the Shapiro Design Lab at the University of Michigan Libraries.</em></p>
<p>On Saturday, March 3rd, I participated in a panel discussion organized by the Wayne State University (WSU) National Digital Stewardship Alliance (NDSA) & Association for Information Science and Technology (ASIS&T) about the many different forms that endangered data can take, and what is being done to address these challenges across many different fields. The panel consisted of <a href="http://sis.wayne.edu/faculty/bio.php?id=41845">Kimberly Schroeder</a> (Coordinator of the Archival Program and Lecturer at WSU); Dr. <a href="https://library.wayne.edu/info/staff-directory/fr8961">Katherine Akers</a> (Biomedical Research Data Specialist at the Shiffman Medical Library at WSU); Dr. <a href="http://sis.wayne.edu/faculty/bio.php?id=149548">Laura Sheble</a> (Assistant Professor at WSU’s School of Information Sciences); and myself, Director of the <a href="https://www.lib.umich.edu/design-lab">Shapiro Design Lab</a> at the University of Michigan Library and member of <a href="https://envirodatagov.org/">EDGI</a>, the Environmental Data & Governance Initiative.</p>
<p>I’ll briefly summarize our conversation in this post, but you can view the full video (which consists of our slides and audio) at the bottom of this post.</p>
<p>Each panelist took a different approach to the idea of endangered data (and the means of addressing those challenges), and how it intersected with their own work.</p>
<p>I went first and gave an overview and background of the larger “Data Rescue” movement that began in December of 2016 and has evolved through 40 different archiving events and a broad coalition of librarians, archivists, data professionals, and concerned citizens helping to preserve access to public federal information. I discussed some lessons learned over the past year, ways that people can stay involved (by using <a href="https://chrome.google.com/webstore/detail/wayback-machine/fpnmgdkabkmnadcjpehmlllkndpkmiak?hl=en-US">Chrome</a> and <a href="https://addons.mozilla.org/en-US/firefox/addon/wayback-machine_new/">Firefox</a> extensions to automatically save pages to the Internet Archive’s <a href="http://archive.org/web/">Wayback Machine</a>), and a number of new directions that have emerged out of these events, directions that are summarized in this <a href="https://docs.google.com/document/d/1vCUALNoGvHE4F2AkcbvXvjLmMGrxx4CP1gYayc5RZdw/edit">one-year review document</a>.</p>
<p>Katherine Akers discussed a project she did with her WSU Library colleagues to better understand what kind of federal data was highly used and/or valued by WSU faculty. Much of this came from the biomedical field, and the source that faculty most frequently reported was the <a href="https://www.ncbi.nlm.nih.gov/">National Center for Biotechnology Information</a>. The challenge, as Akers noted, is how to judge the vulnerability of these datasets, with some having mirrors and large-scale preservation plans, while others are more vulnerable due to lack of funding and/or infrastructure. Her presentation ended with a call to the library and archives community to better understand and address this vulnerability question, in order to better ensure researchers’ continued access to important public data.</p>
<p>Kimberly Schroeder focused on format obsolescence in her presentation, and the challenges of preserving (and sometimes even loading or playing) content that is either born-digital or from an earlier moment in the digital era (think floppy disks and Jaz drives). A great many archives, she argues, are not set up to handle these kinds of materials, and we often don’t know what we’re missing because we can’t get them to load or play, much less assess the contents and their importance. However, this isn’t just a question of archaic formats; compact discs can range wildly in terms of quality, from gold discs to the cheapest recordable CDs, some of which will no longer play. With information trapped on these pieces of media, Schroeder voiced a concern shared by many in the digital preservation and cultural heritage field, that future generations of scholars will be severely limited as they try to study the eras documented within such increasingly inaccessible information.</p>
<p>Laura Sheble took a broader philosophical view on the question of endangered data, asking not just about endangered data, but also data that endangers people in its collection and use. She illustrated much of this in a discussion of a <a href="https://datadrivendetroit.org/danny-devries/another-fine-mess-mass-confusion-on-literacy-rates-in-detroit/">quasi-viral (and inaccurate) statistic</a> that nearly 50% of people in Detroit couldn’t read, which began as a 1992 study, was picked up again in 2011, and was seen again as recently as 2017. Sheble argues that, beyond the damage that such a false claim does to the people of Detroit, data hasn’t been collected well enough to actually measure literacy rates successfully. In a parallel vein, Sheble discussed the changing data practices within the health field, including electronic medical records that were built around billing practices (and the consequences of that for how data is gathered, saved, and made available) and what the shift to both personalized and incentivized medicine (FitBit and other activity trackers serving as proxies for health) means for accessibility of data to doctors, researchers, and patients.</p>
<p>You can view the full discussion, along with the slides of each presenter, in the video below.</p>
<p><iframe class='youtube-player' width='1170' height='659' src='https://www.youtube.com/embed/5iyI6U2hVA4?version=3&rel=1&fs=1&autohide=2&showsearch=0&showinfo=1&iv_load_policy=1&wmode=transparent' allowfullscreen='true' style='border:0;'></iframe></p>
<p>The post <a rel="nofollow" href="https://www.diglib.org/the-many-forms-of-endangered-data-notes-from-michigan/">The Many Forms of Endangered Data: Notes from Michigan</a> appeared first on <a rel="nofollow" href="https://www.diglib.org">DLF</a>.</p>
Endangered Data Week at the University of Victoria2018-04-04T14:18:42+00:002018-04-04T14:18:42+00:00http://endangereddataweek.org/2018/04/04/endangered-data-week-at-the-university-of-victoria<p><em>This post is one of several posts we are publishing about this year’s <a href="http://endangereddataweek.org/">Endangered Data Week</a>. This contribution comes from <a href="https://twitter.com/jmhuculak">J. Matthew Huculak</a>, Digital Scholarship Librarian at the University of Victoria Libraries.</em></p>
<p>At the Digital Humanities Summer Institute in 2017, the University of Victoria Libraries inaugurated a newly created space called the Digital Scholarship Commons (DSC), a place for collaborative learning, training, and resources around digital fluency. We’re a fairly new space in the library, and students are still getting to know about the workshops, consultations, and training <a href="https://onlineacademiccommunity.uvic.ca/dsc/">we give in the Commons</a>. When <a href="http://endangereddataweek.org/">Endangered Data Week</a> was announced, we thought it was a great opportunity to talk with our community about data preservation, the challenges libraries and archives face in preserving data in the digital age, and how we’re thinking about these challenges in the DSC. To illustrate these challenges, we decided to take over the main foyer of the library in order to set up an NES console on which students, faculty, staff, and the wider community could play <em>Duck Hunt</em>.</p>
<figure id="attachment_18758" aria-describedby="caption-attachment-18758" style="width: 225px" class="wp-caption aligncenter"><img data-attachment-id="18758" data-permalink="https://www.diglib.org/endangered-data-week-at-the-university-of-victoria/fig2_telecast/" data-orig-file="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig2_telecast.jpg?fit=3024%2C4032&ssl=1" data-orig-size="3024,4032" data-comments-opened="0" data-image-meta="{"aperture":"2.2","credit":"","camera":"iPhone SE","caption":"","created_timestamp":"1519659588","copyright":"","focal_length":"4.15","iso":"32","shutter_speed":"0.025","title":"","orientation":"1"}" data-image-title="Fig2_telecast" data-image-description="" data-medium-file="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig2_telecast.jpg?fit=225%2C300&ssl=1" data-large-file="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig2_telecast.jpg?fit=768%2C1024&ssl=1" loading="lazy" class="wp-image-18758 size-medium" src="https://i1.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig2_telecast-225x300.jpg?resize=225%2C300&ssl=1" alt="Figure 2: Library-wide advertisement for Endangered Data Week event" width="225" height="300" srcset="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig2_telecast.jpg?resize=225%2C300&ssl=1 225w, https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig2_telecast.jpg?resize=768%2C1024&ssl=1 768w, https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig2_telecast.jpg?w=2340 2340w" sizes="(max-width: 225px) 100vw, 225px" data-recalc-dims="1" /><figcaption id="caption-attachment-18758" class="wp-caption-text">Figure 2: Library-wide advertisement for Endangered Data Week event</figcaption></figure>
<p>Last year, I taught the Introduction to Digital Humanities course in the English Department at UVic, and by far, one of the most popular days in the classroom was when we came over to the library to look at the media archeology material being collected by my colleague John Durno, the Systems Librarian for UVic. John gave a lecture on media archeology, and then had students “recover” some old files with Kryoflux. We spent the class playing <a href="https://archive.org/details/Hi-Res_Adventure_1_Mystery_House_1980_On-Line_Systems">Mystery House</a> in the Internet Archive’s Internet Arcade, and on an Apple II from our collection. What struck me from that day’s activities was how engaged students were with the hands-on activities with the older machines as well as how empowered they felt after having learned about how they might contribute to the future of digital preservation through their own data curation activities.</p>
<p>So, for the big event during EDW, we wanted to harness that energy of having students interact with the past in order to think about the future. We decided <em>Duck Hunt</em> would allow us to illustrate the following points:</p>
<ol>
<li>We are living in a <a href="https://www.theatlantic.com/technology/archive/2015/10/raiders-of-the-lost-web/409210/">digital dark age</a>.</li>
<li>In order to preserve a digital object, librarians and archivists have to maintain
<ul>
<li>The file</li>
<li>The software that runs the file</li>
<li>The hardware that runs the software to read the file</li>
</ul>
</li>
</ol>
<p>In order to play Duck Hunt on the original NES, you need to find an older Cathode Ray Tube (CRT) television since the original <a href="https://www.howtogeek.com/181303/htg-explains-how-the-nintendo-zapper-worked-and-why-it-doesnt-work-on-new-tvs/">“light gun” is programmed</a> to work with that technology. Second, you need to find the cartridge and gaming system to hook up to the television. Our colleague, Jen Wells, was kind enough to let us borrow her childhood system. But, in the end, we were hoping that the pure joy of playing on an old gaming system would give us a way to talk about some serious issues facing us all.</p>
<figure id="attachment_18757" aria-describedby="caption-attachment-18757" style="width: 432px" class="wp-caption aligncenter"><img data-attachment-id="18757" data-permalink="https://www.diglib.org/endangered-data-week-at-the-university-of-victoria/fig3_artandduckhunt/" data-orig-file="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig3_ArtandDuckhunt.jpg?fit=4032%2C3024&ssl=1" data-orig-size="4032,3024" data-comments-opened="0" data-image-meta="{"aperture":"2.2","credit":"","camera":"iPhone SE","caption":"","created_timestamp":"1519651615","copyright":"","focal_length":"4.15","iso":"50","shutter_speed":"0.033333333333333","title":"","orientation":"1"}" data-image-title="Fig3_ArtandDuckhunt" data-image-description="" data-medium-file="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig3_ArtandDuckhunt.jpg?fit=300%2C225&ssl=1" data-large-file="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig3_ArtandDuckhunt.jpg?fit=1024%2C768&ssl=1" loading="lazy" class="wp-image-18757 " src="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig3_ArtandDuckhunt-1024x768.jpg?resize=432%2C324&ssl=1" alt="Figure 3: John Durno talks about Telidon Art while Duck Hunt is ready for player 1" width="432" height="324" srcset="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig3_ArtandDuckhunt.jpg?resize=1024%2C768&ssl=1 1024w, https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig3_ArtandDuckhunt.jpg?resize=300%2C225&ssl=1 300w, https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig3_ArtandDuckhunt.jpg?resize=768%2C576&ssl=1 768w, https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig3_ArtandDuckhunt.jpg?resize=400%2C300&ssl=1 400w, https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig3_ArtandDuckhunt.jpg?w=2340 2340w, https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig3_ArtandDuckhunt.jpg?w=3510 3510w" 
sizes="(max-width: 432px) 100vw, 432px" data-recalc-dims="1" /><figcaption id="caption-attachment-18757" class="wp-caption-text">Figure 3: John Durno talks about Telidon Art while Duck Hunt is ready for player 1</figcaption></figure>
<p>To accompany the gaming setup, we created a <a href="https://www.diglib.org/wp-content/uploads/sites/3/2018/04/EndangeredDataWeek2018_small.pdf">poster</a> that asks, “Is Your Data Safe?” The poster talks briefly about the digital dark age and the Digital Millennium Copyright Act, as well as initiatives in the library to address the hardware issues some of us face. Durno has just been given the green light to open our library’s first Media Archeology Lab, which will be made available to patrons to work with software in emulated and non-emulated environments. Durno has been doing brilliant work recovering the art of <a href="https://en.wikipedia.org/wiki/Telidon">Telidon videotext/teletext</a> artists like Glenn Howarth and Geoffrey Shea, whose pioneering artwork was almost lost to the trashbin of history (Durno was literally given floppies pulled from a garbage can). John had some of this artwork playing on a large screen behind the video game station, and had a table set up to show off early computer storage, including an array of floppy disks of all sizes.</p>
<figure id="attachment_18756" aria-describedby="caption-attachment-18756" style="width: 464px" class="wp-caption aligncenter"><img data-attachment-id="18756" data-permalink="https://www.diglib.org/endangered-data-week-at-the-university-of-victoria/fig1_johnandmatt/" data-orig-file="https://i2.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig1_JohnandMatt.jpg?fit=2016%2C1512&ssl=1" data-orig-size="2016,1512" data-comments-opened="0" data-image-meta="{"aperture":"1.8","credit":"","camera":"iPhone 7","caption":"","created_timestamp":"1519660263","copyright":"","focal_length":"3.99","iso":"40","shutter_speed":"0.04","title":"","orientation":"1"}" data-image-title="Fig1_JohnandMatt" data-image-description="" data-medium-file="https://i2.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig1_JohnandMatt.jpg?fit=300%2C225&ssl=1" data-large-file="https://i2.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig1_JohnandMatt.jpg?fit=1024%2C768&ssl=1" loading="lazy" class="wp-image-18756 " src="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig1_JohnandMatt-1024x768.jpg?resize=464%2C348&ssl=1" alt="Figure 1: John Durno and Matt Huculak in front of the "Is Your Data Safe" poster" width="464" height="348" srcset="https://i2.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig1_JohnandMatt.jpg?resize=1024%2C768&ssl=1 1024w, https://i2.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig1_JohnandMatt.jpg?resize=300%2C225&ssl=1 300w, https://i2.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig1_JohnandMatt.jpg?resize=768%2C576&ssl=1 768w, https://i2.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig1_JohnandMatt.jpg?resize=400%2C300&ssl=1 400w, https://i2.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig1_JohnandMatt.jpg?w=2016&ssl=1 2016w" sizes="(max-width: 464px) 100vw, 464px" data-recalc-dims="1" /><figcaption id="caption-attachment-18756" class="wp-caption-text">Figure 1: John Durno 
and Matt Huculak in front of the “Is Your Data Safe” poster</figcaption></figure>
<p>The event was a great success. Students came to play on the NES, stayed to talk about media archeology, and learned how they could protect their own data. Ray Siemens’ Electronic Textual Cultures Lab (ETCL) donated 30 USB sticks to our event so we could teach students how to encrypt their own drives (data privacy is a big concern in British Columbia), and we handed out literature produced by University Systems on how to protect one’s data by backing it up on university-provided storage space.</p>
<p>At the end of it all, we had a wonderful day of conversations and connecting with our community about how they could support libraries and archives in preserving our shared digital futures.</p>
<figure id="attachment_18759" aria-describedby="caption-attachment-18759" style="width: 487px" class="wp-caption aligncenter"><img data-attachment-id="18759" data-permalink="https://www.diglib.org/endangered-data-week-at-the-university-of-victoria/fig4_map/" data-orig-file="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig4_map.jpg?fit=2868%2C1454&ssl=1" data-orig-size="2868,1454" data-comments-opened="0" data-image-meta="{"aperture":"0","credit":"","camera":"","caption":"","created_timestamp":"0","copyright":"","focal_length":"0","iso":"0","shutter_speed":"0","title":"","orientation":"0"}" data-image-title="Figure 4: University of Victoria only Canadian university participating on map" data-image-description="" data-medium-file="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig4_map.jpg?fit=300%2C152&ssl=1" data-large-file="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig4_map.jpg?fit=1024%2C519&ssl=1" loading="lazy" class="wp-image-18759 " src="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig4_map-1024x519.jpg?resize=487%2C247&ssl=1" alt=" Figure 4: University of Victoria only Canadian university participating on map" width="487" height="247" srcset="https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig4_map.jpg?resize=1024%2C519&ssl=1 1024w, https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig4_map.jpg?resize=300%2C152&ssl=1 300w, https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig4_map.jpg?resize=768%2C389&ssl=1 768w, https://i0.wp.com/www.diglib.org/wp-content/uploads/sites/3/2018/04/Fig4_map.jpg?w=2340 2340w" sizes="(max-width: 487px) 100vw, 487px" data-recalc-dims="1" /><figcaption id="caption-attachment-18759" class="wp-caption-text">Figure 4: University of Victoria only Canadian university participating on map</figcaption></figure>
<p>The post <a rel="nofollow" href="https://www.diglib.org/endangered-data-week-at-the-university-of-victoria/">Endangered Data Week at the University of Victoria</a> appeared first on <a rel="nofollow" href="https://www.diglib.org">DLF</a>.</p>