02/26/19

What Should We Make of Obama’s Library Plans?

Last week former President Barack Obama signed a Memorandum of Understanding with the National Archives and Records Administration (NARA) that articulated plans to digitize and provide online access to all of the declassified records from his administration. I’ll get to that MOU in a minute, once we recap how we got here. In 2017, Obama announced that his foundation would not be building a Presidential Library like every one of his predecessors dating back to Herbert Hoover, and would instead construct a “Presidential Center” that will include a museum telling the story of his administration but no onsite research library. This plan is outlined in a New York Times article from last week that further explains some of the reasons why Obama has chosen to break with precedent in this way, including significant fundraising requirements for the federal portion of the library and a bevy of issues across the existing presidential libraries.

So why am I writing this post? Unlike (I suspect) most of the readers of the Times article, when I saw a link to NARA’s MOU announcement I clicked through and was eager to see what it contained. What I discovered was a ten-page document that gave the outline of a plan to solicit bids from vendors to conduct this digitization work in the upcoming months and years. The general plan seems to make sense. NARA will retain custody of the records and has not changed its perspective on the Presidential Records Act. This plan may save money because NARA will not be responsible for maintaining another large facility in a new location. NARA will have to approve the vendor before the start of digitization work. However, I was left feeling skeptical due to my experience studying public-private partnerships in archives as well as my knowledge of NARA’s track record with electronic records.

My dissertation examined digitization agreements between state archives in the US and organizations like Ancestry.com. I’ll keep it brief, but one key finding from my project was that the negotiation and contract-writing phases of these projects are extraordinarily important because they govern not just the digitization work, but the future of access to state records for years to come. When I read the Obama-NARA MOU, I was looking for some specifics to reassure me that the bids coming in to digitize these 30 million or so pages of paper records would conform to some standards and best practices in the digitization and digital preservation communities. Unfortunately, the MOU contained no such comfort. The bids will all need to reference metadata specifications, technical details, and information security measures, but nothing in the MOU referred to NARA’s Digitization Guidelines from 2004 or made it clear what other standards compliance would be required from a winning bid. The Obama Foundation is going to select a vendor and seek approval from NARA for their choice, based on these agreed-upon criteria. I read this MOU as full of good intentions but short on the sorts of details that would convince me of the likelihood that this plan will be executed on time and on budget.

Another important thing I learned while working on my dissertation was the importance of enforcement and public sector vigilance in these partnerships. In order to produce a positive outcome, the representatives of the public sector need to do everything in their power to write a strong contract and then enforce its terms. The public sector needs to be asking the right questions in order to ensure that they are protecting the rights of the American people, in whose name these records were produced in the first place. For example, will the vendor produce the minimum required metadata to comply with NARA standards or strive to generate more? What types of access are expected with the proposed system and will that enable enough people to find and engage with the records they want to read from the Obama Presidency? What will the access system for these records actually look like? I’m concerned that running this digitization process gives the Obama Foundation too much control over the former president’s records during a critical phase of their lifecycle, which is the reason why the Presidential Records Act exists. The foundation is preparing the RFP, which will hopefully include more specifics and the answers to the above questions plus more.

One thing I emphasize to my students is that digital preservation is fundamentally about trust. When we create digital surrogates of paper records, or when we provide access to large collections of born-digital materials, we as archivists ask our users to trust what we’re serving them over the internet. Dissertations, family histories, court cases, and other work by users are based on these digital documents and we must do everything we can to build their trust in our holdings. What I’ve seen in this MOU does not provide enough detail to indicate whether or not any vendor selected to digitize Obama’s records will be held to a high enough standard to build trust into the access system for these presidential records, which will not have a permanent home with a public reading room. This makes the building of that trust all the more critical. For now, I take the good intentions of NARA and the foundation at face value and look forward to picking apart the RFP when it is released.

Reading through this agreement and the plethora of #hottakes that have accompanied it, I will play the role of a consummate academic and keep asking questions while observing closely. I agree that the presidential libraries model was/is in serious trouble, but I am not yet convinced that this new path forward will yield better results for the nation and for users of archival records. There has already been a great deal of digital ink spilled on this topic. For more, here are some links to resources I’ve been reading as I formulated my thoughts on this situation. While I’d like to remain optimistic, I am waiting for more detail before I feel confident that Obama’s declassified records will be made available online once the five-year window following his presidency has elapsed.


07/2/18

Kazakhstan

Earlier this month I visited Kazakhstan to give a series of lectures at the Presidential Archives as part of the Second Annual Summer School for Young Archivists, as well as in Shymkent at the Regional Archives. I had an amazing trip and really enjoyed meeting new colleagues, sharing my experiences working here in the US, and traveling around a new country and region of the world I’d not been to before. Here are links to my slides from each of the presentations I made:

The author holding a falcon, wearing Kazakh robe and fur hat.

Mausoleum of Khoja Ahmed Yasawi

I want to extend many thanks to my hosts from the US Embassy in Kazakhstan, especially Elina Akhtiyarova for inviting me on this trip and coordinating everything. Thanks are also due to Boris Japarov, the Director of the Kazakhstan Presidential Archives, and Bastar Eskarayev, the Head of Archives in the South Kazakhstan Region, who both hosted me at their institutions. Thank you so much! Рақмет!

03/26/18

Brief Thoughts and My Remarks from #EAW18

Last week I had the pleasure of attending the National Forum on Ethics and Archiving the Web, held in New York City. The event itself ran from Thursday through Saturday and featured a highly diverse and outstanding lineup of speakers from the worlds of libraries, archives, digital humanities, journalism, media studies, art, ethnic studies and more. I didn’t do the best job of taking my own notes but highlights included Safiya Noble’s keynote on her work related to her recent book Algorithms of Oppression.

I primarily wanted to use this post to share the remarks I made as part of my Thursday panel, “Web Archiving as Civic Duty.” I was honored to share the stage with four other people (sadly, Stacy Wood’s travel was delayed and she just missed the panel due to a freak late season snowstorm) to speak about some of my recent work and thoughts on social media and digital preservation for government materials. It’s mostly unedited although I did go off-script a bit. Let me know your thoughts on the phrase “Tweets May Be Archived” and be sure to check out the paper, co-authored by Amelia Acker and myself, that inspired this talk here: https://doi.org/10.1002/pra2.2017.14505401001. Happy reading!


My inspiration for this talk comes from a phrase I’ve thought a lot about for the past year or so: “tweets may be archived.” When President Obama controlled the @POTUS Twitter account, the bio included this line: “Tweets may be archived.” Now, the @POTUS account specifies “Tweets archived” and links to a White House privacy policy at https://wh.gov/privacy indicating that they “may” collect mentions and other interactions with official government accounts, but provides little in the way of technical details.

I’m here to say that “tweets may be archived” is not good enough when it comes to preserving and providing access to social media data created by public sector organizations. These posts, and the interactions with them by citizens, bots, and other platform users are critical elements for maintaining understandability of these materials over time.

The public needs to understand how platforms like Twitter and Facebook shape the data they make available to all users, not just the federal government. This is perhaps even more urgent following this week’s reports about the ways in which Cambridge Analytica used the Facebook Graph API to mine and utilize vast amounts of user data. Transparency around APIs, interaction data, and preservation is also critically important for the federal government entities that produce federal records on these sites.

Many social media records created by elected officials and US federal government agencies are considered federal records in light of a 2013 NARA Bulletin on Social Media, as well as the Presidential Records Act. As such, these posts, tweets, and updates are required to be maintained over the long term by the public sector. It is not enough to suggest that we may or may not be able to easily collect interaction data from social media platforms. The design and infrastructure of these platforms resist the types of digital preservation workflows developed around documents, images, scientific data, and other digital objects.

To put this in OAIS terms, if you don’t know what is in your SIP, how can you plan to describe, preserve, and provide access to the information it contains over time?
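As a minimal sketch of what "knowing what is in your SIP" could look like at the most basic level, the Python below walks a downloaded data package and records each file's relative path, size, and SHA-256 checksum. This is an illustration only, not the workflow used by NARA or in the research discussed here; the function name `inventory_package` and the directory layout are hypothetical.

```python
import hashlib
from pathlib import Path


def inventory_package(package_dir):
    """List every file in a package with its relative path, size, and SHA-256 digest."""
    root = Path(package_dir)
    entries = []
    # Sort so the inventory is stable from run to run.
    for path in sorted(root.rglob("*")):
        if path.is_file():
            data = path.read_bytes()  # fine for a sketch; stream large files in practice
            entries.append({
                "path": path.relative_to(root).as_posix(),
                "bytes": len(data),
                "sha256": hashlib.sha256(data).hexdigest(),
            })
    return entries
```

An inventory like this only says which files arrived and lets you check later that they are unchanged. It says nothing about what the fields inside those files mean, which is exactly the contextual gap at issue.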

In some recent research I conducted with my colleague Amelia Acker, we found that the Facebook and Twitter data files from the Obama Administration’s digital transition team were rich and interesting but also confusing, sometimes incomplete, and not easily usable by researchers or other interested users. There was no contextual information corresponding to, say, the addition of new features to the platforms such as automatic retweets on Twitter or the introduction of Life Events on Facebook. For example, President Obama used the Life Event feature to mark the killing of Osama bin Laden soon after it was introduced to Facebook. Without contextual information, there’s no way to know why Life Events suddenly appear on the profile in 2011, or to understand what the Obama administration’s use of this feature suggests about their savviness when it came to social media.

Essentially, the data packages provided to the administration by Facebook, Twitter, and other platforms were the same as they would be for any user seeking to download their data, and we know they do not necessarily treat their users all that well! Who in the audience has downloaded their Facebook or Twitter data?

Furthermore, the interaction data (those retweets, comments, and responses) were not included in the data package. It turns out that social media platforms treat federal records like those of any other user, which is perhaps minimally acceptable but not good enough for preservation purposes. Instead of mentioning that engagement with federal entities on social media “may” be archived, social media preservation needs more transparency around the archival capabilities of platforms, changes to features and apps, and other metadata which will increase their legibility over time.

How can we hope to ever build systems that provide meaningful comparison across platforms if we don’t even have a good sense of what a profile data file from Facebook or Twitter contains? Federal social media records document the behavior of the government online, and the engagements with these posts from citizens, people around the world, and bots represent a critical element of these records, one that provides a fuller picture of platform activity and has value in its own right. The digital preservation and curation community cannot rely on the private sector to produce records that are useful outside of the platforms for which they were designed because private companies have limited incentive to act this way. Additional approaches are needed to ensure that public sector information created on private platforms can remain accessible beyond the lifespan of any one platform. Today’s Twitter could be tomorrow’s MySpace, but federal records require thinking on a much longer timescale. “Tweets may be archived” is simply not good enough.

12/19/17

End of Year Update

Earlier this week I looked at this website and realized I had not posted anything on here since January! That’s far too long to go without any updates, so here are a few highlights of what I’ve been up to at the University of Maryland iSchool this year…

I have continued my research and work at the USDA National Agricultural Library, along with an additional project on the preservation of social media data from the Barack Obama presidential administration. Here are a few of the peer-reviewed papers I authored along with various colleagues that were published during the past few months:

  • Kahn, E., Arbuckle, P., Kriesberg, A. (2017) Challenge Paper: Challenges to Sharing Data and Models for Life Cycle Assessment. Journal of Data and Information Quality. 9(1), https://doi.org/10.1145/3106236
  • Kriesberg, A., Huller, K., Punzalan, R., Parr, C. (2017) An Analysis of Federal Policy on Public Access to Scientific Research Data. Data Science Journal. 16, p.27. DOI: http://doi.org/10.5334/dsj-2017-027
  • Acker, A., & Kriesberg, A. (2017). Tweets May Be Archived: Civic Engagement, Digital Preservation and Obama White House Social Media Data. Proceedings of the Association for Information Science and Technology, 54(1), 1–9. https://doi.org/10.1002/pra2.2017.14505401001

I’ve taught a total of three courses in the MLIS program to iSchool Master’s students: INST 641: Policy and Ethics in Digital Curation, INST 643: Curation in Cultural Institutions, and INST 647: Management of Electronic Records and Information. I’ve included links to syllabi in the previous sentence to give an idea of the types of topics covered in these courses. My teaching experiences have been mostly very positive. I’ve enjoyed working with the students here in College Park and helping them develop the skills and knowledge to become successful archivists, librarians, and information professionals.

Beyond that, this year I traveled to Barcelona for the 9th RDA Plenary meeting, attended a workshop on the Impact of Digital Repositories, attended AERI in Toronto, and helped facilitate workshops here at the UMD Libraries on Wikipedia and steps to protect Endangered Data. It’s been quite a year!

I’d like to wish all my readers, colleagues, reviewers (even you, reviewer #2), family and friends a happy holiday season and new year. I’ll leave you with a seasonally appropriate GIF from the National Archives and a nod to next year’s Winter Olympics.


01/26/17

One Librarian, One Reference

Happy New Year! I still get to say that through the month of January. It’s been a while but I’m back to let you, my loyal reader, know that I am going to participate in an exciting event next week, Tuesday 1/31, at McKeldin Library on the UMD campus. We are hosting a Wikipedia Library #1lib1ref event, a mini edit-a-thon of sorts where librarians come together around the world to add references and citations to Wikipedia. This initiative is sponsored by the Wikipedia Library, with the goal of improving Wikipedia through connecting editors with librarians and reference resources.

An owl standing on a book

Longtime readers of this blog will know that I am a big proponent of Wikipedia, having edited and participated in public events in the past. I am very excited to meet fellow Wikipedians at UMD and perhaps convince some folks from the libraries and iSchool to get more involved with editing!

09/12/16

Fall Update: Teaching and Conferencing this Week

Greetings, dear readers. It’s been a while but I have been doing a lot of different things this summer! Now that the semester has started I can share the syllabus for the course I am teaching. It is my first time being 100% in charge of my own class and so far (two weeks in) I am really enjoying it. The course is INST643: Curation in Cultural Institutions. Here is a link to the syllabus. I put in a lot of work designing this course, so let me know what you think in the comments!

In unrelated news, I will be traveling to Denver, CO this week for the 8th Plenary Meeting of the Research Data Alliance. This will be my first RDA meeting, and it comes after I was awarded an RDA/US Data Share Fellowship this summer. For this fellowship, I am studying the use of controlled vocabularies in agricultural information access systems. I am super excited to see old colleagues and make new ones at this conference. Look for me in the poster hall Thursday, Friday, and Saturday.

07/4/16

Web Archiving #Brexit

Like many of us around the world, I’ve been following the news out of the United Kingdom after the country voted to leave the EU late last month. In the aftermath of the vote, many Britons were shocked to discover that some of the “Leave” campaign’s promises related to the money paid to the EU by the United Kingdom were not going to come to fruition. These broken promises are documented across the web, but this succinct Boing Boing post highlights the attempts by these politicians to erase their old campaign website from the internet. Thanks to the Internet Archive, the site lives on as an archived copy, documenting the change to the website which removed content relating to increasing funding for the National Health Service, among other social programs. I was struck by the power of web archiving to document political movements as they are represented online, and by how it makes it more difficult for politicians to eliminate potentially embarrassing content from the internet.

This article reminded me of another excellent example of the power of web archiving, from the New Yorker article “The Cobweb” by Jill Lepore. She explained that the Internet Archive also preserved a copy of a website maintained by Ukrainian separatists which appears to show that this group was responsible for downing the Malaysia Airlines flight which went down over Ukraine on July 17, 2014. Why was this particular site crawled by Internet Archive bots? Well, because:

Anatol Shmelev, the curator of the Russia and Eurasia collection at the Hoover Institution, at Stanford, had submitted to the Internet Archive, a nonprofit library in California, a list of Ukrainian and Russian Web sites and blogs that ought to be recorded as part of the archive’s Ukraine Conflict collection.

I recognize that these two events are not particularly related, other than the fact that web archiving figures in our attempts to understand current events and monitor how people represent themselves and their politics online. As more of our collective lives as humans are lived out in digital spaces, resources like the Internet Archive will only become more valuable as a way of piecing the past together. If you haven’t explored the Wayback Machine, give it a shot! I guarantee you’ll find some really interesting/fun/terrible/amazing old websites on there. Just punch in a few domains and have fun…
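For readers who would rather punch in domains from a script than from the search box, here is a minimal Python sketch. It assumes the Internet Archive's public availability endpoint at `https://archive.org/wayback/available` and the JSON shape it returns as I understand it (an `archived_snapshots` object holding a `closest` capture). The helper names `build_query`, `closest_snapshot`, and `lookup` are my own, and `example.com` in the usage note is a stand-in domain.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed public "availability" endpoint of the Wayback Machine.
WAYBACK_API = "https://archive.org/wayback/available"


def build_query(domain, timestamp=None):
    """Return the availability URL for a domain and an optional YYYYMMDD timestamp."""
    params = {"url": domain}
    if timestamp:
        params["timestamp"] = timestamp
    return WAYBACK_API + "?" + urlencode(params)


def closest_snapshot(payload):
    """Pull the closest capture's URL out of a decoded response, or return None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def lookup(domain, timestamp=None):
    """Ask for the capture nearest to the timestamp (requires network access)."""
    with urlopen(build_query(domain, timestamp), timeout=30) as response:
        return closest_snapshot(json.load(response))
```

Calling `lookup("example.com", "20160624")` should return the address of the capture closest to that date, or `None` if the site was never crawled. Keeping the URL building and response parsing in their own functions means they can be checked without touching the network.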

(P.S. Jill Lepore is the best. Her first book, The Name of War: King Philip’s War and the Origins of American Identity, was a major inspiration for my senior honors thesis in History. Read it! Or at least read more of her articles in the New Yorker. They are awesome.)

03/3/16

Amsterdam and IDCC

Last week, I traveled to Amsterdam to attend and present at the International Digital Curation Conference. I wrote a post about the conference here on the Archives Lab site but I wanted to add a more personal touch here. Amsterdam was a beautiful city which I was happy to explore in between conference events.

Being me, I had to find an archive or library to slip into. I ended up popping in at the Stadsarchief, Amsterdam’s City Archives. It is a beautiful building which houses a few exhibition spaces as well as information about the UNESCO World Heritage sites in the area, including the entire city canal ring. The lower exhibition includes some of the city’s founding documents, including the charter. It was a real treat!

Stadsarchief, Amsterdam, NL

Stadsarchief, Amsterdam, NL

As always, I was inspired by the conference and am excited to attend IDCC again in the future. Thanks to everyone who stopped by my poster. Here’s a picture of it, via Twitter, and a link to it via the conference website.

02/21/16

Upcoming Conference: IDCC 16

I will be presenting a poster entitled “Agricultural Data Curation: Examples from a National Library” at the International Digital Curation Conference this week. This is the first time I will be publicly sharing the work I’ve been doing as part of my Post-doc, and I’m very excited! As you might be able to guess from the title, this poster presents initial results from my work with the Knowledge Services Division at the National Agricultural Library. We highlight the role collaboration plays in the four primary projects currently ongoing at the division.

Are you going to be in Amsterdam for IDCC? Let me know! I look forward to seeing old colleagues and meeting new ones.

01/26/16

Blizzard Movie Night Yields Unexpected Archivist

I am close to digging out from the historic blizzard which has blanketed the Washington DC region with 2 feet (maybe?) of snow. Since Thursday evening, I have spent a lot of time in my apartment and, on a whim, decided to watch Enough Said, starring James Gandolfini and Julia Louis-Dreyfus. The movie interested me because it was Gandolfini’s last; little did I know the surprise in store as the plot unfolded.

Gandolfini plays Albert, a recent divorcee and DIGITAL ARCHIVIST who works at a place called the “American Library of Cultural History” which houses a significant collection of television films. I’ll avoid spoilers that do not involve archives: Albert oversees digitization and creates metadata for archival episodes of television. What’s more, there is a scene in the closed stacks of the library, complete with a stolen kiss amongst the Hollinger boxes! The rest of the movie was great as well and is recommended for archivists, librarians, curators, and everyone else too :-). It was very well-acted and definitely worth a watch.

While doing some post-film googling, I discovered this post from an excellent site called reel-librarians about Enough Said as well. Add it to the blogroll!