Bryan Benilous

Why We Digitize?

We live in such transformative times. When I was in grad school in the 1990s, I was still doing most of my research in analog formats using print, microfilm and even card catalogs. Leading massive digitization programs for companies like ProQuest and East View, I experienced and participated in a digital transformation of historical research that has created a new area of study, the Digital Humanities. 

Over the first quarter of this century, historians have fully embraced new technologies. They gained new skills and collaborated with computer scientists to explore new frontiers in “Big Data.” This work has produced amazing new research and predictive analytics, and has built a massive corpus of digital content now being used for machine learning and AI.

Content that is not yet digitized, discoverable, and accessible no longer exists in a usable format for research. Therefore, everything must be digitized, and eventually will be.

It bears repeating that none of this is possible without content being collected and digitized in the first place. Billions of pages of historical content have been digitized to date: hundreds of millions of pages of newspapers; practically every single dissertation written in the past 100 years; millions of books dating back to those first printed, like the Gutenberg Bible. Documents, ephemera, recordings, maps, born-digital content: so much has already been digitized, but it is just the tip of the iceberg, with billions more pages to go, and new technologies are making the process less expensive while improving quality.

Historic digitization programs, run collaboratively by libraries, governments, for-profit publishers and not-for-profit organizations, have focused on grandiose efforts to digitize the holdings of the libraries of the Western world. They have made great strides, but have come to realize that institutional bias in historic collections skewed these massive datasets, underrepresenting minority and marginalized communities. Maybe this is why AI chatbots have deep flaws related to racism and antisemitism, and often fail at tasks aligned to historical research.

Both librarians and academic researchers see this and are eager to address the challenge. It is made more difficult because collection decisions made a century ago still shape what is readily available. Librarians need to engage with their local communities and fill big gaps in holdings. So much is already lost, and so much more will be lost if there is not a concerted effort to make the digitization process more equitable.

The institutions that hold this content are not as well endowed and lack the resources to adequately preserve, let alone digitize, their holdings. The best way to support these institutions is to collaborate: pool funding and resources, and provide peer support.

Living in South Florida, it is all too clear that we are racing the clock until the next massive natural disaster. Hurricanes are getting bigger, and researchers have proposed adding a sixth category to the intensity scale. Wildfires are spreading in the West. Tornadoes are decimating the interior. When (not if) the storm hits, historic content will be damaged and lost forever. Digital preservation must be a key area of focus for all institutions.

Paperboy was formed with the simple objective of helping libraries and archives digitize their collections. We aim to support Open Access collections, diversify content and collaborate to build a community of peer support.

Send an email if you would like to collaborate.

