“The Internet is forever.” So goes a saying about the impossibility of permanently removing material, such as stolen photographs, from the Web. Yet paradoxically the vast and growing digital sphere faces enormous losses. Google has been criticized for failing to ensure access to its archive of Usenet newsgroup postings, which stretch back to the early 1980s. And now Internet pioneer Vint Cerf has warned of a “digital dark age” that would result if decades of data (emails, photographs, website postings) become lost or unreadable.
Millions of paper records more than 500 years old exist today. But your entire family photo collection could be lost forever with just a single hard drive failure. Stone tablets, parchment, paper, and printed photographs have all lasted through the centuries. But some of our data may not. What do we do about preserving the digital deluge?
Cost Versus Value
Technical solutions already exist, but they’re not well known and are relatively expensive. How much are we prepared to pay to ensure that today’s digital stuff is usable in the future? Because if there’s a cost involved, we inevitably have to think about what is valuable enough to be worth keeping.
How can we calculate that value? As an example, the holdings of the U.K. Data Archive include machine-readable versions of all of the General Household Surveys (GHS) carried out between 1971 and 2011. This was a continuous national survey of people living in private households, conducted on an annual basis. The cost of the GHS in 2001 was reported as 1.43 million pounds (about $2.2 million), making the value of the survey and its data at least that. As it was the 30th year of this survey, the value could be said to be higher because it was part of a series, so we could say the survey was worth more than it cost.
The Office for National Statistics transferred the 2001 data to the U.K. Data Archive in 2002, where we prepared it for preservation and access and published it. To date, this survey data has been downloaded by 426 people working in government departments, 759 staff working in education, 1,331 students, and 109 others for various uses. So benefits accrue from making the data available even after its creators have exhausted its primary value: re-use is a significant benefit of preserving data and adds value.