Historians have a new toy to play with. The National Library of Wales has just launched Welsh Newspapers Online – a new digital archive that will eventually provide access to more than 1 million pages of Welsh newspapers.
If, like me, you missed the big launch event on Wednesday you can catch up with the Twitter stream (#papur) or read a copy of Jim Mussell’s brilliant seminar paper. It’s a bit too early to give the archive a full review. The site is still in beta (advanced search features and the ability to download articles have not yet been implemented) but it already promises to shake up the landscape of digital newspaper research.
At first glance, the arrival of yet another digital archive might not seem like such a momentous occasion. After all, we’ve become rather accustomed to these resources in recent years – the 19th Century British Library Newspaper Archive and the British Newspaper Archive already provide access to hundreds of British newspapers and have quickly become embedded within our everyday research practices. However, Welsh Newspapers Online has an important new string to its bow – it’s completely free to access.
Whilst the British Library has been forced to work with commercial partners to fund its digitization program, the Welsh Assembly has opted (with a bit of help from the EU) to digitize the country’s newspaper collections using public money. Similar projects have been developed elsewhere in the world – the United States has Chronicling America, New Zealand has Papers Past, and Australia has Trove – but this is the first major, open-access newspaper archive to arrive in the UK.
Why is this important? Firstly, it’s great to see the Welsh Assembly recognize the cultural value of newspaper archives and embrace this opportunity to open up their country’s history to the world. The price of a year’s subscription to the BNA (£80) has never been too prohibitive for dedicated researchers, but many more casual users will now enjoy the chance to explore historical newspapers without the encumbrance of a paywall. In particular, the archive will be usable in classrooms by pupils of all ages. Given Michael Gove’s commendable belief in the importance of history teaching, it’s surprising that his government hasn’t supported the development of a similarly accessible way to explore the whole of our ‘Island Story’™.
Open access also creates exciting new opportunities for academic research. The methodological possibilities of an archive are defined by the parameters of its interface: the searches we construct, the ways we filter data, and the forms in which results are displayed all shape the questions we can ask of an archive. By developing new interfaces we can ask new questions. Google’s Ngram Viewer, for example, allows us to explore data from the existing Google Books archive in a radically new way by graphing the frequency of words and phrases over time. Rather than use keyword searches to zoom in on specific fragments of text, it encourages us to ‘read’ the archive from a distance and measure trends across thousands of books and hundreds of years. Same data; new perspectives.
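To make the idea of reading ‘from a distance’ a little more concrete, here is a minimal Python sketch of the kind of calculation that sits behind a frequency-over-time graph. It is not Google’s code: the function name, the (year, text) input format, and the sample sentences are all invented for illustration, and a real tool would need far more careful tokenization and much larger samples.

```python
import re
from collections import Counter


def relative_frequency_by_year(documents, term):
    """Return {year: occurrences of `term` per 1,000 words}.

    `documents` is an iterable of (year, text) pairs. This input format is
    an assumption made for the sketch, not the format of any real archive.
    """
    hits = Counter()    # occurrences of the term in each year
    totals = Counter()  # total words seen in each year
    term = term.lower()
    for year, text in documents:
        # Crude tokenizer: lowercase runs of letters and apostrophes.
        words = re.findall(r"[a-z']+", text.lower())
        totals[year] += len(words)
        hits[year] += words.count(term)
    # Normalize by the total word count so that years with more surviving
    # text do not automatically appear to use the term more often.
    return {
        year: 1000 * hits[year] / totals[year]
        for year in sorted(totals)
        if totals[year]
    }


# Invented sample data, purely for illustration.
sample = [
    (1850, "The railway came to the valley"),
    (1850, "A chapel meeting"),
    (1900, "The railway strike and the railway company"),
]
for year, rate in relative_frequency_by_year(sample, "railway").items():
    print(year, round(rate, 1))
```

The important design choice is the normalization step: plotting raw counts would mostly show how much text survives from each year, whereas dividing by the total number of words gives a rate that can be compared across the whole period.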
These new ventures are only possible if archive developers allow their data to be accessed and re-purposed. The Old Bailey Online is a brilliant example of how an open platform encourages new kinds of research: it links to the spatially-focused Locating London’s Past, communicates with the people-centered Connected Histories, and allows its API to connect with text analysis tools like Voyant. If we want to develop even more new tools to explore this archive, we need only secure the funding and expertise to do so. The Datamining With Criminal Intent project, for example, is currently combining three open-access research tools (The Old Bailey, Zotero, and TAPoR) with funding from the Digging into Data initiative. This kind of openness and connectability is vital to developing new digital humanities research tools.
Unfortunately, the paywalls that currently sit in front of most British newspaper archives have limited the development of new interfaces. Projects like Mining the Dispatch have shown how digital humanities tools can be used to explore open-access newspaper archives in fascinating new ways, but applying these techniques to commercial archives is, at best, laborious. I remain hopeful that commercial publishers like Gale and Brightsolid might be persuaded to open their data to this kind of research (provided it is done in a way that protects their commercial interests), but these kinds of negotiations, and the compromises they require, naturally restrict the viability, scale, and accessibility of new projects.
Welsh Newspapers Online promises to change all of this. In its own words, it is “committed to sharing the data behind [the project]” and has promised to provide access to the website’s APIs soon. This is to be commended in the strongest possible terms. It’s hard to predict what new tools we’ll build to explore the WNO, but its commitment to openness is an invitation to our imaginations; a chance to start thinking about how we might explore press archives in innovative new ways.
The archive will eventually contain a wide range of Welsh newspapers: dailies and weeklies, conservative and liberal, English-language and Welsh-language. Each region will have several titles, inviting the possibility of some interesting comparative studies, and investigations into the relationships between rival papers. By the summer of 2013 more than 100 titles will be available, spanning the period 1804-1910. Some of these are relatively short runs (the Rhos Herald, for example, will be covered between 1909 and 1910), but others stretch across several decades (the Caernarvon and Denbigh Herald runs between 1836 and 1910). The decision to extend the archive beyond the customary ‘digital divide’ of 1900 will be welcomed by historians of the Edwardian period, but those interested in the rest of the 20th century remain thwarted by copyright. Interestingly, the archive also contains multiple editions of some of the papers. For example, the same story can be found in the 1st, 3rd, ‘special’, ‘extra special’, and 5th editions of the Evening Express.
This is a refreshing approach to digitization (which has hitherto privileged one edition over all others) and promises to open up some interesting avenues of research. However, it does clutter up the search results page a bit. A ‘show multiple editions’ toggle would be a welcome addition to the interface.
It’s not yet clear how many gaps will be present in these runs. The small print warns us that “the digital collection reflects the physical holdings of the National Library of Wales and not all newspaper issues for the years specified will be available.” The extent of these gaps will go a long way towards determining the usefulness of the archive, so let’s hope that they’re not too extensive.
It’s too early to pass a definitive judgement on the quality of the archive’s interface. Key features have yet to be implemented and it always takes a few days of usage for the problems with an interface to become apparent. The early signs are fairly good. Hit-term highlighting is in place already. Raw OCR data (which looks pretty clean to me) is displayed by default and can be copied and pasted easily by the user. Pages are viewed in a similar fashion to the British Newspaper Archive – a smooth, dynamic interface that lets you zoom in and out of a page and pan across it with ease. The downside of these modern interfaces is that users can’t download or copy images directly from the site. A PDF downloading feature will be added in the future, but these systems (which have always been rather clunky in the past) are never as easy to use as a simple right-click+copy. I’ll be sticking to my snipping tool. The interface also has a handy button that extends the viewer across the width of your browser without going into a full-screen mode that prevents you from accessing other programs. It lacks some of the tools available in Gale’s feature-heavy NewsVault, but its simplicity and speed place it among the best newspaper-viewing interfaces I’ve used.
What’s more, the open-access philosophy embraced by the archive makes it easy to share content. So, I can link you directly to a hit-term highlighted article featuring another version of the ‘You Kick the Bucket; We Do the Rest’ joke that I’ve already tracked around the world.
It’s too early to reach any final conclusions about Welsh Newspapers Online. The archive is incomplete and the interface is still missing key features like an advanced search option. We’ll learn much more about the project’s strengths and weaknesses over the coming months. However, its commitment to openness already marks it out as a welcome new player in the world of digital newspaper archives. The Welsh Assembly and the National Library of Wales should be congratulated on making this happen. Researchers, teachers, academics, and history enthusiasts from around the world will now come to Visit Wales in a new way: as historical tourists, wandering happily through open valleys of print.