Dec 25 2011

The Jokes of Christmas Past

This time last year I was trudging along a slushy pavement with a soggy copy of The Times in my hand. It was only Christmas Eve, but I’d been given an early present – an interview I did with one of the paper’s journalists had just been published. These were heady times. A few months earlier I’d given a paper at Yale University and written a well-received article for The Guardian. I was beginning to fancy myself as a bit of a media don. This was more than a touch premature – I haven’t got close to a newspaper, radio show, or TV documentary since. However, at the time, a glittering showbiz career was beckoning (if only in my own mind) and I was undeniably excited.

The whole process started a few weeks earlier. I met with Mike Addelman, the University of Manchester’s brilliant press officer, to talk about publicizing some of my research. I ran through a few possibilities, but when I mentioned my work on American jokes his eyes lit up. “This stuff will really sell”, he said. The only thing missing was a topical hook. It was at this point that I said something that  I’ve lived to regret: “American humour has had a huge influence on the development of British comedy”, I explained. “Everything from stand-up routines, to sitcoms, to Christmas cracker jokes owes a debt of gratitude to the first generation of Yankee jesters.” The reference to Christmas crackers was just an off-the-cuff remark (I needed a third genre of humour to finish my list), but it was like dropping a tinsel-wrapped hand-grenade into the conversation. It was December. Christmas was upon us. Here was our hook.

There was just one problem: it wasn’t strictly true. The more I thought about it, the clearer it became to me that the laboured puns in Christmas crackers have always been quintessentially British. Whilst some have undoubtedly drawn inspiration from the work of American humourists, this transatlantic influence is far less noticeable in cracker jokes than in most other forms of British comedy. However, the wheels were now in motion and no amount of anxious hand-wringing on my part seemed to slow things down. In retrospect, I should have just said: “Mike, I was wrong, we can’t run with the Christmas cracker angle.” Instead, we wrote a press release.

It’s still available online. As you can see from the headline, I really managed to downplay the festive angle: “Victorians went ‘crackers’ for American Jokes”. It’s cheesy, but I still rather like it. Other references to Christmas pop up regularly throughout the piece. Fortunately, I’d managed to find adverts describing American jokebooks as ideal Christmas presents. This put me on slightly more solid ground and, eventually, I was happy to send the press release out into the ether. However, before we gave it to the media, Mike contacted a friend at The Times and offered him exclusive rights to the story. To my surprise, they accepted and told me that they’d interview me over the phone later in the week.

I spent the next few days obsessively making sure that my mobile phone signal didn’t drop below two bars. When Russell Jenkins, the journalist from The Times, finally called me we spoke for about half an hour. It was an enjoyable interview. I spoke a bit too quickly, but managed to get most of my key points across in a clear fashion. I did my best to gloss over the cracker angle and play up the joke books. Later that day I sent a few more quotes over e-mail. Here’s what I wrote about Christmas:

These jests were particularly popular near Christmas – a time that was sometimes referred to by the Victorians as ‘joke-season’. Books of American humour were regularly advertised in the press as perfect Christmas presents. It isn’t hard to imagine some of these jests turning up in Christmas crackers.

Looking back, it was a weaselly piece of backtracking. STOP THE PRESSES: historian from Manchester discovers that it “isn’t hard to imagine” something interesting might have happened. Here’s the headline they actually went for…

When I first saw it my heart sank. It was everything I’d feared. Not only had they picked up the Christmas cracker angle, they’d decided to ramp things up to a whole new level. Rather than claim that America influenced the writing of British cracker jokes, I now seemed to be blaming the Yanks for the whole sorry genre. Two days later, they even hammered the point home in an editorial piece:

 

I don’t want to accuse a Murdoch paper of being economical with the truth – it’s Christmas, after all, and even Satan deserves a day off. Nor do I want to cast aspersions on Russell Jenkins – he’s a nice chap, a very good journalist, and he didn’t write anything that I didn’t feed to him. No, dear reader, the real villain of this story is me. In my rush to transform historical research into  news I allowed the story to slip out of my control. As a result, rather than feel pride at seeing my research in a major national newspaper, I felt like a dirty sellout; a media whore who dropped his academic integrity the moment fame came knocking. Merry Christmas to me.

Reading it again a year later I think it’s possible that I overreacted. It’s really not that bad. When you look beyond the headline, the rest of the article captures the spirit of my research fairly faithfully. Aside from a few cringeworthy quotes (did I really say that “the Victorians enjoyed a laugh”?), there is little here to be embarrassed about. Nevertheless, the whole experience was a valuable crash course in dealing with the media. As academics, we exercise a great deal of control over the presentation of our ideas – our books, articles, conference papers, and even lectures are carefully scripted. However, the moment I started a conversation with a journalist, this control and security evaporated. Suddenly, I was thinking aloud – and doing it to a man with a notepad. It’s a dangerous situation. Who knows what idiotic, unsubstantiated, or career-ending statements I might have made if that phone call had lasted another 20 minutes?

Enough self-flagellation. It’s Christmas Day, so let’s finish with an uplifting selection of festive Victorian jokes. Merry Christmas and thanks to all of you for supporting the blog in its first month. I’ll see you all in 2012… if you survive these terrible gags.

 

Victorian Christmas Jokes

Mrs. Henry Peck (whose mother has been visiting them for over four months): “I don’t know what to buy mother for a Christmas present. Do you?”
Mr. Henry Peck: “Yes! Buy her a travelling bag!”

—-

“Thomas, spell weather,” said a schoolmaster to one of his pupils. “W-i-e-a-t-h-i-o-u-r, weather.” “Well, Thomas, you may sit down,” said the teacher. “I think this is the worst spell of weather we have had since last Christmas.”

—-

“What do you think of the woman with a past?”
“At Christmas she is likely to be won by the man with a present.”

—-

Bagley: “Susan, I intend to invite a few friends to our Christmas dinner. There is De Baggs, Ponsonby, Jupkins –eh?”
Mrs B. “Genial fellows, no doubt. Invite them, William, by all means, and I’ll tell you what I’ll do. I will give Norah a holiday and roast the turkey myself.”
Bagley: “Oh, well, if you’re determined to spoil our enjoyment, I’ll dine at a hotel.”

—-

Jabbers: “Going to get married on the twenty-fifth? Well, you are a chump!”
Havers: “Why?”
“Because all your friends will make one gift do for both wedding and Christmas present.”
“Of course. But hereafter I can do the same with my anniversary and Christmas presents to my wife. See?”

—-

Bill the Burglar (on Christmas Eve): “Sure, Mike; I’m Sandy Claws. Lay down an’ cover up yer heads an go to sleep, or I’ll not leave a thing. See?”

—-

Rose: “I declare! I forgot to remove the price mark from the Christmas present I sent to Mamie.”
Nellie: “Well, she would know the price, anyhow.”

—-

Mr Taddles: “What was in that package which was stolen from you on your way home?”
Mrs Taddles: “If I must tell, it was a box of cigars I had bought for your Christmas gift. Are you sorry?”
Mr Taddles: “Yes, dear – very sorry – for the thief!”

 

 


Dec 19 2011

Smiling Victorians

Two years ago I taught on an undergraduate course which gave first-year students an introduction to Victorian Britain. In the opening seminar I divided my students into groups and asked them to define a ‘typical Victorian’. As I expected, they drew upon every cliché in the book: top hats, bonnets, monocles and waxed moustaches cropped up in every discussion. When I asked them to imagine their character’s surroundings, they immediately thought of gloomy workhouses, smoke-filled factories and crumbling Dickensian rookeries. Finally, I asked them to describe their character’s personality. All of them imagined the ‘typical Victorian’ as glum, joyless, or incapable of expressing any emotion at all. When I jokingly asked them to do their best impression of a Victorian they all stared back at me with expressions of disdainful indifference (which I decided not to interpret as genuine contempt).

These responses were not unexpected. For the best part of a century we’ve imagined the Victorians in these unflattering terms. Most people tend to think of them as old-fashioned, stuffy, pompous, cripplingly respectable, emotionally stunted, sexually repressed, and obsessive about manners and decorum. One of the most enduring (though probably apocryphal) images of the period is of Queen Victoria, dressed in black mourning clothes, stating, as if for the whole nation, that “we are not amused”. I’d always thought that I was above such generalizations. As a historian of nineteenth-century popular culture, I’ve made it my mission to prove that the Victorians were just like us – that they fell in love, laughed at each other’s jokes, enjoyed a good night out and didn’t spend their lives mired in perpetual gloom.

It turns out that I’m not as enlightened as I thought. A few days ago I stumbled across a remarkable Flickr collection of 1,700 nineteenth-century photographs and cartes de visite. I’d seen plenty of Victorian photos before, but these were different – the people in them were smiling. I was bowled over. Some of them looked just like old school friends. Others grinned like members of my family. For the first time in all of the years I’ve been studying them, the Victorians looked like real people. It was a delightfully unsettling experience. I realized that I was still carrying around all of the prejudices that I thought I’d cast off – the sight of a smiling Victorian still jarred with my deepest preconceptions about the period and its people.

It’s a powerful reminder of how our understanding of the past is mediated through the technologies, objects and texts that capture it. I’ve always pictured the 1920s in the flickering black and white of early cinema – I find it hard to imagine a scene from the period without jazz music playing in the background and a flapper in the corner dancing the Charleston. By the same token, our perceptions of the Victorian period are heavily influenced by its sepia-tinged photographs. For much of the twentieth century, these pictures would have acted as daily reminders of a half-forgotten world. Grim-looking grandfathers would glare down disapprovingly from the mantelpiece. It’s hardly surprising that the Victorians are remembered as stilted and joyless. Of course, the source of their gloomy expressions may well have been technical rather than cultural. Even at the end of the century, photography still required lengthy exposure times and the only way to prevent blurring was to keep absolutely still. It’s possible that the conventions of portraiture led people to strike stern, distinguished poses on purpose – but the more I look at Victorian photographs the more inclined I am to imagine a hidden smile waiting to break out.


Dec 12 2011

British Newspaper Archive – changes to the ‘fair usage’ cap.

When the British Newspaper Archive was launched a few weeks back a lot of researchers were frustrated to discover that the ‘unlimited’ subscription package actually had a ‘fair use’ cap of 1000 page views per month. When I e-mailed the archive’s customer service team about it they informed me that the archive was intended for ‘personal use’ only and that the cap was non-negotiable. Fortunately, they seem to have had a slight change of heart. The ‘fair usage’ section of the archive’s terms & conditions has now been updated to read:

Why do we have a fair usage policy for subscribers? Well, it is certainly not a way to penalise or hold back our customers from conducting their personal research.

We have this in place purely for the (very rare) cases where people might abuse the service, and it is designed to keep the price of subscriptions as low as possible for our customers.

You are permitted to view an average of 1000 pages per month (calculated over a 3 month period). If you get close to the limit, we’ll send you an email to warn you. We always contact users to establish the reason for abnormally heavy use of the site and if they’re just doing their own personal research, we obviously don’t penalise them.

We constantly review the limit, based on average usage of the site by all users. We will continue to keep an eye on this and make adjustments as necessary.

Many services today (such as broadband packages) have similar fair usage policies and they work in the same way as ours i.e they are designed to catch those who use the service excessively (which would drive up the price or reduce the quality of service for the majority of users).

We hope this explains things – Please contact Customer Support if you have any further questions

This looks like good news. The three-month average is definitely a welcome concession. It’s hard to interpret precisely what happens when you exceed the limit now – they seem to be suggesting that users will be contacted and exempted from the restrictions if they’re just using the archive for personal research. I’d still like to see how this works in practice before paying for an £80 subscription, but it looks like the problem has been resolved. Well done to all who complained about it and credit to the BNA for listening to our concerns.
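For anyone wondering how an averaged cap differs from a flat monthly one, here is a rough sketch in Python. The policy text doesn’t say how the average is calculated, so I’ve assumed a rolling window of the three most recent months; the function name and the sample figures are my own inventions, not anything published by the BNA.

```python
from collections import deque


def months_over_cap(monthly_views, cap=1000, window=3):
    """Return the 1-based months whose rolling average exceeds the cap.

    Assumption: the 'average of 1000 pages per month (calculated over a
    3 month period)' is a rolling mean of the last `window` months.
    """
    recent = deque(maxlen=window)  # oldest month drops out automatically
    flagged = []
    for month, views in enumerate(monthly_views, start=1):
        recent.append(views)
        # Only judge usage once a full window of months is available.
        if len(recent) == window and sum(recent) / window > cap:
            flagged.append(month)
    return flagged


# Made-up usage: one heavy month (2,200 views) surrounded by quiet ones.
print(months_over_cap([400, 2200, 300, 1500, 1400]))
```

Under these assumptions a single busy month doesn’t trip the limit on its own: the 2,200-view month is absorbed by the quiet months either side of it, and only a sustained run of heavy use pushes the average over 1,000.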

 


Dec 11 2011

BNA security problems – bad link to blame

If you clicked on any of the hotlinks in my review of the British Newspaper Archive you might have been taken to an address with “www1.” at the start. If you were also using IE or Firefox this might have resulted in your browser warning you about a security risk. It’s a false alarm; a minor glitch that stems from the addition of the “1” after “www”. The BNA have assured us that their website is completely secure and that the problem has now been resolved. I’ve fixed the links in my own review – if you’ve linked to the archive on your own blog it would be worth double-checking to make sure that the address is correct.

Thanks to Charles Robinson for alerting me to the problem.


Dec 5 2011

Hit-term Highlighting: a half-baked solution

In my recent review of The British Newspaper Archive I moaned about the fact that ‘hit-term highlighting’ was mysteriously absent from its interface. Unlike every other archive on the market, the BNA doesn’t highlight your search term on the article image. Here’s how it works in other databases:

In this example, I performed a keyword search for the term ‘Victorian’. One of the articles it returned was this lengthy piece from the Liverpool Mercury. It’s 5616 words long. Fortunately, thanks to hit-term highlighting, I can just skip straight to the word shaded in green and read the part of the article that I’m interested in. A similar search on the BNA would require me to carefully read a column and a half of text in order to find the word I searched for. This really slows down the research process when you’ve got 500 articles to analyse.

With any luck, brightsolid will address this problem with an update to the BNA’s interface. This might take a while – in the meantime, there’s a temporary solution to the problem that should save us all a bit of time:

Step 1: Perform a normal keyword search.

Step 2: Open up an article.

Step 3: Click the ‘Show Article Text’ button at the top of the left hand menu. This reveals the raw OCR text sitting beneath your chosen article.

Step 4: Open your web browser’s ‘find’ tool. The quickest way to do this is to press ‘Ctrl+F’.

Step 5: Type your keyword into the ‘find’ tool. This should highlight all instances of that word which appear on the page – including the place it appears in the raw OCR.

Step 6: Find and click your keyword in the raw OCR.

Step 7: This should place a thin black box around a line of the article image. Within this box, you’ll find your keyword.
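If you’d rather copy the raw OCR text out of the page and search it yourself, the same trick can be sketched in a few lines of Python. This is only an illustration of what the browser’s ‘find’ tool is doing in Steps 4 to 6; the function name and the sample lines of OCR are made up and have nothing to do with the BNA’s own code.

```python
def find_hits(ocr_lines, keyword):
    """Return (line_number, text) pairs for OCR lines containing the keyword.

    Matching is a plain case-insensitive substring test, much like Ctrl+F.
    """
    needle = keyword.lower()
    return [
        (number, line)
        for number, line in enumerate(ocr_lines, start=1)
        if needle in line.lower()
    ]


# Invented OCR text; the last line imitates a typical recognition error.
sample = [
    "THE NIGHT EXPRESS.",
    "The sleeper was roused by the guard at Crewe,",
    "and complained that the carriage was cold.",
    "A second sleeoer remained undisturbed.",
]

for number, line in find_hits(sample, "sleeper"):
    print(number, line)
```

Note the limitation it shares with the browser method: a keyword that the OCR software has garbled (like ‘sleeoer’ in the last sample line) won’t be found, so it’s still worth skimming the article image when a hit seems to be missing.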

Here’s a video of me searching an article for the term ‘sleeper’:


Dec 5 2011

‘Jonathan’s Jokes: American Humour in the Late-Victorian Press’

My first academic article will be published in the next issue of Media History. It’s all about ‘American Humour’ columns and their role in shaping transatlantic relations during the late nineteenth century. For those of you who can’t wait to read it in print (hello?… is anybody still here?), an advance copy is now available on the journal’s website. Unfortunately, a subscription to Media History is required to view it – unless you’re mad enough to pay £21 to buy your own copy (in which case send the money directly to me and I’ll throw in a signed photograph). It’s going to be published as part of a special issue on ephemeral print culture which will include fantastic articles by Jim Mussell, Laurel Brake, Adrian Bingham, Pam Epstein (author of the brilliant advertisingforlove.com), and Karl Christian Führer. A perfect Christmas gift for the discerning historian-about-town.

Abstract:
During the final quarter of the nineteenth century, columns of American jokes became a regular feature of numerous British newspapers. The Newcastle Weekly Courant, for example, had a weekly column of ‘Yankee Snacks’; The North Wales Chronicle had ‘American Humour’; the Hampshire Telegraph its ‘Jonathan’s Jokes’; and the Northern Weekly Gazette sported a ‘Stars and Stripes’ column. Lloyd’s Weekly Newspaper introduced a regular column of ‘American Jokes’ in 1896, the same year it achieved an unprecedented circulation of one million readers. Almost half a century before Hollywood, here was a distinctively American form of popular culture which took Britain by storm. It has, however, received little academic attention. This article explores the development of the American humour column, considers the way in which it was consumed by British readers, and argues that these seemingly ephemeral jokes played a key role in shaping Victorian encounters with America.

 

 


Dec 4 2011

The Past Belongs to Brightsolid

On Friday night I had an illuminating Twitter conversation with Will Tattersdill (@faceometer) – a fellow researcher who shares some of my concerns about the new British Library Newspaper Archive. He pointed out an interesting passage in the archive’s terms and conditions:

What you can use the service for:

You can only use the website for your own personal non-commercial use e.g. to research newspaper archives and other archives featured on the website that you are interested in and to purchase goods that we may sell on the website. We are also happy for you to help out other people by telling them about the newspaper archives and other information available on the website and how and where they can be found. However, you must not provide them with copies of any of the newspapers (either an original image of the newspapers or the information on the results page), even if you provide them for free.

It’s easy to brush this off as a classic example of small-print gobbledegook – the kind of thing we all mindlessly agree to every time we’re forced to update iTunes. But the more I think about it, the more astonishing this passage seems to be. Are they really suggesting that we can’t show copies of their digital newspapers to other people? Even worse, are they suggesting that we can’t even share the information contained within them? It’s one thing to prevent people from making a profit from these materials, but to try to prohibit us from sharing the fruits of our research with friends, colleagues, and students is truly remarkable. Perhaps I’m jumping the gun here, but does this mean I can’t describe the results of a search in an academic article? Am I prohibited from displaying a newspaper page via PowerPoint in an undergraduate lecture? By posting a screenshot of a (barely legible) article in my review, have I broken their terms and conditions?

I’m not sure. But it’s prompted me to ask an important question: who really owns this material? Almost all of the newspapers in the  BNA are out of copyright and have been preserved by the British Library at the expense of the taxpayer. They belong to us, and we’re all free to copy and quote from them as much as we like. However, it seems that digitised newspapers are an entirely different story. When an out-of-copyright text is scanned, the resulting ‘digital object’ is subject to new copyright protection. More significantly, this copyright isn’t held by the original writers and publishers, but by the library or digitisation company that performs the scans. In legal terms, it seems that we’re not actually browsing the British Library’s newspaper archive but accessing brightsolid’s collection of digitised texts.

This might seem like a minor distinction, but it has important implications. The BNA is intended to replace Colindale as home of the nation’s historical newspaper collections. However, in order to fund this transition, the British Library has allowed a commercial publisher to assume ownership of the new archive’s contents. It’s up to this commercial company to determine how we access the archive and what we can do with its materials.  The past has been privatised. This is brightsolid’s world now – we just live in it.

Edit: a few additional thoughts in the comments.


Dec 1 2011

Review: The British Newspaper Archive

 

Christmas arrived early for historians this week. On Tuesday morning, amid a blaze of publicity, the British Library unveiled the new home of its digitised newspaper collection – The  British Newspaper Archive (BNA).

Developed in partnership with commercial publisher brightsolid, the BNA provides online access to hundreds of eighteenth, nineteenth and early-twentieth-century newspapers. It’s an ambitious, long-term project – more than 3 million pages have already been digitised and the library hopes to reach 40 million pages over the next decade. If the project is successful it’ll have important implications for both professional and amateur historians. In the next couple of years, the British Library intends to close its newspaper archive at Colindale and transfer its holdings to a remote storage facility in Boston Spa. Whilst hard copies of undigitised newspapers should still be accessible, it’s clear that the British Library wants more and more researchers to access its collections online.  The BNA, in other words, is the shape of things to come – and it’s vitally important that the British Library gets it right.

 

Content

The archive currently provides access to 170 newspapers. Many of these papers were available in the 19th Century British Library Newspapers database and have been transferred directly into the new archive. Only the Penny Illustrated Paper (which always seemed slightly out of place in the previous database) has been omitted from the new collection. Unfortunately, this means that gaps in the original archive are still a problem – the Northern Echo, for example, still has content missing from the crucial period between 1871 and 1872 when W. T. Stead first took over as editor. On the plus side, glitches from the previous database have been solved. The Preston Chronicle, for example, is no longer incorrectly listed as the Preston Guardian.

The real strength of the archive lies in its new content. 100 new newspapers are now accessible for the first time – almost all of them provincial papers. A full list of these papers is available here. Highlights from the new collection include long runs of the Bath Chronicle (1760-1903), the Chelmsford Chronicle (1783-1882), the Leeds Times (1833-1901), the Manchester Evening News (1870-1903), the Northampton Mercury (1770-1903), the Worcestershire Chronicle (1838-1903), and the Yorkshire Gazette (1819-1899). The new database is far less London-centric than previous offerings – most areas of the country are now represented by at least one paper, and major cities like Manchester, Liverpool, Birmingham, and Sheffield have multiple titles.

Whilst the majority of this new content focuses on the nineteenth century, some papers stretch deep into the eighteenth and twentieth centuries. Twelve titles include at least a decade of issues from the eighteenth century, including the Birmingham Gazette which goes all the way back to 1741. Whilst numerous papers cover the first three years of the twentieth century, only four titles stretch beyond the first decade: The Cheltenham Looker-On (1913), The Motherwell Times (1924), the Nottingham Evening Post (1944), and the Western Times (1940). It’s extremely encouraging to see the British Library push beyond the boundary of 1900 – let’s hope that this is the first step towards bridging the ‘digital divide’ which has recently sprung up between 19th and 20th century history.

Perhaps the most exciting thing about the new database is the promise of more content. Unlike previous databases, which were updated in bulk every year or so, the holdings of the BNA are constantly being expanded. 8,000 new pages are supposedly being uploaded to the website every day. Unfortunately, it’s not possible to quickly see which papers have recently been added or updated – this makes it difficult to keep abreast of the archive’s changing contents. Nor, for that matter, does the British Library give any hints about which papers will be digitised in the future. There’s no way of knowing whether a publication that’s critical to your research will appear in the archive tomorrow morning or in 10 years’ time. There’s something exciting about this, I suppose. As the website itself points out, “who knows what you’ll find tomorrow, next week, next year, and beyond”. However, I suspect we’ll have to rethink and shore up our methodologies in order to build research projects on constantly shifting sands – more on this in a future blog post.


Search Engine

A good search engine is crucial to the success of a digital archive – the methodological possibilities of these resources are determined primarily by the questions they allow us to ask. The BNA has most of the tools we’ve come to expect from newspaper databases. Users can perform a basic ‘Search’ by inputting keywords into a single search box, or they can construct more complex queries using the ‘Advanced Search’ page. The ‘Advanced’ interface allows users to put keywords into four boxes:

 

 

  1. The first option searches for articles which include all inputted keywords. So, putting “America, Twain, New York” into the search box will find all articles which include these three terms somewhere in the text. Articles which only include the words ‘America’ and ‘Twain’ will not be found. For those of you familiar with Boolean searches, this is basically the equivalent of using ‘AND’.
  2. The second option searches for articles which include any (but not necessarily all) of the inputted keywords. So, this time a search for “America, Twain, New York” will find every article in which at least one of these terms appears. This will return articles containing the word ‘America’ which don’t feature ‘Twain’ or ‘New York’. In Boolean terms, this is a straightforward ‘OR’ search.
  3. The third option allows users to exclude articles containing certain keywords. So, we might search for articles featuring the word ‘Twain’ which do not contain a reference to ‘America’. In Boolean terms, this is the equivalent of ‘NOT’.
  4. Finally, the fourth box allows users to search for a complete phrase. This returns articles which feature keywords in a particular order and is broadly equivalent to enclosing an ordinary keyword search in quotation marks.
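For readers who find code easier to follow than prose, the logic of the four boxes can be sketched in Python. This is my own minimal illustration of AND, OR, NOT and phrase matching, not the BNA’s actual search engine; the function names and sample sentences are hypothetical, and a real archive would search an index rather than scanning each article.

```python
import re


def tokens(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def contains(words, term):
    """True if the term (one word or several in order) occurs in the word list."""
    target = tokens(term)
    size = len(target)
    if size == 0:
        return False
    return any(words[i:i + size] == target for i in range(len(words) - size + 1))


def matches(text, all_of=(), any_of=(), none_of=(), phrase=None):
    """Apply the four boxes: all words (AND), any word (OR), excluded (NOT), phrase."""
    words = tokens(text)
    if not all(contains(words, term) for term in all_of):
        return False
    if any_of and not any(contains(words, term) for term in any_of):
        return False
    if any(contains(words, term) for term in none_of):
        return False
    if phrase and not contains(words, phrase):
        return False
    return True


# Two invented snippets standing in for newspaper articles.
new_york = "Mark Twain has returned to New York after his tour of America."
london = "Mr. Twain lectured in London last night."

terms = ["America", "Twain", "New York"]
print(matches(new_york, all_of=terms))                    # box 1: AND
print(matches(london, any_of=terms))                      # box 2: OR
print(matches(london, all_of=["Twain"], none_of=["America"]))  # box 3: NOT
print(matches(new_york, phrase="returned to New York"))   # box 4: phrase
```

The only design choice worth mentioning is that multi-word terms like ‘New York’ are treated as short phrases, so the first three boxes still behave sensibly when a keyword contains a space.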

These searches can then be filtered by:

  • Place of publication
  • Publication title
  • Date
  • Article type (Advertisement, Article, Family Notice, Illustrated, Miscellaneous)
  • Public tag – more on this later.

Search results can be ordered by either ‘relevance’ or ‘date’ – this makes a nice change from Gale databases which only display results in chronological order.

Once a search has been performed, results can be filtered again by date, title, region, country, place, article type, and ‘public tag’. Crucially, the public tag feature allows articles to be sorted by additional categories, including: classifieds, adverts, news, commerce, arts, sport, crime, etc. The accuracy of these tags (many of which seem to have been imported from the previous database) isn’t great, but they can be helpful when filtering out irrelevant articles.

 

All of this works fairly well – if anything, the search engine is faster and more user-friendly than in previous databases. Unfortunately, some functionality has been lost. Most importantly, ‘proximity operators’ are no longer available. In the previous database, a search for “Twain n10 America” returned all articles in which the words ‘Twain’ and ‘America’ appeared no more than 10 words apart. This was a tremendously useful way to filter out results in which keywords appeared too far apart – it saved a lot of time and opened up a range of interesting methodological possibilities. In my thesis, for example, I use proximity operators to track changes in the number of articles featuring the words ‘America’ and ‘Competition’ in close proximity. It would be tremendously useful if this essential tool was reinstated in the new archive.
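To show what has been lost, here is a rough Python sketch of a proximity check in the spirit of “Twain n10 America”. It is an assumption-laden illustration rather than the old database’s real implementation: I’ve taken ‘no more than 10 words apart’ to mean that the positions of the two words differ by at most 10, and the function name is my own.

```python
import re


def within_n_words(text, first, second, n=10):
    """True if `first` and `second` occur at most `n` word positions apart.

    The order of the two words doesn't matter in this sketch.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(
        abs(i - j) <= n
        for i in first_positions
        for j in second_positions
    )


# Invented snippet: 'Twain' and 'America' sit three word positions apart.
snippet = "Twain sailed for America yesterday."
print(within_n_words(snippet, "Twain", "America", n=10))
print(within_n_words(snippet, "Twain", "America", n=2))
```

Counting the articles that pass a check like this, year by year, is essentially how a proximity search can be used to track pairs of words such as ‘America’ and ‘Competition’ over time.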

As for more advanced search methodologies like datamining or ‘culturomics’ – the chances of seeing the necessary tools introduced into the new database are slim-to-none.

 

Interface

The BNA’s interface is a mixed bag. It includes some welcome new additions. Each search result is now accompanied by a snippet of scanned text which helps users to decide whether an article is relevant before opening it – this should save a lot of time when wading through thousands of hits. Similarly, articles are now displayed within the full newspaper page – this makes it possible to zoom out using your mouse’s scroll wheel and explore the rest of the page. This should please historians who have (quite rightly) been warning us about the danger of viewing articles in isolation.

Unfortunately, this is where the good news ends. The BNA’s interface suffers from at least two major problems:

  1. No hit-term highlighting.
    In previous databases, keywords would be highlighted in colour whenever you opened an article. This made it easy to quickly identify which parts of long articles you wanted to read. Every database since the Times Digital Archive has had this feature – it’s absolutely essential. Without hit-term highlighting, wading through a 2000 word article in search of a single keyword is a laborious chore. To do this 100 times in a day is infuriating and massively slows down the research process. I can’t even begin to fathom why the BNA doesn’t include it. Its absence is an inexcusable step backwards. If another element of the interface prevents the use of hit-term highlighting (such as the nice new zoomable images) then it needs to be unceremoniously scrapped. Right now.
  2. Saving articles.
    In the 19th Century British Library Newspapers database, downloading an article was as easy as right-clicking it and saving it to a relevant folder. It was quick, easy, flexible, and resulted in easily reusable jpg files. Now, articles can only be downloaded as full-page pdfs. If you want to paste an article into a Word document, slot it into a PowerPoint presentation, or upload it to Twitter, you’ll have to convert it back into a jpg. To make matters worse, the quality of these files is embarrassingly low – in fact, it’s virtually impossible to read them. Here’s a sample:

Fortunately, a solution is at hand: for the low, low price of £35.95 the good people at brightsolid will print out a high-quality version of the page and send it to you through the post. Alternatively, you might prefer to use the print-screen key or the ‘Snipping Tool’ included with recent versions of Windows and save a more readable version for free.

OCR

Ensuring the accuracy of optical character recognition (OCR) software has always been one of the biggest challenges facing newspaper digitisation projects. Even the best software produces patchy results – some articles are transcribed with 100% accuracy, whilst others end up a garbled mess. As a result, software companies have typically preferred to hide raw OCR text from users; if we knew how inconsistent it was, they worry, we’d lose all faith in their product. So, it’s refreshing to see that the BNA openly displays raw, uncorrected OCR text alongside articles. It might put some users off, but we end up with a much better feel for how accurate our searches are.

More impressively, the BNA allows users to correct OCR errors and improve the database for other users. The interface for this process works fairly well. Lines of OCR text are displayed for correction on the left, and a black box highlights the specific area of the article which needs to be transcribed. A red box might have been slightly easier to see amidst the newsprint, but perhaps I’m being picky. In truth, the fact that this idea has been implemented so effectively makes the absence of hit-term highlighting doubly perplexing.

 

It remains to be seen how many users will bother to make corrections. I’d like to see the process incentivised a bit more – perhaps we could earn credits (more on them shortly) for each article we correct? It’s also unclear how the BNA intends to moderate corrections and prevent people from defacing the archive. However, I don’t want to be too critical of what is undoubtedly a step in the right direction. Whilst this form of ‘crowdsourcing’ won’t deliver 100% accurate OCR across the whole database – it would take thousands of users correcting around the clock to keep up with the 8,000 new pages added each day – it’s certainly better than nothing.

In addition to OCR corrections, users can also ‘tag’ articles with their own descriptive keywords. If enough users take advantage of this feature it promises to be another tremendous innovation. I suspect it’ll be particularly useful for finding images.

 

Subscriptions

Finally, we reach the dreaded question: how much does all of this cost? It would be nice if the British Library followed the example of their colleagues in Australia and New Zealand and allowed us to explore the archive for free. Sadly, in order to cover the cost of digitisation, the British Library has had to turn the content over to a commercial publisher. Unlike their previous partner Gale (which caters primarily to the academic market), brightsolid has a background in targeting amateur genealogists with websites like findmypast.co.uk and 1911census.co.uk. As a result, the BNA is presently only available to individual subscribers. This renders it immediately unusable for teaching. JISC claim to be in negotiations with the British Library and brightsolid to provide institutional access to the database – until this happens, the BNA won’t be of any use in the classroom.

Three packages are currently available to individual subscribers:

  • 2 days (500 credits) – £6.95
  • 30 days (3000 credits) – £29.95
  • 12 months (unlimited access*) – £79.95

The ‘credit’ system is a bit complicated. It costs 5 credits to view an article published over 107 years ago in black and white, 10 credits to view similar articles in colour, and 15 credits to view articles published within the last 107 years. It’s fair to say, having bought the 2-day package to test the database out, that these credits don’t go very far. Browsing through one 20th-century issue of the Nottingham Evening Post wiped out a quarter of my credits in five minutes.
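
To put those numbers in perspective, here is a minimal Python sketch of the arithmetic, using only the prices quoted above. The names (`COST_PER_VIEW`, `PACKAGES`, `views_per_package`) are purely illustrative and have nothing to do with the BNA’s own systems; the sketch also assumes that credits are only ever spent on whole article views.

```python
# Credit cost per article view, as quoted in this review.
COST_PER_VIEW = {
    "over 107 years old, black and white": 5,
    "over 107 years old, colour": 10,
    "within the last 107 years": 15,
}

# Credit allowances of the two metered packages.
PACKAGES = {"2 days": 500, "30 days": 3000}


def views_per_package(credits: int) -> dict[str, int]:
    """Return how many whole views a credit allowance buys in each category."""
    return {category: credits // cost for category, cost in COST_PER_VIEW.items()}


for name, credits in PACKAGES.items():
    print(name, views_per_package(credits))
```

On these figures, a quarter of the 2-day package (125 credits) covers just eight views at the 15-credit rate, assuming the issue in question fell within the last 107 years.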

For serious researchers, the 12-month unlimited subscription is the only real option. At first glance, £80 seems fairly reasonable – I’d spend way more than that on a two-day research trip to Colindale. However, buried in the small print is a rather unpleasant surprise. If subscribers to the ‘unlimited’ package view more than 1000 pages in a calendar month, their account is frozen until the start of the next month. For some researchers, this cap will be perfectly tolerable. Unfortunately, as a press historian I’d expect to burn through at least 500 page views on a routine day of research. I’d be locked out of the database for 28 days of every month (save February, which has 28 days clear and 29 in a leap year). These quotas place an unacceptable restriction on research – I never want to be in a situation where my decision to read an article is determined not by its potential value to my research, but by the number of credits left in my account.
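
The lock-out figure follows from simple division. A rough sketch, assuming a cap of 1000 page views per calendar month, research every day from the 1st at a steady 500 views a day, and an account that freezes as soon as the cap is reached; the function name and its parameters are illustrative only:

```python
import math


def locked_out_days(days_in_month: int, monthly_cap: int = 1000, views_per_day: int = 500) -> int:
    """Days remaining in the month once the cap has been used up."""
    # Number of research days it takes to hit the cap (2 at 500 views a day).
    days_until_cap = math.ceil(monthly_cap / views_per_day)
    # Everything after that is dead time; never report a negative number.
    return max(days_in_month - days_until_cap, 0)


for days in (28, 29, 30, 31):
    print(days, "day month:", locked_out_days(days), "days locked out")
```

Under those assumptions a 30-day month leaves 28 days without access, and a 31-day month leaves 29.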

I e-mailed the archive’s customer service team and informed them that the cap would make many forms of academic research extremely difficult. They informed me that the BNA was intended for ‘personal use’ only. It’s nice to know where we stand.

[edit: good news – the 1000-page cap seems to have been relaxed]

Conclusion

In sum, there’s a lot to like about The British Newspaper Archive. The open approach to OCR, the introduction of crowdsourcing, and, above all, the incredible range of new content make it a potentially fantastic new tool for researchers. I want to love it. Unfortunately, it currently suffers from at least four critical faults. The lack of hit-term highlighting, the inability to download a usable version of an article, the absence of institutional subscriptions, and the misjudged cap on the ‘unlimited’ package are all in need of urgent attention. Until these issues are fixed, its potential for academic research (not to mention its usefulness in the classroom) will remain frustratingly limited.