HIST426: Using Digital Texts as Historical Sources

Available 2017/18

Course Convenor: Professor Ian Gregory


Taught: Michaelmas Term

Computers are changing the ways in which we do many things in our lives; however, many of the approaches historians use remain rooted in the analogue age. Perhaps the only major change that computers have brought about among historians to date is the use of large digitised archives, for example Early English Books Online, Old Bailey Online or the British Newspaper Archive. Even here, many historians simply search and browse without any clear idea of what is going on behind the user interface, and therefore have little idea of how to critique the resources and techniques (primarily keyword searching) that they are using. Additionally, while searching and browsing are effective tools, they make only minimal use of the potential of digital resources.

This course sets out to address both of these issues. The first part of the course looks at how paper sources are digitised and encoded to create digital resources. As part of this we will introduce mark-up languages such as hypertext mark-up language (HTML), the language that underlies most web pages, and extensible mark-up language (XML). The purpose of this is to enable the student to understand how digital sources are created, and to understand both the benefits and limitations of the technologies that they use. The second part of the course looks at how, once we have a source in digital form, we can explore it in sophisticated ways. Large digital resources may contain many millions, if not billions, of words of text. Corpus linguistics enables us to identify and summarise themes of interest in these resources in ways that go far beyond simple keyword searching. It also helps the historian decide which parts of a large body of text require close reading and which do not.
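To give a flavour of what these two parts involve, the following is a minimal sketch in Python rather than material taken from the course itself. It assumes a tiny, invented XML-encoded source (the element names `document`, `title` and `p` and the sample sentences are hypothetical), reads the text out of the mark-up, counts word frequencies, and builds a simple keyword-in-context listing of the kind used in corpus linguistics. The function names `extract_words` and `concordance` are illustrative only.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical XML-encoded source; element names and text are invented.
SAMPLE_XML = """
<document>
  <title>Sample trial report</title>
  <p>The prisoner was charged with theft of a silver watch.</p>
  <p>The watch was found in the lodging of the prisoner.</p>
</document>
"""


def extract_words(xml_text):
    """Return lower-cased word tokens from every <p> element."""
    root = ET.fromstring(xml_text.strip())
    words = []
    for para in root.iter("p"):
        words.extend(re.findall(r"[a-z']+", (para.text or "").lower()))
    return words


def concordance(words, keyword, window=3):
    """List each hit for `keyword` with `window` words of context either side."""
    lines = []
    for i, word in enumerate(words):
        if word == keyword:
            left = " ".join(words[max(0, i - window):i])
            right = " ".join(words[i + 1:i + 1 + window])
            lines.append(f"{left} [{word}] {right}".strip())
    return lines


words = extract_words(SAMPLE_XML)
frequencies = Counter(words)           # how often each word occurs
kwic = concordance(words, "prisoner")  # every use of "prisoner" in context

for line in kwic:
    print(line)
```

Even on two sentences, the listing shows each occurrence of a word alongside its neighbours, which is the basic step that lets a historian judge which passages in a much larger body of text merit close reading.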

The course does not assume any prior knowledge of computing beyond the basic file handling, word processing and internet use that all history students will have. It will draw on examples covering a wide range of topics from early modern to modern British history. Students will also have the opportunity to use the techniques and approaches learnt with their own sources.


Hitchcock, T. (2013) “Confronting the digital: or how academic history writing lost the plot” Cultural and Social History, 10, pp. 9-23

Pumfrey, S., Rayson, P. and Mariani, J. (2012) “Experiments in 17th century English: manual versus automatic conceptual history” Literary and Linguistic Computing, 27, pp. 395-408


Assessment: Two pieces of coursework: one essay (30%) and one practical exercise (70%).