Lancaster University Graduate School
Faculty of Arts and Social Sciences

UCREL Corpus Research Seminar: eMargin: A collaborative text annotation tool

Date: 6 December 2012 Time: 2:00-3:00 pm

Venue: George Fox LT3

Andrew Kehoe and Matt Gee (Birmingham City University)

An increasing number of researchers are using corpus linguistic techniques in the study of literary texts, ranging from simple frequency counts to more complex statistical analyses. However, despite the growth in corpus stylistics, there remains some resistance to seemingly abstract, 'mathematical' models within the wider field of literary studies. In this field, the dominant approach is still 'close reading': the detailed manual examination, annotation and interpretation of textual extracts. Some drawbacks of this approach are that annotations cannot easily be shared or searched, and that the features identified cannot be analysed in a quantifiable manner. The main problem, however, is that, despite the recent growth in e-books and textual databases, there is a surprising lack of software to facilitate the kind of close textual analysis required in academic study.

This paper introduces eMargin, an online tool for the collaborative analysis and annotation of literary texts. The system is designed to offer a digital equivalent of the marginalia associated with the academic study of texts, allowing users to highlight and colour-code spans of text (from word to paragraph) and to associate either single-word 'tags' or longer textual notes with these highlights. The collaborative element lies in the fact that users can share their annotations with each other and participate in threaded discussions linked to these specific parts of the text.

Our initial intention in developing eMargin was to demonstrate how intuition and automated analyses can co-exist in the study of literary texts. We will explain how our tool is being used increasingly in other academic disciplines and with a wider variety of texts. We will describe how we are beginning to explore the integration of eMargin with the Virtual Learning Environments in use at HE institutions in order to provide a new way of annotating student work and providing feedback.

We will also demonstrate the potential of eMargin as a corpus annotation tool, given that it allows user-friendly mark-up through highlighting of textual spans (with associated colour or tag), discussion between annotators, and export of the resulting mark-up in XML format.


Who can attend: Anyone


Further information

Organising departments and research centres: Computing and Communications, Linguistics and English Language, University Centre for Computer Corpus Research on Language (UCREL)



Graduate School, Faculty of Arts and Social Sciences, Lancaster University, Lancaster LA1 4YD, UK
Tel: +44 (0) 1524 510880