Department of Linguistics and English Language, Lancaster University, LA1 4YT, United Kingdom
Tel: +44 (0) 1524 594577 Fax: +44 (0) 1524 843085

Analysis of spoken London English using corpus tools


Summary: The project uses anonymised transcripts of sociolinguistic interviews prepared as part of the ESRC project Linguistic Innovators: the English of adolescents in London (RES 000-23-0680, PI Paul Kerswill, CI Jenny Cheshire) to perform semi-automated corpus analyses of grammatical and discourse features.

Key Facts

Funder: British Academy

Type of Activity: Academic Research - Externally Funded

Principal Investigator: Eivind Torgersen

Co-investigator: Paul Kerswill

Research Associate: Costas Gabrielatos

Dept/Research Groups: Linguistics and English Language, Language Variation and Linguistic Theory (LVLT)

Keywords: Corpus linguistics, Corpus tools, Sociolinguistics, Grammar, English grammar, English language, Computerised corpora, Corpus linguistic methodology, Language variation and change

Project Description

The project will undertake a corpus analysis of one grammatical and one discourse feature.

The grammatical feature to be analysed is the distribution and form of the indefinite article 'a'/'an' in front of vowel sounds. The analysis will look for correlations between the choice of 'a' or 'an' and the semantic features of the noun phrase following the indefinite article. We will examine both the linguistic and the sociolinguistic contexts in which the indefinite article occurs, including possible effects of word frequency, spelling, the quality of the following vowel, and word stress, as well as sociolinguistic information.

The discourse feature to be analysed is the tag. The London interview data contain a very large number of discourse markers, which would be very time-consuming to analyse manually. We will examine the use of fillers and tag questions as well as lexis and formulaic phrases used as discourse markers (eh, okay, right, yeah, innit). Previous research based on the COLT corpus has suggested an increase in the use of tags, in particular 'innit' and 'right', as teenagers get older, but the results may be unreliable because the dataset was small. We have a large dataset from exactly that age group. The COLT data are from 1993 while ours are from 2005, so it is possible to observe change and also to identify new tags.

We will also examine effects of gender and ethnicity. In COLT there is a significant difference between males and females in the distribution of 'yeah' and 'innit', and a difference between ethnic groups in the distribution of 'right' and 'innit': the ethnic minority speakers use more tags, but the manner and contexts in which the tags are used have not been investigated. We will look at phonetic context, word frequency, word length and location in the utterance in addition to the sociolinguistic variables.
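As a rough illustration of the kind of semi-automated counting described above, the following Python sketch tallies 'a' versus 'an' before vowel-initial words and computes a normalised rate for utterance-final tags. It is a minimal sketch under stated assumptions, not the project's actual tooling: the function names, the sample utterances and the one-utterance-per-string input format are all hypothetical, the "vowel sound" test is approximated by spelling (so words such as 'hour' or 'university' would be misclassified), and counting only utterance-final word forms is a crude proxy for tag use that would still need manual checking.

```python
import re
from collections import Counter

# Orthographic stand-in for "begins with a vowel sound" (an assumption:
# a real analysis would need pronunciation data, e.g. for "hour").
VOWEL_LETTERS = set("aeiou")

# Tags named in the project description; illustrative, not exhaustive.
TAGS = {"eh", "okay", "right", "yeah", "innit"}


def tokenize(utterance):
    """Lowercase an utterance and split it into word tokens."""
    return re.findall(r"[a-z']+", utterance.lower())


def article_before_vowel(utterances):
    """Count 'a' versus 'an' directly before a vowel-initial word."""
    counts = Counter()
    for utterance in utterances:
        tokens = tokenize(utterance)
        for article, following in zip(tokens, tokens[1:]):
            if article in ("a", "an") and following[0] in VOWEL_LETTERS:
                counts[article] += 1
    return counts


def tag_rate_per_1000(utterances):
    """Return utterance-final tag counts normalised per 1,000 word tokens."""
    tag_counts = Counter()
    total_tokens = 0
    for utterance in utterances:
        tokens = tokenize(utterance)
        total_tokens += len(tokens)
        # Crude proxy: treat a listed word as a tag only in final position.
        if tokens and tokens[-1] in TAGS:
            tag_counts[tokens[-1]] += 1
    if total_tokens == 0:
        return {}
    return {tag: 1000 * n / total_tokens for tag, n in tag_counts.items()}


# Invented sample utterances, for illustration only.
sample = [
    "he had a apple innit",
    "I saw an elephant yeah",
    "it was a old car",
    "she got a dog right",
]

print(article_before_vowel(sample))
print(tag_rate_per_1000(sample))
```

Normalising to a rate per 1,000 words is one common way to compare corpora of different sizes, such as the 1993 and 2005 datasets mentioned above. Splitting the input by speaker gender or ethnicity before calling these functions would give per-group figures.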

Purpose of Research

Academic Research - Externally Funded

Project Funder

British Academy - £6877