UCREL Corpus Research Seminar: Corpus methods and a questionnaire for the diagnosis of pain symptoms
2:00 pm, Lancaster University, FASS Meeting Room 3
Presenter: Elena Semino (LAEL, Lancaster University)
This talk arises from a collaboration with a consultant at the Eastman Dental Hospital in London who specialises in the diagnosis of chronic facial pain. One of the tools used for the diagnosis of chronic pain generally, and facial pain in particular, is the McGill Pain Questionnaire (MPQ). The MPQ includes 78 one-word descriptors (e.g. 'sharp', 'hot', 'nauseating'), arranged into 20 semantically related groups. It is widely used to assess both the quality and intensity of patients' pain, with the ultimate goal of arriving at a diagnosis and effective treatment or management. While the use of the MPQ is well established, there are also problems with it, which are partly due to the choice of linguistic descriptors and the ways in which they are grouped. I will present the results of preliminary corpus-based analyses I have carried out in order to investigate various aspects of variation in the MPQ that might interfere with its effectiveness in clinical contexts. I will also share some of my experiences of communicating my findings to health professionals, and I will seek advice from the audience on the best ways of taking this work forward using corpus linguistic methods.
Thu 16 May 2013, 2pm - 3pm, Lancaster University, FASS Meeting Room 2/3
Presenter: Mike Thelwall (University of Wolverhampton)
This talk will describe a sentiment analysis program, SentiStrength, which estimates the strength of sentiments expressed in social web texts. SentiStrength has many language variants but works primarily in English and takes advantage of both traditional and computer-specific ways of expressing sentiment :) The program is available for testing and downloading online at http://sentistrength.wlv.ac.uk in both Windows and Java versions. The talk will demonstrate some applications of SentiStrength, including sentiment trends in Twitter relating to major events and finding good answers to questions via Yahoo!.
UCREL Corpus Research Seminar: "Boring, pompous and arrogant" or "funny, interesting, and HOT!" What university students really think of their professors!
2pm - 3pm, Lancaster University, FASS Meeting Room 3
The evaluation of the effectiveness of teaching in universities most commonly involves asking students for feedback - so-called Student Evaluations of Teaching (SETs). These are typically written questionnaires comprising predominantly Likert-scale type items. As such, the responses that students can provide are heavily constrained by the design of the instrument. Although SETs often contain opportunities for free-text comments, to date, there have been no large-scale analyses of the perceptions, opinions and beliefs that the students themselves choose to express. Systematic analyses of free-text responses have the potential to inform our understanding of what students value in teaching, what constitutes effective teaching and how we might best assess it.
This presentation reports a large-scale analysis of over half a million free-text comments on the popular review website RateMyProfessors.com (RMP). Principal components analysis (PCA) of adjectives was used to identify how students commonly perceive their instructors. Seven principal components were extracted and categorized based on component loadings:
- HELPFULNESS (e.g. helpful, willing, approachable)
- FUNNINESS (e.g. funny, hilarious, entertaining)
- INTELLIGENCE (e.g. brilliant, intelligent, knowledgeable)
- RUDENESS (e.g. rude, condescending, arrogant)
- INCOMPETENCE (e.g. disorganized, confusing, not_clear)
- TOUGHNESS (e.g. tough, difficult, not_easy)
- HOTNESS (e.g. hot, gorgeous, sexy)
Secondary analyses explored the relationship between these dimensions and (1) corresponding Likert scale ratings, and (2) the Five Factor model of personality traits. It is argued that the components reflect latent dimensions along which students commonly perceive their instructors. Discussion of the results raises practical implications for educational assessment and teaching, and methodological implications for the use of questionnaire data in the social sciences.
UCREL Corpus Research Seminar: The taint of militancy is not upon them: representations of suffragists, suffragettes and direct action in The Times, 1908-1914
2pm - 3pm, Lancaster University, FASS Meeting Room 3
Presenter: Kat Gupta
Abstract coming soon.
UCREL Corpus Research Seminar: Presenting the new General Service List: Rationale, method, implications
2pm - 3pm, Lancaster University, FASS Meeting Room 1
Presenters: Vaclav Brezina & Dana Gablasova
Learning vocabulary is a complex process in which the learner needs to acquire both the form and a variety of meanings/uses of a given lexical item (Nation, 2001). For the beginner the main question, of course, is where to start. General vocabulary wordlists can assist in this process by providing a list of common vocabulary items. Although a number of general vocabulary lists are available, by far the most influential and widely used, both in pedagogy and in vocabulary research, is West's GSL (Carter 2012). However, a number of problems with West's GSL have been pointed out over the years (cf. Gilner 2011).
In response to the problems identified with the GSL, this study offers a bottom-up, quantitative approach to the development of a New General Service List (new-GSL) by examining frequent general words across four language corpora (LOB, BNC, BE06 and EnTenTen12) with a total size of almost 13 billion running words. The four corpora were selected to represent a variety of corpus sizes (from one million to over 12 billion tokens) and approaches to representativeness and sampling (from small samples to whole documents). The study provided strong evidence of the stability of the core English vocabulary across a variety of language corpora, including different written and spoken contexts. We examined the overlap between the 3,000 most frequent vocabulary items in each corpus and identified substantial correspondence between the four corpora in terms of the number of shared items (71%) as well as the distribution of the words in the wordlists (as established by a series of Spearman's correlations). The final product, the new-GSL, consists of a total of 2,496 words. It is divided into the base part (2,118 items) and the current vocabulary part (378 items). The new-GSL covers between 80.1 and 81.7 per cent of text in the source corpora, which is comparable to the coverage of West's GSL. In its present form, the new-GSL can be used both for lexical research and for the development of teaching materials.
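The overlap, correlation and coverage figures reported here rest on simple wordlist arithmetic. As a minimal sketch (not the authors' actual procedure), the Python below shows one way such figures could be computed from frequency lists. The function names, the toy frequencies, and the decisions to measure overlap against the first list and to compute Spearman's rho over shared items only are all assumptions made for illustration.

```python
def top_n(freqs, n):
    """Return the n most frequent items, most frequent first (ties broken alphabetically)."""
    ranked = sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:n]]


def shared_proportion(wordlists):
    """Proportion of items in the first wordlist that occur in every wordlist."""
    shared = set(wordlists[0]).intersection(*(set(wl) for wl in wordlists[1:]))
    return len(shared) / len(wordlists[0])


def spearman(list_a, list_b):
    """Spearman's rho between two ranked lists, computed over shared items only.

    Items are re-ranked within the shared set, so there are no tied ranks.
    """
    in_b = set(list_b)
    common_a = [w for w in list_a if w in in_b]
    n = len(common_a)
    if n < 2:
        raise ValueError("need at least two shared items")
    rank_a = {w: i for i, w in enumerate(common_a, start=1)}
    common_b = [w for w in list_b if w in rank_a]
    rank_b = {w: i for i, w in enumerate(common_b, start=1)}
    d_squared = sum((rank_a[w] - rank_b[w]) ** 2 for w in common_a)
    return 1 - 6 * d_squared / (n * (n * n - 1))


def coverage(wordlist, tokens):
    """Share of running words (tokens) in a text that appear in the wordlist."""
    known = set(wordlist)
    return sum(1 for t in tokens if t in known) / len(tokens)


# Toy frequency lists (invented numbers, for illustration only).
corpus_a = {"the": 100, "of": 80, "and": 60, "cat": 5}
corpus_b = {"the": 90, "and": 70, "of": 65, "dog": 4}

top_a = top_n(corpus_a, 4)
top_b = top_n(corpus_b, 4)
print(shared_proportion([top_a, top_b]))
print(spearman(top_a, top_b))
print(coverage(["the", "of"], ["the", "cat", "of", "the"]))
```

With real data, the same functions would be applied to the 3,000 most frequent items of each corpus, and the correlation would be computed for each pair of corpora.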
Geoffrey Leech’s Principles of Pragmatics was published thirty years ago. To mark that event and the subsequent flowering of pragmatics at Lancaster we are holding a Pragmatics Summer School. Topic areas will include: the semantics-pragmatics interface, cross-cultural pragmatics, intercultural pragmatics, variational pragmatics, historical pragmatics, Spanish pragmatics, pragmatics and corpus linguistics, (im)politeness, (im)politeness and CMC.
- Geoffrey Leech
- Jenny Thomas
- Ken Turner
- Helen Spencer-Oatey
- Jonathan Culpeper
- Maria Elena Placencia
- Dawn Archer
- Derek Bousfield
- Claire Hardaker
Website: For more information, visit the website.
This Summer School is an intensive, hands-on introduction to the use of Geographical Information Systems (GIS) aimed at PhD students and other junior researchers in the digital humanities. GIS is the field of geography devoted to the visualisation, in the form of maps, of non-visual data sources. These data sources can range from statistical databases to corpora of literary texts.
Over four days, a series of intensive lab-based sessions will be used to introduce GIS, from the basic concepts, to the use of key software including ArcGIS, to a consideration of approaches for applying GIS in different kinds of humanities research. The aim is to give participants the skills needed to exploit GIS techniques in their own research – allowing the spatial dimension to emerge in the study of digital humanities.
This Summer School event is free to attend, but registration in advance is compulsory, as places are limited. For more details, click here.
The UCREL Summer School 2013 is the third event in a highly successful series that began in 2011. Sponsored by UCREL at Lancaster University - one of the world's leading and longest-established centres for corpus-based research - its aim is to support students of language and linguistics in the development of advanced skills in corpus methods.
The UCREL Summer School is intended primarily for postgraduate research students (and secondarily for Masters-level students and postdoctoral researchers) who require in-depth knowledge of corpus-based methodologies for their degree projects. It is not aimed at raw beginners, but rather at PhD students who have at least some introductory experience of analysis using language corpora, and who wish to expand their knowledge of key issues and techniques in cutting-edge corpus research.
The programme consists of a series of intensive two-hour sessions, some involving practical work, others more discussion-oriented. Topics include: Advanced corpus queries and the use of regular expressions, and Spelling variation: historical, child and online data; Using XML in corpus encoding and analysis and The statistics of collocation; Studying language change in diachronic corpora; Corpus-based approaches to metaphor in discourse; Pragmatics, politeness and impoliteness in the corpus; Using comparable and parallel corpora in contrastive and translation studies; Understanding statistics for corpus analysis.
The UCREL Summer School is one of three 'Lancaster Summer Schools in Interdisciplinary Digital Methods'; see the website for further information.
How to apply
The UCREL Summer School is free to attend, but registration in advance is compulsory, as places are limited. For more details, see the website.
The first ESRC Summer School in Corpus Approaches to Social Sciences takes place under the aegis of CASS.
Who can attend?
A crucial part of the CASS remit is to provide researchers across the social sciences with the skills needed to apply the tools and techniques of corpus linguistics to the research questions that matter in their own discipline. This event is aimed at junior social scientists – especially PhD students and postdoctoral researchers – in any of the social science disciplines. Anyone with an interest in the analysis of social issues via text and discourse – especially on a large scale – will find this summer school of interest.
The programme consists of a series of intensive two-hour sessions, some involving practical work, others more discussion-oriented. Topics include: Introduction to corpus linguistics; Corpus tools and techniques; Collecting corpus data; Foundational techniques for social science data - keywords and collocation; Understanding statistics for corpus analysis; Discourse analysis for the social sciences; Semantic annotation and key domains; Corpus-based approaches to metaphor in discourse; Pragmatics, politeness and impoliteness in the corpus. Speakers include Paul Baker, Jonathan Culpeper, and Elena Semino.
The CASS Summer School is one of three ‘Lancaster Summer Schools in Interdisciplinary Digital Methods’; see the website for further information. There are additional daily plenary lectures shared with the other two Summer School events, each illustrating cutting-edge digital research methods using corpus data. The confirmed plenary speakers are Tony McEnery, Ian Gregory, and Stephen Pumfrey.
How to apply
The CASS Summer School is free to attend, but registration in advance is compulsory, as places are limited. For more details, see the website.
The main CL2013 conference will be preceded by a workshop day on Monday 22nd July. On this day, the following pre-conference workshops will be offered:
- Evaluative Language and Corpus Linguistics
- Corpus-Based Approaches to Figurative Language
- Workshop on Arabic Corpus Linguistics
- Web as Corpus Workshop
- Corpus Analysis with Noise in the Signal
- Annotating Correspondence Corpora
- Compiling and analysing a spoken academic corpus
- A Fully-annotated Pragmatic Corpus -- the SPICE-Ireland Corpus
More information -- including detailed workshop descriptions and a provisional schedule -- will be published on the CL2013 Workshop webpage as it becomes available.
The seventh international Corpus Linguistics conference (CL2013) will be held at Lancaster University from Tuesday 23rd July 2013 to Friday 26th July 2013. The conference is hosted by the UCREL research centre, which brings together the Department of Linguistics and English Language with the School of Computing and Communications at Lancaster. The goals of the conference are:
- To gather together current and developing research in the study and application of corpus linguistics;
- To push the field forwards by promoting dialogue among the many different users of corpora across interconnected sub-disciplines of linguistics – be they descriptive, theoretical, applied or computational;
- To explore new challenges both within corpus linguistics, and in the extension of corpus approaches to new fields of study.
- Karin Aijmer
- Guy Cook
- Michael Hoey
- Ute Römer
Website: For more information, visit the CL2013 website.
Wed 28 Aug 2013, 9:00am - 3:30pm, iCourts Open Meeting Area, Studiegaarden, Studiestraede 6, 1st floor, DK-1455 Copenhagen K
iCourts invites you to the seminar Law as Texts in Context: International Case Law from a Discourse Perspective, which focuses on the application of corpus linguistics and discourse analysis in legal research.
One of the main objectives of iCourts’ research agenda is to explain how international law establishes itself as an independent legal order through the case law of international courts. We want to study that development, not through a conventional legal analysis of the court decisions, but through a variety of descriptive approaches that entail a detailed study of the discursive processes that create, influence and change international law. These include discourse analysis and corpus linguistics.
The speakers at this research seminar are all renowned researchers in their respective fields, and will present their view on how descriptive studies of law as text can elucidate not only the texts of law, but also the law as text. In doing so, they invite the participants to reflect on the possible gains, losses, new understandings and misunderstandings that may result from applying insights of language studies and linguistic methods in legal research.
See the programme here: http://jura.ku.dk/icourts/calendar/law-texts-in-context/programme/
Wed 04 Sep 2013 - Fri 06 Sep 2013, Lancaster University, UK
Lancaster Centre for Mobilities Research: Call for Participation
As part of celebrations of the tenth anniversary of the Centre for Mobilities Research (CeMoRe) at Lancaster University, we are pleased to announce the Global Conference on Mobility Futures, September 4-6th, 2013, at Lancaster University, UK and invite contributions.
Over the past ten years the work of CeMoRe and others has helped to 'mobilise' the social and human sciences and developed innovative analyses of economic, social, technological, political, policy and design transformations.
The 'Global Conference on Mobility Futures' will reflect this work and provide a forum for the presentation of cutting-edge research from across the social sciences that reflects on past mobilities, explores present ones and looks towards future ones.
See the website for more details. We look forward to seeing you in Lancaster!
Thu 12 Sep 2013, 5:30pm - 7:00pm, Bradley Forum, UniSA City West campus, Hawke Building level 5, 55 North Terrace, Adelaide
Jointly presented by The Bob Hawke Prime Ministerial Centre and Hawke Research Institute at UniSA
What are the costs - at once personal, social and environmental - of our civilization's carbon addiction? Does the Age of Tough Oil necessarily mean the 'powering down' of societies? What does the future hold for people, energy and climates in a post-carbon world? In this wide-ranging discussion with one of Europe's most celebrated social thinkers, John Urry discusses the scale, speed and impact of future energy changes over the next century. From oil dregs to carbon rationing, Urry envisions the future of an oil-dependent world facing energy descent.
In Conversation with.....
John Urry is widely acknowledged as one of Europe's most important social theorists. He is Distinguished Professor of Sociology at Lancaster University. Educated at Cambridge University, he is the editor of the International Library of Sociology; Co-editor of Mobilities and Director of the Lancaster Centre for Mobilities Research. His recent books include Automobilities (2005), Mobilities, Networks, Geographies (2006), Mobilities (2007), Aeromobilities (2009), After the Car (2009), Mobile Lives (with Anthony Elliott, 2010), Mobile Methods (2011), The Tourist Gaze 3.0 (2011) and Climate Change and Society (2011).
Anthony Elliott is Director of the Hawke Research Institute, where he is Research Professor of Sociology at the University of South Australia. He is a Fellow of the Academy of the Social Sciences in Australia and is the author and editor of some thirty books, translated into over a dozen languages. His most recent books include Making The Cut: How Cosmetic Surgery is Transforming our Lives (Chicago University Press, 2008), The New Individualism (with Charles Lemert, Routledge, 2009), and Reinvention (Routledge, 2013).
Visit the official website for more information.
Mon 16 Sep 2013, 5pm - 6pm, University of Wollongong
Distinguished Professor John Urry invites you to an exclusive lecture, where he will present his forthcoming book Offshoring: Secrets, Lies & Globalization.
John Urry is the co-founder and Director of the Centre for Mobilities Research at Lancaster University and is the founding co-editor of the journal Mobilities and author of numerous books on globalization, capitalism and tourism.
Social interaction and (im)politeness in digital communication: Exploring the potential of corpus-related approaches
9:30am - 5:30pm, Lancaster University
A one-day workshop hosted by the ESRC Centre for Corpus Approaches to Social Science
In recent times, there has been lively public interest in online “aggressive” phenomena such as flaming, trolling, and cyberbullying. However, in the field of computer-mediated communication (CMC), whilst some studies have been conducted on the social dynamics of digital media, relatively few have focused specifically on politeness or impoliteness, despite their centrality to social interaction, obvious relevance to digital communication, keen public interest and recent explosion in related academic activity (witness the establishment in 2005 of the Journal of Politeness Research). Corpus-related approaches – typically involving the computational analysis of vast collections of text – to studying the nature of digital communication are still in their infancy. The aim of this workshop is to explore the potential that the array of corpus-related approaches might have for enhancing our understandings of social interaction in digital communication in general, and (im)politeness in particular.
The workshop will encompass a range of digital communication types, such as email, blogs, texts and tweets. It is not restricted to any particular definition of politeness or impoliteness. Broadly, we understand politeness to be the social spadework that we undertake to oil the wheels of interaction and impoliteness to be the opposite. As far as politeness is concerned, issues in the domain of digital communication might include: What are the politeness practices of particular media (e.g. what are considered polite ways of opening or closing, of achieving particular goals, or of self-disclosing)? How do they vary across types? As far as impoliteness is concerned, issues in the domain of digital communication might include: What are the characteristics of impoliteness-related practices such as flaming, trolling and cyberbullying? How are they to be defined? Why are the new digital media often perceived by the public as sources of impoliteness, and how true is this perception? As far as corpus-related approaches are concerned, issues we might explore include: How is data sourced? What are the ethical issues? How is data transcribed for computational analysis, especially when multiple writers and addressees are involved? How is the social context factored in to the corpus? How can corpus-related approaches handle pragmatic phenomena in general and (im)politeness in particular? What is to be gained from using notions like collocates, keywords and word sketches?
The workshop will involve a mixture of activities. At its core there will be three presentations by the following speakers:
- Andrew Kehoe and Ursula Lutzky (Birmingham City University, UK)
- Ruth Page (University of Leicester, UK)
- Caroline Tagg (University of Birmingham, UK)
In addition, we will have discussion panels, and a data exhibition. Regarding the data exhibition, participants may, if they wish, submit samples of data along with very brief descriptions of their research interests to our team. We will apply corpus methods to that data. In the data exhibition, we will present snapshot analyses and discuss methodological issues. (Please see the simple data-submission guidelines if you wish to take advantage of this.)
Sat 28 Sep 2013 - Mon 30 Sep 2013, Queen Mary, University of London
This is the fifth in a biennial series of international conferences organized around the role of the media in relation to the representation, construction and production of language.
The primary theme of the 2013 conference will be journalism – in its many historical and contemporary manifestations – and its redefinition in the face of social media and established practice, taking its cue from the confluence of historic Fleet Street and new media and digital innovation in London.
Also on the agenda: A plenary panel of active journalists and linguists talking about what has changed in journalism and what remains the same, with an eye to inspiring future research/ers and providing new directions for investigation of language in the media.
Allan Bell (Auckland University of Technology, NZ)
Martin Conboy (University of Sheffield, UK)
Helen Kelly-Holmes (University of Limerick, Ireland)
Daniel Perrin (Zurich University of Applied Sciences, Switzerland)
Wed 23 Oct 2013, Norwegian University of Science and Technology
Costas Gabrielatos will give two courses, 'Keyword analysis', and 'Beyond word frequency'.
Keyword analysis
In this session Gabrielatos will explore definitions of the terms keyword and keyness, and discuss appropriate metrics, focusing on the distinction between effect size and statistical significance. He will also show how to derive true keywords (i.e. keywords based on effect size) while also catering for statistical significance: all but one of the current corpus tools (the exception being Sketch Engine) use an inappropriate metric, log-likelihood, which only specifies statistical significance.
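To make the distinction between statistical significance and effect size concrete, here is a minimal Python sketch under stated assumptions. It uses the widely used two-corpus log-likelihood formula as the significance statistic and the percentage difference of normalised frequencies (%DIFF) as one possible effect-size measure; the abstract does not say which effect-size metric the session recommends, so that choice, the function names and the frequencies are illustrative only.

```python
import math


def log_likelihood(freq_study, freq_ref, size_study, size_ref):
    """Log-likelihood (G2) for one word in a study corpus vs a reference corpus."""
    total = freq_study + freq_ref
    expected_study = size_study * total / (size_study + size_ref)
    expected_ref = size_ref * total / (size_study + size_ref)
    ll = 0.0
    for observed, expected in ((freq_study, expected_study), (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll


def percent_diff(freq_study, freq_ref, size_study, size_ref):
    """%DIFF: percentage difference between the two normalised frequencies."""
    nf_study = freq_study / size_study
    nf_ref = freq_ref / size_ref
    if nf_ref == 0:
        return math.inf  # assumption: treat absence from the reference corpus as infinite
    return (nf_study - nf_ref) * 100 / nf_ref


# Same relative difference (twice as frequent), very different corpus sizes.
small = (20, 10, 1_000, 1_000)
large = (2_000, 1_000, 100_000, 100_000)

for label, counts in (("small", small), ("large", large)):
    print(label, round(log_likelihood(*counts), 2), round(percent_diff(*counts), 1))
```

In both cases the word is twice as frequent in the study corpus, so %DIFF is the same, but the log-likelihood value grows with corpus size: the small-corpus value falls below 3.84 (the chi-square critical value for p < 0.05 with one degree of freedom) while the large-corpus value is far above it. The size of a difference and the confidence that it is not due to chance are separate questions.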
Beyond word frequency
Overall, this session will focus on a more comprehensive view of 'frequency': it will discuss how the normalized word frequency in a corpus may not always be the best way to count instances of a linguistic feature, and why it is best to view the normalized frequency of a linguistic unit as the number of instances of a feature out of the total number of opportunities for it to appear (Ball, 1994). The session will also focus on how the total number of instances (however measured) may be misleading on its own, and may need to be supplemented with metrics of dispersion/spread. Regarding word frequency, the session will show how token and type frequencies can be examined in combination – not collapsed into a single type-token ratio metric, but visualised two-dimensionally in a scatterplot.
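The three ideas in this session description can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names and numbers are invented, and the dispersion measure shown (deviation of proportions, DP) is one common choice rather than necessarily the one covered in the session.

```python
def per_million(hits, corpus_size):
    """Conventional normalised frequency: hits per million running words."""
    return hits / corpus_size * 1_000_000


def per_opportunity(hits, opportunities):
    """Hits as a proportion of the places where the feature could have occurred."""
    return hits / opportunities


def deviation_of_proportions(part_hits, part_sizes):
    """DP dispersion: 0 means hits are spread in proportion to part sizes;
    values near 1 mean they are concentrated in a small part of the corpus."""
    total_hits = sum(part_hits)
    total_size = sum(part_sizes)
    return 0.5 * sum(
        abs(h / total_hits - s / total_size) for h, s in zip(part_hits, part_sizes)
    )


def type_token_point(tokens):
    """(tokens, types) pair for one text, ready to plot as a scatterplot point."""
    return len(tokens), len(set(tokens))


# 50 hits in a million words, but only 200 places where the feature could occur.
print(per_million(50, 1_000_000), per_opportunity(50, 200))

# Same total hits and part sizes, very different spread.
print(deviation_of_proportions([10, 10, 10, 10], [100, 100, 100, 100]))
print(deviation_of_proportions([40, 0, 0, 0], [100, 100, 100, 100]))

print(type_token_point("a b a c".split()))
```

Keeping the token count and the type count as a pair, instead of dividing one by the other, preserves the information that a two-dimensional scatterplot needs.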
Fall School in Corpus Linguistics: http://www.ntnu.edu/lingphil/corpus-linguistics
Fall School full schedule: http://www.ntnu.edu/lingphil/schedule
Thu 24 Oct 2013, 2pm - 3pm, Lancaster University, FASS Meeting Room 1
More information on the UCREL CRS page: http://ucrel.lancs.ac.uk/crs/presentation.php?id=44
Thu 31 Oct 2013, Lancaster University, FASS Meeting Room 3
More information on the UCREL CRS website: http://ucrel.lancs.ac.uk/crs/presentation.php?id=45
Wed 06 Nov 2013, 4:00pm - 5:00pm, D18, Fylde, Lancaster University
Speaker: Max Louwerse, Professor of Cognitive Psychology and Artificial Intelligence, Tilburg Center for Cognition and Communication, Tilburg University
A vast amount of literature has demonstrated that cognitive processes can be explained by a perceptual simulation account. Oftentimes such studies interpret an effect for perceptual simulation as the only explanation. This talk will discuss whether there are alternatives. One such alternative is provided by the Symbol Interdependency Hypothesis, which proposes that language encodes perceptual information. Language has evolved into a system of regularities that allows for a symbolic shortcut to perceptual relations in the world around us. Language users can rely on symbolic and perceptual relations depending on the nature of the stimulus, the cognitive task, and the individual, as well as the time course of processing. The Symbol Interdependency Hypothesis thereby emphasizes the role of language statistics in cognitive processing. Various studies, on topics ranging from conceptual metaphor, iconic relationships and geographical orientation to the Spatial Numerical Association of Response Codes (SNARC), will be discussed, showing that what might seem to be best explained by a perceptual simulation account is in fact best explained by statistical linguistic frequencies.
Sat 09 Nov 2013, University of Portsmouth
Corpus-assisted discourse analysis across languages
09.30 Arrival & coffee
10.00 Overview of the area & aims of the workshop, Rachelle Vessey, Newcastle University & Charlotte Taylor, University of Sussex
10.15 Translation studies, critical discourse analysis and corpus linguistics: An overview, Bandar Al-Hejin, Institute of Public Administration (English Language Center), Riyadh
11.00 Corpus-based intercultural pragmatics, Rachele de Felice, University College London
11.45 A contrastive discourse analysis of patient-centred communication in British, Italian and Spanish Ask-the-Expert Healthcare Websites, Gabrina Pounds, University of East Anglia
Corpus-Based Investigations into the Attitudes to Bilingualism and Code-Switching: Insights from German and Polish migrant discussion forums, Sylvia Jaworska, University of Reading
The linguistic framing of inequality in times of austerity: A corpus linguistics study of ‘fairness’ in elite political discourse in Spain and Britain, Rosa Escanes Sierra, University of Sheffield
14.25 Comparative analyses of discourse key words across languages: integration and multicultural society in English and German, Melani Schroeter, University of Reading
15.10 A corpus-linguistic approach to the analysis of transnational political discourses. Europe in British, French and German election manifestos, Ronny Scholz, University of Warwick
(5 min break)
16.00 Public apology felicity conditions in British and French apology press uptakes, Clyde Ancarno, King’s College
16.45 Problems, solutions & where next? Open discussion led by Charlotte Taylor, University of Sussex & Rachelle Vessey, Newcastle University
To attend, please contact firstname.lastname@example.org or email@example.com
Wed 13 Nov 2013, 2:00pm - 3:00pm, Lancaster University, FASS Meeting Room 3
Title: A corpus linguistic approach to newsworthiness - Towards a new methodological framework for analyzing news discourse in Critical Discourse Analysis and beyond
In this talk I will introduce a new framework for the analysis of news discourse, which emphasises the importance of news values for linguistic analysis and encourages a constructivist approach to their analysis. This framework is situated within what Bednarek & Caple (2012a, b) call a 'discursive' approach to news values. From this perspective, the primary research interest is in how texts construct newsworthiness. The framework itself is intended for both multimodal discourse analysis and corpus linguistic analysis, although in this paper I will focus on the integration of corpus linguistic methods. The guiding question is: how can corpus techniques help us to identify the linguistic construction of newsworthiness in a given text or corpus?