SECTION C

SECTION C EXPLORATION

Having introduced the key concepts in corpus linguistics and presented excerpts from published material, we now want to engage readers in a series of case studies. These case studies investigate research questions in some of the areas of linguistic analysis introduced in Section A and further discussed in Section B. Each case study starts with an overview of the background knowledge needed for the study and a brief description of the corpus data used. Then it explores, together with the reader, a particular research question using specific tools (a corpus exploration tool and/or a statistics package). This is where the reader learns how to 'do' corpus linguistics, as the process of investigating the data using the package(s) concerned will be spelt out step by step, using text and screenshots. Thus by the end of each case study, a corpus has been introduced, the reader has learnt how to use a retrieval package and some research questions have been explored. Readers are then encouraged to explore a related research question using the same corpus data, tools and techniques. Readers can visit the authors' companion website given in the Appendix for details of the availability of corpora and tools used in these case studies.

This section consists of six case studies. Case study 1 explores the area of pedagogical lexicography on the basis of the BNC (World Edition), using BNCWeb. The study focuses on collocation analysis: it seeks to describe the collocation patterns of sweet in the BNC and to integrate that information into a dictionary entry. Case study 2 uses four corpora of the Brown family to explore the factors that may influence a language user’s choice of a full or bare infinitive after HELP. These include language variety (British English vs. American English), language change (English in the early 1960s and the early 1990s) and a range of syntactic conditions (e.g. an intervening nominal phrase, a preceding infinitive marker and the passive). This case study also introduces MonoConc Pro and SPSS. Case study 3 uses WordSmith version 4 and the Japanese component of the Longman Learners' Corpus to study the second language acquisition of English grammatical morphemes. Case study 4 uses the metadata encoded in the BNC (version 2) to explore dimensions of variation and uncover a general pattern of swearing (more specifically, the use of the word FUCK) in modern British English. The metadata concerned pertain to demographic features such as user age, gender and social class, and to textual features such as register, publication medium and domain. This case study demonstrates how to use BNCWeb to make complex queries and gives readers an opportunity to practise using SPSS. Case study 5 compares two approaches to genre analysis - Biber's (1988) multi-feature/multi-dimensional analysis and Tribble's (1999) use of the keyword function of WordSmith - through a comparison of speech and conversation in American English. This study introduces some advanced functions of WordSmith version 3. The final case study uses parallel and comparable corpora of English and Chinese to examine the effect of domain, text type and translation upon aspect marking in Chinese. This study also introduces parallel concordancing.

We would remind readers that, for each case study, alternative versions covering most concordance packages are available on our companion website. Note also that readers whose results do not match those given here should check the website for an update.

Most of the case studies in this Section are based upon articles published elsewhere by the authors, as indicated in individual units. Readers interested in particular research questions can refer to our full papers for further discussion.