Lancaster University Graduate School
Faculty of Arts and Social Sciences

UCREL Corpus Research Seminar: Manual annotation in a functional-typological grammar study

Date: 31 January 2013 Time: 2:00-3:00 pm

Venue: FASS Meeting Room 3

Manual annotation in a functional-typological grammar study (A study on the Javanese dialect of Kudus, Indonesia)

Noor Malihah (LAEL, Lancaster University)

The corpus was constructed for specific research purposes. I developed a system of manual annotation for it, because the corpus is relatively small and I needed annotation rules specific to my study. It is intended not as a perfect corpus for future generations, but as a workable corpus for my own use.

The data were collected from 49 native speakers of the Kudus dialect of Javanese in Central Java, Indonesia, and from six articles in a local newspaper.

The corpus was manually annotated for various grammatical features, each represented by a code. The annotation is used to examine correlations between codes in order to answer my research questions. The general aim is to investigate the distinctive features of the passive, the applicative and the causative in the Kudus dialect compared with standard Javanese. The manual annotation therefore encodes rich linguistic information, ranging from morphology through syntax to semantics.

The result of this manual annotation is an annotated dataset containing the information needed to answer my research questions. Quantitative results are also obtained by counting the co-occurrences of particular features in the dataset.

To use the annotated corpus, each code is combined with one or more other codes to answer questions about the use of a particular grammatical construction.
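
As an illustration only (the abstract does not describe the tooling used), counting how often two codes occur together could be sketched in Python as follows. The code labels (`PASS`, `APPL`, `CAUS` and so on), the sample clauses and the function name `cooccurrence_counts` are all hypothetical and are not taken from the actual corpus.

```python
from collections import Counter
from itertools import combinations

# Hypothetical annotated clauses: each clause is the set of codes assigned to it.
# These labels are invented for illustration, not the study's real tag set.
annotated_clauses = [
    {"PASS", "AGENT_OMITTED"},
    {"PASS", "AGENT_EXPRESSED"},
    {"APPL", "BENEFACTIVE"},
    {"PASS", "AGENT_OMITTED"},
    {"CAUS", "AGENT_EXPRESSED"},
]


def cooccurrence_counts(clauses):
    """Count how many clauses contain each unordered pair of codes."""
    counts = Counter()
    for codes in clauses:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(codes), 2):
            counts[pair] += 1
    return counts


pair_counts = cooccurrence_counts(annotated_clauses)

# Number of clauses annotated as passive with an omitted agent.
print(pair_counts[("AGENT_OMITTED", "PASS")])
```

Looking up one pair in `pair_counts` corresponds to combining one code with another, as described above. Restricting the clauses to those from one source before counting would allow a comparison across sources, assuming each clause records where it came from.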

During manual annotation I encountered several challenges, for example human error, accuracy, consistency, and the time the work takes. Although these challenges make manual annotation hard, the approach still suits this project because I can use the results for my research purposes.

Who can attend: Anyone


Further information

Organising departments and research centres: Computing and Communications, Linguistics and English Language, University Centre for Computer Corpus Research on Language (UCREL)


Graduate School, Faculty of Arts and Social Sciences, Lancaster University, Lancaster LA1 4YD, UK
Tel: +44 (0) 1524 510880