There is no universal agreement about which semantic features ought to be annotated; in fact, much past annotation was motivated by social scientific theories of, for instance, social interaction. However, Sedelow and Sedelow (1969) made use of Roget's Thesaurus, in which words are organised into general semantic categories.
The example below (Wilson, forthcoming) is intended to give the reader an idea of the types of categories used in semantic tagging:
And 00000000
the 00000000
soldiers 23241000
platted 21072000
a 00000000
crown 21110400
of 00000000
thorns 13010000
and 00000000
put 21072000
it 00000000
on 00000000
his 00000000
head 21030000
and 00000000
they 00000000
put 21072000
on 00000000
him 00000000
a 00000000
purple 31241100
robe 21110321

The numeric codes stand for:
00000000 Low content word (and, the, a, of, on, his, they etc)
13010000 Plant life in general
21030000 Body and body parts
21072000 Object-oriented physical activity (e.g. put)
21110321 Men's clothing: outer clothing
21110400 Headgear
23231000 War and conflict: general
31241100 Colour

The semantic categories are represented by 8-digit numbers. The scheme above is based on that used by Schmidt (1993) and has a hierarchical structure: it is made up of three top-level categories, which are themselves subdivided, and so on.
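To make the word/code pairing concrete, the short Python sketch below reads a fragment of the tagged example and looks each code up in the list of glosses given above. It is an illustration only, not the software used by Wilson or Schmidt. The names `GLOSSES`, `parse_tagged`, `top_level` and `gloss` are hypothetical. Reading the first digit as the top-level category is also an assumption, made because the listed codes begin with 1, 2 or 3; the text does not say how the remaining digits are grouped, so the sketch does not try to split them.

```python
# Glosses copied from the list of numeric codes in the text.
GLOSSES = {
    "00000000": "Low content word",
    "13010000": "Plant life in general",
    "21030000": "Body and body parts",
    "21072000": "Object-oriented physical activity",
    "21110321": "Men's clothing: outer clothing",
    "21110400": "Headgear",
    "23231000": "War and conflict: general",
    "31241100": "Colour",
}


def parse_tagged(text):
    """Split 'word code word code ...' into a list of (word, code) pairs."""
    tokens = text.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected alternating word/code tokens")
    pairs = list(zip(tokens[0::2], tokens[1::2]))
    for _word, code in pairs:
        if not (len(code) == 8 and code.isdigit()):
            raise ValueError(f"not an 8-digit code: {code!r}")
    return pairs


def top_level(code):
    """ASSUMPTION: the first digit names the top-level category (0 = none)."""
    return int(code[0])


def gloss(code):
    """Return the gloss for a code, or a placeholder if it is not listed."""
    return GLOSSES.get(code, "(not in the list above)")


sample = "a 00000000 crown 21110400 of 00000000 thorns 13010000"
for word, code in parse_tagged(sample):
    print(word, code, top_level(code), gloss(code))
```

Codes that do not appear in the list are reported with a placeholder rather than guessed at, since only the categories shown in the text are known here.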