Statistics Forum: Emanuele Giorgi

Emanuele Giorgi, CHICAS, Lancaster University

Thursday 05 December 2013, 1230-1300
A54, Postgraduate Statistics Centre Lecture Theatre

Combining data from multiple spatially referenced prevalence surveys using generalized linear geostatistical models

Geostatistical methods are becoming more widely used in epidemiology to analyze spatial variation in disease prevalence. These methods are especially useful in resource-poor settings where disease registries are either non-existent or geographically incomplete, and data on prevalence must be obtained by survey sampling of the population of interest. To obtain good geographical coverage of the population, it is often also necessary to combine information from multiple prevalence surveys, both to estimate model parameters and to map prevalence.

However, simply fitting a single model to the combined data from multiple surveys is inadvisable without testing the implicit assumption that both the underlying process and its realization are common to all of the surveys. We have developed a multivariate generalized linear geostatistical model to combine data from multiple spatially referenced prevalence surveys so as to address two common sources of variation across surveys: variation in prevalence over time, and variation in data quality.

In the case of surveys that differ in quality, we assume that at least one of the surveys delivers unbiased gold-standard estimates of prevalence, whilst the others are potentially biased. For example, some surveys might use a random sampling design, whilst others rely on opportunistic convenience samples. For parameter estimation and spatial prediction, we use Monte Carlo maximum likelihood methods.
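
As a rough illustration of the gold-standard versus biased-survey idea, the sketch below simulates binomial prevalence data from two surveys that share one latent spatial process, with an extra spatially structured bias term added only for the convenience survey. Everything in it is an assumption made for illustration and is not taken from the talk: the function names `exp_cov` and `simulate_surveys`, the exponential covariance, the logit link, the additive form of the bias, and all parameter values.

```python
import numpy as np


def exp_cov(coords, sigma2, phi):
    """Exponential covariance matrix: sigma2 * exp(-distance / phi)."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return sigma2 * np.exp(-dist / phi)


def simulate_surveys(n_gold=100, n_conv=100, m=20, seed=0):
    """Simulate counts of positives from a gold-standard and a biased survey.

    Hypothetical setup: locations are uniform on a 20 x 20 square, and each
    location contributes m tested individuals.
    """
    rng = np.random.default_rng(seed)
    n = n_gold + n_conv
    coords = rng.uniform(0.0, 20.0, size=(n, 2))
    jitter = 1e-8 * np.eye(n)  # small diagonal term for numerical stability

    # Shared latent Gaussian process S(x), common to both surveys.
    chol_s = np.linalg.cholesky(exp_cov(coords, 1.0, 3.0) + jitter)
    s = chol_s @ rng.standard_normal(n)

    # Bias process B(x), applied only to the convenience survey (assumed form).
    chol_b = np.linalg.cholesky(exp_cov(coords, 0.5, 2.0) + jitter)
    b = chol_b @ rng.standard_normal(n)

    biased = np.arange(n) >= n_gold  # False = gold standard, True = convenience
    beta0, bias0 = -1.0, 0.5         # illustrative intercept and mean bias
    eta = beta0 + s + biased * (bias0 + b)

    prevalence = 1.0 / (1.0 + np.exp(-eta))  # inverse logit
    positives = rng.binomial(m, prevalence)
    return coords, biased, prevalence, positives


coords, biased, prevalence, positives = simulate_surveys()
print("mean observed prevalence, gold standard:", positives[~biased].mean() / 20)
print("mean observed prevalence, convenience:  ", positives[biased].mean() / 20)
```

In a setup like this, fitting one model to the pooled counts would absorb the bias term into the prevalence surface, which is why the bias is given its own component in the linear predictor.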

We describe an application to malaria prevalence data from Chikhwawa District, Malawi. The data consist of two Malaria Indicator Surveys (MISs) and an Easy Access Group (EAG) study, conducted over the period 2010-2012. In the two MISs, the data were collected by random selection of households in an area of 50 villages within 400 square kilometers, whilst the EAG study enrolled a random selection of children attending the vaccination clinic in Chikhwawa District Hospital. The EAG sampling strategy is more economical, but the sampling bias inherent to such "convenience" samples needs to be taken into account.