Bayes4Health and CoSInES Methodology Week



Postdocs from Bayes4Health and CoSInES came together at the University of Warwick for a week-long investigation of some open statistical challenges. The postdocs were split into three groups, with each group tackling a different challenge: (i) efficient MCMC for Bayesian inverse problems; (ii) robust Bayesian methods for epidemics; and (iii) merging information from multiple data sources.

The last group considered a framework for Bayesian inference where you analyse different shards of data separately, and then combine the posterior samples from each shard to get an approximate posterior for the full data. These methods are known to struggle if there is considerable heterogeneity across different shards of data.
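To make the combination step concrete, here is a minimal sketch of one simple way to combine shard posteriors. It is an illustration under stated assumptions, not necessarily the method the group used: it assumes a scalar parameter, a flat prior, and that each shard posterior is well approximated by a Gaussian, so the combined posterior is the product of the fitted Gaussians. The function name `combine_gaussian_subposteriors` and the toy model (normal data with known standard deviation) are hypothetical.

```python
import numpy as np

def combine_gaussian_subposteriors(shard_samples):
    """Combine per-shard posterior draws of a scalar parameter.

    Assumption: each shard posterior is approximately Gaussian and the
    prior is flat, so the full posterior is proportional to the product
    of the shard posteriors (precisions add, means are precision-weighted).
    """
    means = np.array([np.mean(s) for s in shard_samples])
    variances = np.array([np.var(s, ddof=1) for s in shard_samples])
    precisions = 1.0 / variances
    combined_var = 1.0 / precisions.sum()
    combined_mean = combined_var * (precisions * means).sum()
    return combined_mean, combined_var

# Toy model: normal data with known standard deviation and a flat prior
# on the mean, so each shard posterior is N(shard mean, sigma^2 / n_s).
rng = np.random.default_rng(0)
sigma = 1.0
data = rng.normal(loc=2.0, scale=sigma, size=600)
shards = np.array_split(data, 3)
shard_samples = [
    rng.normal(s.mean(), sigma / np.sqrt(len(s)), size=20000) for s in shards
]

combined_mean, combined_var = combine_gaussian_subposteriors(shard_samples)
full_mean, full_var = data.mean(), sigma**2 / len(data)
print(combined_mean, full_mean)
print(combined_var, full_var)
```

In this toy setting the shards are homogeneous and the Gaussian assumption holds exactly, so the combined answer closely matches the full-data posterior. When the shard posteriors differ markedly, or are far from Gaussian, this kind of combination can be much less accurate.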

For applications where you have a choice of which data goes into which subsets, the group looked at strategies for making this choice so as to reduce the heterogeneity across subsets. The basic idea is to first divide the data into small subsets, then measure the discrepancy between the posteriors for each pair of small subsets, and finally build the shards by merging the most disparate subsets together. One important feature of this approach is that there is flexibility in how to measure the discrepancy, and thus in high-dimensional settings you can tune it to the features of the posterior that are most relevant.
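The three steps above can be sketched as a greedy merging loop. This is only an illustrative reading of the description, not the group's actual algorithm: the function name `heterophilic_merge` is hypothetical, each subset's posterior is summarised by its sample mean (the posterior mean under a flat prior in a simple normal model), and the discrepancy is taken to be the absolute difference of those means. Any other pairwise discrepancy could be swapped in.

```python
import numpy as np

def heterophilic_merge(subsets, n_shards):
    """Greedily merge small subsets into n_shards shards.

    Assumptions (illustrative only): the posterior for each group is
    summarised by its sample mean, and the discrepancy between two
    groups is the absolute difference of their means. At each step the
    two most disparate groups are merged.
    """
    groups = [np.asarray(s, dtype=float) for s in subsets]
    while len(groups) > max(n_shards, 1):
        means = np.array([g.mean() for g in groups])
        disc = np.abs(means[:, None] - means[None, :])
        # Exclude self-pairs so two distinct groups are always chosen.
        np.fill_diagonal(disc, -np.inf)
        i, j = np.unravel_index(np.argmax(disc), disc.shape)
        merged = np.concatenate([groups[i], groups[j]])
        groups = [g for k, g in enumerate(groups) if k not in (i, j)]
        groups.append(merged)
    return groups

# Four small subsets with clearly different means, merged into two shards.
small_subsets = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
hemp_shards = heterophilic_merge(small_subsets, n_shards=2)
shard_means = [s.mean() for s in hemp_shards]
print(shard_means)
```

In this toy case the two most extreme subsets are paired first and the two middle subsets second, so both resulting shards have mean 1.5. A greedy loop like this does not guarantee equal shard sizes, which a practical scheme would probably need to handle.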

The proposed procedure, called HEterophilic Merging Partition (HEMP), is illustrated on a simple example below.

This shows that the HEMP strategy creates shards of data whose posterior means and variances are more similar than those obtained by randomly allocating data to shards, and that this leads to some improvement in the accuracy of the posterior approximation when the shards are combined.
