In the sequential decision-making task faced by a website when choosing which adverts to display, each decision is both an opportunity to receive revenue and an opportunity to gain information that may help to earn reward in the future.
Short-term performance must be balanced against long-term performance: the decision-maker must exploit what is already known, but must also forgo some short-term reward to explore the environment and discover information that could be useful in the future.
This problem is well studied when only one advert is selected on each display of the website. In practice, however, the system must usually select several adverts simultaneously, and the usefulness of each advert is affected by which others are displayed.
This project will develop a methodology to directly address the exploration-exploitation dilemma with interacting simultaneous actions. We will use a game-theoretical control methodology, in which each of the parallel actions is controlled by an independent player of a game. By careful design of the utility functions of the game, independent play by the individual players will result in desirable system-level behaviour.
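As a rough illustration of the idea, and not the project's actual method, the sketch below gives each advert slot to one player and gives every player the same utility: the expected clicks of the whole page. The advert names, click rates, and interaction penalty are hypothetical, and the rates are assumed to be known, so the exploration side of the problem is left out. Players take turns switching to the advert that most improves the shared utility until no one can improve alone.

```python
from itertools import combinations

# Hypothetical click rates for four adverts (illustrative numbers only).
BASE_RATE = {"A": 0.30, "B": 0.28, "C": 0.20, "D": 0.10}
# Hypothetical interaction: A and B are similar, so showing both loses clicks.
PAIR_PENALTY = {frozenset(("A", "B")): 0.25}


def global_utility(selection):
    """Expected clicks for one advert per slot; a repeated advert counts once."""
    shown = sorted(set(selection))
    total = sum(BASE_RATE[a] for a in shown)
    for pair in combinations(shown, 2):
        total -= PAIR_PENALTY.get(frozenset(pair), 0.0)
    return total


def best_response_play(selection, max_rounds=100):
    """Players take turns switching to the advert that maximises the shared utility."""
    selection = list(selection)
    for _ in range(max_rounds):
        changed = False
        for slot in range(len(selection)):
            def utility_if(advert):
                # Page utility if this slot's player alone switched to `advert`.
                trial = selection[:slot] + [advert] + selection[slot + 1:]
                return global_utility(trial)

            best = max(sorted(BASE_RATE), key=utility_if)
            # Switch only on a strict improvement, so play cannot cycle.
            if utility_if(best) > global_utility(selection):
                selection[slot] = best
                changed = True
        if not changed:
            break  # no player can improve alone: a Nash equilibrium
    return tuple(selection)


final = best_response_play(("A", "A"))
print(final, round(global_utility(final), 2))
```

With these numbers, play started from two copies of advert A settles on showing C and A together, which avoids the A and B clash. Because every player shares one utility, each switch raises the page-level value, but in general such play is only guaranteed to reach a point where no single player can improve, not the best possible page.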
The project title is "Game-theoretic control to select website elements", and the Faculty Research Award will support STOR-i PhD student James Edwards to work with David for one year.