{"id":92,"date":"2023-01-09T15:51:24","date_gmt":"2023-01-09T15:51:24","guid":{"rendered":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/adam-page\/?page_id=92"},"modified":"2025-07-21T12:47:26","modified_gmt":"2025-07-21T12:47:26","slug":"phd","status":"publish","type":"page","link":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/adam-page\/phd\/","title":{"rendered":"PhD"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\">Technical Description<\/h2>\n\n\n\n<p class=\"has-black-color has-text-color has-medium-font-size\" style=\"font-style:normal;font-weight:100\">This project uses Reinforcement Learning (RL) algorithms, such as Q-learning, to dynamically price perishable goods over a finite planning horizon with limited supply. We develop algorithms that keep an RL agent safe during its exploration phase. These safeguards prevent the agent from taking unsafe or fatal actions without compromising its convergence to the optimal policy.<\/p>\n\n\n\n<h2 class=\"wp-block-heading has-black-color has-text-color has-medium-font-size\">Outputs<\/h2>\n\n\n\n<div class=\"wp-block-file\"><a id=\"wp-block-file--media-d19b22c8-5222-4588-8782-91986d3f5e24\" href=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/adam-page\/wp-content\/uploads\/sites\/48\/2025\/01\/INFORMS2024-1.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Tabular Reinforcement Learning for Revenue Management Problems &#8211; Adam Page &#8211; INFORMS2024 Annual Meeting<\/a><a href=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/adam-page\/wp-content\/uploads\/sites\/48\/2025\/01\/INFORMS2024-1.pdf\" class=\"wp-block-file__button wp-element-button\" download aria-describedby=\"wp-block-file--media-d19b22c8-5222-4588-8782-91986d3f5e24\">Download<\/a><\/div>\n\n\n\n<div class=\"wp-block-file\"><a id=\"wp-block-file--media-8baea81f-d899-465e-8e47-50945414ebf4\" 
href=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/adam-page\/wp-content\/uploads\/sites\/48\/2025\/07\/IMA_OR_Conference_Poster_2025-1.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Smooth Tabular Reinforcement Learning for Dynamic Pricing &#8211; 5th IMA-OR 2025 Conference Runner Up Poster <\/a><a href=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/adam-page\/wp-content\/uploads\/sites\/48\/2025\/07\/IMA_OR_Conference_Poster_2025-1.pdf\" class=\"wp-block-file__button wp-element-button\" download aria-describedby=\"wp-block-file--media-8baea81f-d899-465e-8e47-50945414ebf4\">Download<\/a><\/div>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Technical Description Using Reinforcement Learning (RL) algorithms, such as Q-learning, to dynamically price perishable goods over a finite planning horizon with limited supply. We develop algorithms for ensuring safety during the exploration phase of an RL agent. The safety feature will ensure the RL algorithm does not take unsafe or fatal actions, without compromising the&hellip; <br \/> <a class=\"read-more\" href=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/adam-page\/phd\/\">Read 
more<\/a><\/p>\n","protected":false},"author":57,"featured_media":165,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-92","page","type-page","status-publish","has-post-thumbnail","hentry"],"_links":{"self":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/adam-page\/wp-json\/wp\/v2\/pages\/92","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/adam-page\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/adam-page\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/adam-page\/wp-json\/wp\/v2\/users\/57"}],"replies":[{"embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/adam-page\/wp-json\/wp\/v2\/comments?post=92"}],"version-history":[{"count":9,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/adam-page\/wp-json\/wp\/v2\/pages\/92\/revisions"}],"predecessor-version":[{"id":236,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/adam-page\/wp-json\/wp\/v2\/pages\/92\/revisions\/236"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/adam-page\/wp-json\/wp\/v2\/media\/165"}],"wp:attachment":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/adam-page\/wp-json\/wp\/v2\/media?parent=92"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}