

Invited Talk
in
Workshop: Advances in Financial AI: Opportunities, Innovations, and Responsible AI

What Distributional Reinforcement Learning is Learning?

Chi Seng Pun


Abstract:

Distributional reinforcement learning (RL) has emerged as a powerful tool for modeling risk-sensitive sequential decisions, where using distribution functions in place of scalar value functions allows for the flexible incorporation of risk measures. However, because many risk measures are inherently time-inconsistent (TIC) when used in sequential decision making, the nature of the controls produced by distributional RL has remained a mystery. To support its use in risk-sensitive problems in mathematical finance, this paper seeks to fill the research gap by building on the cumulative prospect theory (CPT)-based analysis of human gambling behavior and the emergence of three policy classes under TIC: precommitment, equilibrium, and dynamically optimal. We focus on the prevailing quantile-based distributional RL (QDRL) for CPT risk measures. Our theoretical results extend some results from the risk-insensitive QDRL theory to CPT prediction, from which we derive a characterization of QDRL control as an approximate equilibrium of an intrapersonal game. We empirically demonstrate the efficacy of our CPT QDRL algorithm in approaching the equilibrium. Finally, by further exploring the economic interpretation of the three policy classes in how they handle TIC, we devise metrics and instances that give rise to interesting patterns of interaction between these policies, including when and how the equilibrium may be more desirable than the precommitment.
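As a rough illustration of what a "CPT risk measure" applied to a quantile representation can look like, the sketch below evaluates a CPT value from N equally weighted quantile atoms of a return distribution. It is not the algorithm from the talk: the abstract does not specify functional forms, so the Tversky-Kahneman (1992) power utilities, inverse-S probability weighting, default parameter values, the zero reference point, and the names `tk_weight` and `cpt_value` are all assumptions made for this example.

```python
import numpy as np


def tk_weight(p, gamma):
    """Tversky-Kahneman inverse-S probability weighting (assumed form)."""
    p = np.asarray(p, dtype=float)
    return p**gamma / (p**gamma + (1.0 - p) ** gamma) ** (1.0 / gamma)


def cpt_value(atoms, alpha=0.88, beta=0.88, lam=2.25, gamma=0.61, delta=0.69):
    """CPT value of a distribution given as equally weighted quantile atoms.

    Outcomes are measured against a reference point of 0 (an assumption).
    Gains are weighted through upper-tail probabilities, losses through
    lower-tail probabilities, as in rank-dependent CPT.
    """
    x = np.sort(np.asarray(atoms, dtype=float))
    n = x.size
    k = np.arange(n)

    # Decision weight of atom k if it is a gain: w+(P(X >= x_k)) - w+(P(X > x_k)).
    w_gain = tk_weight((n - k) / n, gamma) - tk_weight((n - k - 1) / n, gamma)
    # Decision weight of atom k if it is a loss: w-(P(X <= x_k)) - w-(P(X < x_k)).
    w_loss = tk_weight((k + 1) / n, delta) - tk_weight(k / n, delta)

    # Power utilities; lam > 1 encodes loss aversion.
    gains = np.where(x >= 0, x, 0.0) ** alpha
    losses = lam * np.where(x < 0, -x, 0.0) ** beta

    return float(np.sum(w_gain * gains) - np.sum(w_loss * losses))


# Hypothetical quantile estimates for one state-action pair.
example_atoms = [-2.0, -0.5, 0.3, 1.0, 4.0]
example_cpt = cpt_value(example_atoms)
```

Setting all five parameters to 1 removes the distortion and loss aversion, so the function reduces to the plain mean of the atoms; that is a convenient sanity check for the sketch.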

Patrick Pun is currently a tenured Associate Professor, Assistant Chair (MSc Programmes), and the Programme Director of the Master of Science in Financial Technology at the School of Physical and Mathematical Sciences, Nanyang Technological University (NTU), Singapore. Prior to joining NTU, Patrick obtained his Ph.D. in Statistics at the Chinese University of Hong Kong in 2016. His Ph.D. thesis won numerous awards, including the Nicola Bruti Liberati Prize 2016 and the Young Scholars Thesis Award 2016. His research paper on high-dimensional portfolio selection won Best Student Research Paper (First Place) in INFORMS Financial Section in 2015. Patrick has strong research interests in Financial / Actuarial Mathematics, Big Data Analytics, and AI applications in Finance, as evidenced by his numerous top-tier publications in these fields.
