Risk-sensitive partially observable Markov decision processes as fully observable multivariate utility optimization problems
Authors
- Afsardeir, Arsham
- Kapetanis, Andreas
- Laschos, Vaios
ORCID: 0000-0001-8721-5335
- Obermayer, Klaus
2020 Mathematics Subject Classification
- 93E20
Keywords
- Markov decision processes, partial observability, risk sensitivity, utility function, sums of exponentials
DOI
Abstract
We provide a new algorithm for solving risk-sensitive partially observable Markov decision processes, when the risk is modeled by a utility function and both the state space and the space of observations are finite. The algorithm is based on the observation that the change of measure and the subsequent introduction of the information space, which are used for exponential utility functions, can in fact be extended to sums of exponentials if one introduces an extra vector parameter that tracks the expected accumulated cost corresponding to each exponential. Since every increasing function can be approximated by sums of exponentials on finite intervals, the method can essentially be applied to any utility function, with its complexity depending on the number
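The approximation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the fixed rate grid `rates`, the choice of `tanh` as an example increasing utility, the interval [-1, 1], and the helper names `fit_sum_of_exponentials` and `evaluate` are all assumptions made here for illustration. With the rates held fixed, the weights of the sum of exponentials follow from an ordinary linear least-squares fit.

```python
import numpy as np


def fit_sum_of_exponentials(utility, lo, hi, rates, n_grid=200):
    """Least-squares weights w_i so that sum_i w_i * exp(rates[i] * x)
    approximates utility(x) on the interval [lo, hi].

    The rates are fixed in advance (an assumption of this sketch), which
    makes the fit linear in the weights.
    """
    x = np.linspace(lo, hi, n_grid)
    design = np.exp(np.outer(x, rates))  # shape (n_grid, len(rates))
    weights, *_ = np.linalg.lstsq(design, utility(x), rcond=None)
    return weights


def evaluate(weights, rates, x):
    """Evaluate the fitted sum of exponentials at the points x."""
    x = np.atleast_1d(x)
    return np.exp(np.outer(x, rates)) @ weights


# Hypothetical rate grid and example utility; neither comes from the paper.
rates = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
weights = fit_sum_of_exponentials(np.tanh, -1.0, 1.0, rates)

# Largest absolute deviation from the utility on the fitting interval.
grid = np.linspace(-1.0, 1.0, 200)
max_error = np.max(np.abs(evaluate(weights, rates, grid) - np.tanh(grid)))
print("weights:", weights)
print("max abs error on [-1, 1]:", max_error)
```

In this sketch, each fitted exponential term would correspond to one component of the extra vector parameter described in the abstract, so the number of rates controls the trade-off between approximation quality and problem size.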