Asymptotically Optimal Learning for Parametric Prophet Inequalities
이 뉴스, 어떠셨어요?
한 번의 탭으로 반응을 남겨요 · 로그인 불필요
Abstract
We study learning in prophet inequalities with i.i.d. rewards drawn from an exponential-type parametric family with an unknown parameter $\theta$, a class that includes exponential, Pareto, and bounded-support power-family distributions.
We first characterize the optimal full-information asymptotic competitive ratio for this family.
In the unbounded-support case, the limit is $ {\left({\theta}/({\theta-c_+})\right)^{c_+/\theta}}/ {\Gamma(1-c_+/\theta)},$ while in the bounded-support case, the limit is $1$.
We then propose a confidence-based dynamic-programming policy for online learning.
By exploiting the explicit parametric structure, the policy achieves the same optimal asymptotic competitive ratio using only online observations, without external offline samples.
We further derive distribution-specific convergence rates for canonical examples.
Finally, numerical experiments on synthetic instances illustrate the performance of our algorithm.