Killing the Case for Randomization in Dynamic Assortment Optimization
Abstract
One of the traditional approaches for constructing approximate policies for dynamic assortment optimization problems is to use sampling-based inventory-agnostic policies.
Such policies are called sampling-based, as they sample an assortment of products from a fixed distribution at each time period to offer to a customer of each type.
Such policies are called inventory-agnostic, as the sampled assortments may include products without remaining inventories, so if a customer chooses a product without remaining inventories, then she leaves without a purchase.
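The two definitions above can be made concrete with a small simulation. The sketch below is only an illustration under stated assumptions, not the paper's model: it assumes a single customer type, a multinomial logit (MNL) choice model with no-purchase weight 1, and hypothetical names (`mnl_choice`, `simulate_policy`, `weights`, `prices`) that do not come from the source.

```python
import random


def mnl_choice(assortment, weights, rng):
    """Pick a product from `assortment` under an assumed MNL model.

    The no-purchase option has weight 1. Returns None for no purchase.
    """
    total = 1.0 + sum(weights[j] for j in assortment)
    u = rng.random() * total
    acc = 0.0
    for j in assortment:
        acc += weights[j]
        if u < acc:
            return j
    return None


def simulate_policy(distribution, weights, prices, inventory, horizon, rng):
    """Simulate one sample path of a sampling-based inventory-agnostic policy.

    `distribution` is a list of (assortment, probability) pairs that stays
    fixed over time (sampling-based). The sampled assortment is offered as
    is, even if it contains stocked-out products (inventory-agnostic).
    """
    stock = dict(inventory)
    assortments = [a for a, _ in distribution]
    probs = [p for _, p in distribution]
    revenue = 0.0
    for _ in range(horizon):
        offered = rng.choices(assortments, weights=probs, k=1)[0]
        choice = mnl_choice(offered, weights, rng)
        # A customer who chooses a stocked-out product leaves empty-handed.
        if choice is not None and stock[choice] > 0:
            stock[choice] -= 1
            revenue += prices[choice]
    return revenue, stock


if __name__ == "__main__":
    rng = random.Random(0)
    dist = [(("a", "b"), 0.5), (("a",), 0.5)]
    rev, left = simulate_policy(
        dist, {"a": 1.0, "b": 2.0}, {"a": 10.0, "b": 5.0},
        {"a": 2, "b": 1}, horizon=20, rng=rng,
    )
    print(rev, left)
```

Running the simulation repeatedly with different seeds gives different revenues even for fixed inputs; part of that spread comes from the assortment sampling itself, which is the extra uncertainty the abstract refers to.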
The inventory-agnostic nature of a policy is not a concern, because it is known that if the policy samples an assortment that includes products without remaining inventories, then dropping those products does not degrade the performance.
However, the sampling-based nature of a policy is a concern, because sampling introduces an additional source of uncertainty in the performance.
In this paper, we give an algorithm to de-randomize any sampling-based inventory-agnostic policy, so that the de-randomized policy offers a deterministic sequence of assortments within the support of the original policy without degrading the performance.
Furthermore, we give a variation of our de-randomization algorithm that searches for a deterministic sequence of assortments beyond the support of the original policy.
We show that we can implement the latter variation efficiently as long as we can solve the static assortment optimization problem under the choice model governing the choice process of the customers.
As our main technical contribution, we study locally-optimal deterministic policies, where changing any single one of the assortments in the policy does not improve the total expected revenue.
We show that any locally-optimal policy has a performance guarantee of 1/2 - epsilon when compared with the best sampling-based policy.
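The definition of local optimality can be illustrated by brute force on a tiny instance. The sketch below is not the paper's algorithm and says nothing about the 1/2 - epsilon guarantee. It assumes a single customer type, an MNL choice model with no-purchase weight 1, and inventory-agnostic offering; it also enumerates every subset of products, which is only feasible for a handful of products. All function names are hypothetical.

```python
from itertools import combinations


def expected_revenue(sequence, weights, prices, inventory):
    """Exact expected revenue of a deterministic assortment sequence.

    Tracks the probability distribution over inventory states period by
    period (assumed MNL choice, inventory-agnostic offering).
    """
    products = sorted(inventory)
    idx = {j: i for i, j in enumerate(products)}
    dist = {tuple(inventory[j] for j in products): 1.0}
    total = 0.0
    for offered in sequence:
        offered = sorted(offered)
        denom = 1.0 + sum(weights[j] for j in offered)
        nxt = {}
        for state, prob in dist.items():
            stay = prob  # probability mass that leaves inventory unchanged
            for j in offered:
                if state[idx[j]] > 0:
                    sale = prob * weights[j] / denom
                    new_state = list(state)
                    new_state[idx[j]] -= 1
                    key = tuple(new_state)
                    nxt[key] = nxt.get(key, 0.0) + sale
                    total += sale * prices[j]
                    stay -= sale
            nxt[state] = nxt.get(state, 0.0) + stay
        dist = nxt
    return total


def all_assortments(products):
    """Every subset of `products`, including the empty assortment."""
    return [frozenset(c) for r in range(len(products) + 1)
            for c in combinations(products, r)]


def is_locally_optimal(sequence, weights, prices, inventory, tol=1e-9):
    """True if no single-period assortment swap improves expected revenue."""
    base = expected_revenue(sequence, weights, prices, inventory)
    for t in range(len(sequence)):
        for cand in all_assortments(sorted(inventory)):
            trial = list(sequence)
            trial[t] = cand
            if expected_revenue(trial, weights, prices, inventory) > base + tol:
                return False
    return True


def local_search(sequence, weights, prices, inventory, tol=1e-9):
    """Apply improving single-period swaps until none remains."""
    current = list(sequence)
    best = expected_revenue(current, weights, prices, inventory)
    improved = True
    while improved:
        improved = False
        for t in range(len(current)):
            for cand in all_assortments(sorted(inventory)):
                trial = list(current)
                trial[t] = cand
                value = expected_revenue(trial, weights, prices, inventory)
                if value > best + tol:
                    current, best, improved = trial, value, True
    return current
```

For example, with one product of price 10, weight 1, and one unit of inventory over two periods, offering the product in both periods yields an expected revenue of 5 + 0.5 * 5 = 7.5, and no single-period swap improves on it.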