Imagine a gambler playing a game repeatedly (albeit a finite number of times, say ), accumulating or losing some winnings each time. When a game has finished, the gambler is able to stop and consider how much money they have and must make a decision; do they cash out and take home what they have, or do they play with fire, risking their winnings in a bid to take home more after the next game? The goal, of course, is to maximise your potential winnings. Solving this problem, however, isn’t quite as simple as defining it.
Let us define to be the gambler’s expected winnings before they begin playing the game. They are not stupid, however; they have a strategy! This strategy will rely on the expected reward of the game in general, and they gambler can use their knowledge of the current play of the game and previous plays in order to decide on their next move (cashing out or continuing to play). If the gambler is successful in the use of their strategy, then you may expect them to take home some decent winnings – our goal is to quantify how good their strategy was.
Let’s now imagine another player of this game playing side-by-side with the gambler. The gambler looks over and notices the other player’s winnings are higher than theirs! They don’t know how this is possible since they are so confident their strategy is the best… the obvious conclusion is that the other player is cheating. (Yeah, that must be it…) The gambler concludes this cheater is a “prophet”; they can look ahead into the game and know exactly how to win every single time, thereby taking home the maximum winnings each time.
We will now denote the cheater’s expected winnings to be . The gambler knows they cannot beat this cheater, but they are interested in comparing their performance to the cheater to see how well their strategy holds up. (Note that this seems much fairer than simply comparing the gambler’s expected winnings to the average since they are so confident their strategy is the best!) By analysing several rounds of the winnings, we might be able to conclude that the prophet wins, on average, no more than k times than the gambler wins (so ) – this is an example of a prophet inequality. In fact, we find that is the best possible bound; the cheater wins no more than twice that of the gambler using their strategy – good and surprising news for the gambler!