ABSTRACT
Beginning at least as early as the 1950s, the long and still growing literature on no-regret learning establishes the following type of result: on any sequence of T trials in which the predictions of K experts are observed, it is possible to maintain a dynamically weighted prediction whose cumulative regret to the best single expert in *hindsight* (that is, after the full sequence has been revealed) is only O(\sqrt{T}), with absolutely no statistical assumptions on the data. The algorithms achieving such remarkable guarantees generally reward winning experts and punish losing experts aggressively, often via multiplicative updating. Is there any price paid for such aggressive updating?

In this work we simultaneously analyze an algorithm's regret to the best expert and its regret to a more modest benchmark: the *average* performance of the experts, which can be matched with no learning at all by simply leaving the weights uniform. It would be natural to hope for algorithms that simultaneously achieve regret O(\sqrt{T}) to the best and little or no regret to the average, thus providing a kind of "safety net" on top of the traditional bounds. We describe the following striking sequence of results:

* Any algorithm that achieves only O(\sqrt{T}) cumulative regret to the best expert must, in the worst case, suffer regret \Omega(\sqrt{T}) to the average.
* For a wide class of algorithms, including standard multiplicative update rules, the product of the cumulative regret to the best and the cumulative regret to the average is \Omega(T), and this bound can be achieved by multiplicative updates.
* We give a new algorithm that, for any small constant c > 0, achieves regret O(T^{1/2 + c}) to the best expert while suffering only *constant* regret to the average (that is, with no dependence on T). This algorithm works in phases that begin with rather conservative update rules but gradually become more aggressive as the performance of the best expert pulls away from the crowd.
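To make the two benchmarks concrete, here is a minimal sketch (not the paper's algorithm) of the standard multiplicative-weights rule, often called Hedge, run on a hypothetical loss matrix. The experiment setup (T, K, the random losses, and the learning-rate tuning eta = \sqrt{8 ln K / T}) is illustrative, not taken from the talk; it simply reports the algorithm's cumulative loss against the best expert and against the unweighted average of the experts.

```python
import math
import random

def hedge(losses, eta):
    """Run the multiplicative-weights (Hedge) rule on a T x K loss matrix.

    Each round the prediction is the weighted average of the experts; each
    expert's weight is then multiplied by exp(-eta * its loss), so winners
    are rewarded aggressively. Returns the algorithm's cumulative loss.
    """
    K = len(losses[0])
    w = [1.0] * K
    total = 0.0
    for round_losses in losses:
        s = sum(w)
        p = [wi / s for wi in w]                       # normalized weights
        total += sum(pi * li for pi, li in zip(p, round_losses))
        w = [wi * math.exp(-eta * li) for wi, li in zip(w, round_losses)]
    return total

# Hypothetical experiment: K experts, T rounds of random losses in [0, 1].
random.seed(0)
T, K = 1000, 2
losses = [[random.random() for _ in range(K)] for _ in range(T)]

eta = math.sqrt(8 * math.log(K) / T)                   # standard tuning
alg = hedge(losses, eta)
per_expert = [sum(row[k] for row in losses) for k in range(K)]
best, avg = min(per_expert), sum(per_expert) / K

print(f"regret to best:    {alg - best:.2f}")          # O(sqrt(T)) guaranteed
print(f"regret to average: {alg - avg:.2f}")
```

With this tuning the classical guarantee bounds the regret to the best by \sqrt{(T/2) ln K} on *any* loss sequence; the abstract's lower bounds say that no such aggressively tuned rule can simultaneously keep the regret to the average near zero in the worst case.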
This talk describes joint work with Eyal Even-Dar, Yishay Mansour, and Jenn Wortman.