[Liste-proml] 2nd CfP - 2014 NIPS Workshop "From Bad Models to Good Policies" (Sequential Decision Making under Uncertainty)

Odalric-Ambrym Maillard odalric-ambrym.maillard at ens-cachan.org
Sun 28 Sep 17:52:59 CEST 2014

2014 NIPS Workshop "From Bad Models to Good Policies" (Sequential Decision
Making under Uncertainty),  Montreal, Canada
Website:  https://sites.google.com/site/badmodelssdmuworkshop2014/
[We apologize if you receive multiple copies of this email]

Due to special requests, the submission deadline has been postponed to

*Thursday, October 9, 2014 at 11:59 pm PST.*

This workshop aims to gather researchers in the area of sequential decision
making to discuss recent findings and new challenges around the concept of
model misspecification. A misspecified model is one for which (1) the model
cannot be tractably solved, (2) solving the model does not produce an
acceptable solution for the target problem, or (3) the model clearly does
not describe the available data perfectly. However, even though the model
has its issues, we are interested in finding a good policy. The question is
thus: How can misspecified models be made to lead to good policies?
We refer to the following (non-exhaustive) types of misspecification.

   1. States and Context. A misspecified state representation relates to
   research problems such as Hidden Markov Models, Predictive State
   Representations, Feature Reinforcement Learning, Partially Observable
   Markov Decision Problems, etc. The related question of misspecified context
   in contextual bandits is also relevant.
   2. Dynamics. Consider learning a policy for a class of several MDPs
   rather than a single MDP, or optimizing a risk-averse (as opposed to
   expected) objective. These approaches can be used to derive reasonable
   policies even from a misspecified model. Thus, robustness, safety, and
   risk-aversion are examples of relevant approaches to this question.
   3. Actions. The underlying insight of working with high-level actions
   built on top of lower-level actions is that if we had the right high-level
   actions, we would have faster learning/planning. However, finding an
   appropriate set of high-level actions can be difficult. One form of model
   misspecification occurs when the given high-level actions cannot be
   combined to derive an acceptable policy.

More generally, since misspecification may slow learning or prevent an
algorithm from finding any acceptable solution, improving the efficiency of
planning and learning methods under misspecification is of primary
importance. At another level, all these challenges can benefit greatly from
the identification of finer properties of MDPs (local recoverability, etc.)
and better notions of complexity. These questions are deeply rooted in
theory and in recent applications in fields as diverse as air-traffic control,
marketing, and robotics. We thus also want to encourage presentations of
challenges that provide a guiding thread and agenda for future research, or a
survey of the current achievements and difficulties. This includes concrete
problems such as energy management, smart grids, computational sustainability,
and recommender systems.

We welcome contributions on these exciting questions, with the goals of (1)
helping close the gap between strong theoretical guarantees and challenging
application requirements, (2) identifying promising directions for
near-future research, for both applications and theory of sequential decision
making, and (3) triggering collaborations amongst researchers on learning
good policies despite being given misspecified models.

Motivation, objectives
Despite the success of sequential decision making theory at providing
solutions to challenging settings, the field faces a limitation. Often,
strong theoretical guarantees depend on the assumption that a solution to
the class of models considered is a good solution to the target problem. A
popular example is that of finite-state MDP learning for which the model of
the state-space is assumed known. Such an assumption is, however, rarely met
in practice. Similarly, in recommender systems and contextual bandits, the
context may not capture an accurate summary of the users. Developing a
methodology for finding, estimating, and dealing with the limitations of
the model is paramount to the success of sequential decision processes.
Another example of model misspecification occurs in Hierarchical
Reinforcement Learning: In many real-world applications, we could solve the
problem easily if we had the right set of high-level actions. Instead, we
need to find a way to build those from a cruder set of primitive actions or
existing high-level actions that do not suit the current task.
Yet another practical challenge arises when we face a process that can only
be modeled as an MDP evolving in some class of MDPs, instead of a fixed
MDP, leading to robust reinforcement learning, or when we call for safety
or risk-averse guarantees.

These problems are important bottlenecks standing in the way of applying
sequential decision making to challenging applications, and motivate the
three goals of this workshop.

Relevance to the community

Misspecification of models (in the senses we consider here) is an important
problem that is faced in many – if not all – real-world applications of
sequential decision making under uncertainty. While theoretical results
have primarily focused on the case when models of the environment are
well-specified, little work has been done on extending the theory to the
case of misspecification. Understanding why and when
incorrectly specified models lead to good empirical performance beyond what
the current theory explains is also an important goal. We believe that this
workshop will be of great interest to both theoreticians and applied
researchers in the field.

Invited Speakers
Academic speakers:

   - [Confirmed] Thomas Dietterich, Oregon State University
   - [Tentative] Peter Grünwald, Centrum voor Wiskunde en Informatica
   - [Confirmed] Joelle Pineau, McGill University
   - [Tentative] Peter Stone, University of Texas at Austin
   - [Confirmed] Ronald Ortner, Montanuniversität Leoben
   - [Tentative] Raphael Fonteneau, University of Liège

Industry speakers:

   - [Confirmed] Georgios Theocharous, Adobe Research
   - [Confirmed] Esteban Arcaute, WalmartLabs
   - [Tentative] Dotan Di-castro, Yahoo! Research


Organizers:

   - Odalric-Ambrym Maillard, Senior Researcher, The Technion, Israel.
   - Timothy A. Mann, Senior Researcher, The Technion, Israel.
   - Shie Mannor, Professor, The Technion, Israel.
   - Jeremie Mary, Associate Professor, INRIA Lille - Nord Europe, France.
   - Laurent Orseau, Associate Professor, AgroParisTech/INRA, France.

Important Dates
Please refer to https://sites.google.com/site/badmodelssdmuworkshop2014/
for up-to-date information.

   - Workshop call for papers: August 20th, 2014.
   - Paper submission deadline: October 9th, 2014, 11:59 pm PST.
   - Notification of acceptance: October 23rd, 2014.
   - Camera-ready version: November 27th, 2014.
   - Date of the workshop (one day): December 12, 2014, Montreal.

A notification of submission will be sent to you. Submitted papers will be
evaluated by the PC members. Authors do not need to send an anonymous

If you have any questions about the workshop, please send an email to
nips2014-workshop-bad_models_to_good_policies at googlegroups.com
with a subject starting with "[Question]".

Odalric-Ambrym Maillard, Timothy A. Mann, Shie Mannor, Jeremie Mary,
and Laurent Orseau