Friday, February 23, 2018 - 9:30am to 10:30am
Location: Traffic21 Classroom 6501, Gates Hillman Centers
Speaker: ZHAOHAN DANIEL GUO, Ph.D. Student http://www.cs.cmu.edu/~zguo/
This thesis proposes using more sophisticated exploration techniques to construct new sample-efficient algorithms and advance theory for more practical reinforcement learning settings, as well as to adapt theoretically efficient exploration techniques to practical algorithms and the deep reinforcement learning setting. One proposed technique, directed exploration, explicitly performs exploration toward specific goals, accumulating information that narrows down the space of possible unknown parameters. Directed exploration can improve sample complexity in a variety of more practical settings: when solving multiple tasks either concurrently or sequentially, algorithms can explore distinguishing state-action pairs to cluster similar tasks together and share samples to speed up learning; in large, factored MDPs, repeatedly trying to visit lesser-known state-action pairs can reveal whether the current dynamics model is faulty and which features are unnecessary. Other techniques, such as using data-dependent confidence intervals as a form of tempered optimism combined with explicit exploration aimed at gathering information about the value gap between actions, may yield better empirical performance as well as progress toward tighter, problem-dependent bounds. Finally, these exploration techniques can be adapted to the deep reinforcement learning setting by reducing it to the small, discrete setting: deep learning provides a state abstraction, and the resulting state representation is discretized.
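To make the idea of confidence intervals as a form of optimism concrete, here is a minimal, self-contained sketch of the classic UCB1 rule in a two-armed bandit. This is an illustrative textbook example, not the algorithm proposed in the thesis; the function names and reward setup are hypothetical.

```python
import math
import random

def ucb1(reward_fn, n_arms, horizon, seed=0):
    """Pull arms for `horizon` steps, choosing optimistically:
    empirical mean plus a confidence radius that shrinks as an
    arm's pull count grows, so poorly-known arms get explored."""
    rng = random.Random(seed)
    counts = [0] * n_arms    # pulls per arm
    means = [0.0] * n_arms   # empirical mean reward per arm
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # pull each arm once to initialize
        else:
            # optimism in the face of uncertainty: upper confidence bound
            arm = max(range(n_arms),
                      key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]))
        r = reward_fn(arm, rng)
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]  # running mean update
    return counts, means
```

With two Bernoulli arms of success probabilities 0.9 and 0.1, the confidence radius drives early exploration of both arms, after which pulls concentrate on the better arm while the worse arm is still sampled occasionally, at a logarithmic rate.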
Emma Brunskill (Chair)
Remi Munos (Google DeepMind)