Computer Science Thesis Oral

Wednesday, December 6, 2017 - 1:00pm to 2:30pm


Traffic21 Classroom 6501 Gates Hillman Centers



Distributed Learning in Referral Networks

Human experts as autonomous agents in a referral network must decide whether to accept a task or refer to a more appropriate expert, and if so to whom. In order for the referral network to improve over time, the experts must learn to estimate the topical expertise of other experts. This thesis extends concepts from Multi-agent Reinforcement Learning and Active Learning to referral networks. Among a wide array of algorithms evaluated, Distributed Interval Estimation Learning (DIEL), based on Interval Estimation Learning, was found to be promising for learning appropriate referral choices, compared to Greedy, Q-learning, Thompson Sampling and Upper Confidence Bound (UCB) methods. DIEL's rapid performance gain in the early phase of learning makes it a practically viable algorithm, including when multiple referral hops are allowed. In addition to a synthetic data set, we compare the performance of several top-performing referral algorithms on a referral network of high-performance Stochastic Local Search (SLS) solvers for the propositional satisfiability problem (SAT). Our experimental results demonstrate that the referral learning algorithms can learn appropriate referral choices in the real task of solving satisfiability problems where expertise does not obey any known parameterized distribution. Apart from evaluating overall network performance, we conduct a robustness analysis across the learning algorithms, with an emphasis on capacity constraints (limits on number of tasks per time period), evolving networks (changes in connectivity or agents joining or leaving the referral network) and expertise drift (skills improving over time or atrophying through disuse) --- situations that often arise in real-world scenarios but are largely ignored in the Active Learning literature. Several high-performance referral learning algorithms proved to be robust to capacity constraints and evolving networks, while Hybrid, a novel combination of multiple algorithms, proved the most resilient to expertise drift. In an augmented learning setting, where experts may report their top skills to their colleagues, we proposed three algorithms, proactive-DIEL, proactive-Q-Learning, and proactive-$\epsilon$-Greedy. All proactive algorithms exhibited robustness to noisy self-skill estimates, evolving networks and strategic misreporting.

Thesis Committee:
Jaime Carbonell (Chair)
Manuel Blum
Manuela Veloso
Victor Lesser (University of Massachusetts, Amherst)

For More Information, Contact:


Thesis Oral