

I'm interested in the theory of how information is (algorithmically) gathered, aggregated, and used to make predictions or decisions. Much of my research situates AI, theoretical CS, or machine-learning problems in a societal context where information has privacy implications or is held by strategic agents who might misreport it.


Contents: Papers, Talks


Non-parametric Binary Regression in Metric Spaces with KL Loss (draft, 2020)
Ariel Avital, Klim Efremenko, Aryeh Kontorovich, David Toplin, Bo Waggoner
Suppose your machine-learning algorithm must predict, given x, the probability that an associated label y is 1. You're optimizing the log scoring rule, or KL-divergence from the true distributions. We give an optimization algorithm for doing so and some notes on sample complexity.
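
As a small illustration of the objective mentioned above (not the paper's algorithm), the sketch below checks numerically that the expected log loss of a prediction p exceeds that of the true probability q by exactly the KL divergence between them. The function names and the example values of q and p are hypothetical, chosen only for this example.

```python
import math

def log_loss(p, y):
    """Log scoring rule as a loss: -log of the probability assigned to label y."""
    return -math.log(p if y == 1 else 1.0 - p)

def expected_log_loss(p, q):
    """Expected log loss of predicting p when the label is 1 with probability q."""
    return q * log_loss(p, 1) + (1.0 - q) * log_loss(p, 0)

def kl_bernoulli(q, p):
    """KL divergence KL(Bernoulli(q) || Bernoulli(p)), for 0 < p, q < 1."""
    return q * math.log(q / p) + (1.0 - q) * math.log((1.0 - q) / (1.0 - p))

q, p = 0.7, 0.4  # hypothetical true probability and prediction
excess = expected_log_loss(p, q) - expected_log_loss(q, q)
print(excess, kl_bernoulli(q, p))  # the two quantities agree up to rounding
```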

Prophet Inequalities with Linear Correlations and Augmentations (EC, 2020)
Nicole Immorlica, Sahil Singla, Bo Waggoner.
In this optimal-stopping problem, a sequence of values arrives and our algorithm must decide when to stop the process and take the current value. In this paper the values are correlated: positive linear combinations of underlying independent variables.
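
For background, here is a minimal sketch of the classical setting with independent values (not the correlated setting of this paper). It uses the well-known rule of stopping at the first value that reaches half the expected maximum, which is known to earn at least half of what a "prophet" who sees all values would get. The instance and function names are hypothetical.

```python
from itertools import product

def expected_max(dists):
    """Exact E[max_i X_i] for independent discrete X_i given as [(value, prob), ...]."""
    total = 0.0
    for combo in product(*dists):
        prob = 1.0
        for _, p in combo:
            prob *= p
        total += prob * max(v for v, _ in combo)
    return total

def threshold_rule_value(dists, threshold):
    """Exact expected value of stopping at the first value >= threshold (0 if none)."""
    total = 0.0
    for combo in product(*dists):
        prob = 1.0
        for _, p in combo:
            prob *= p
        taken = next((v for v, _ in combo if v >= threshold), 0.0)
        total += prob * taken
    return total

# Hypothetical instance: a sure value of 1, then a rare large value.
dists = [[(1.0, 1.0)], [(10.0, 0.1), (0.0, 0.9)]]
prophet = expected_max(dists)
algo = threshold_rule_value(dists, prophet / 2.0)
print(prophet, algo)
```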

Preventing Arbitrage from Collusion When Eliciting Probabilities (AAAI 2020)
Rupert Freeman, David M. Pennock, Dominik Peters, Bo Waggoner.
Suppose a group all make predictions and are scored based on accuracy, e.g. with a scoring rule. Studies when they can collude to make untruthful predictions yet risklessly improve their payments.
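
One well-known instance of this phenomenon, sketched here under the assumption that everyone is paid by the quadratic (Brier-style) score: if all agents report the group's average prediction instead of their own, their total payment weakly increases whichever outcome occurs, and strictly increases when their predictions differ. The reports below are hypothetical and the sketch is not taken from the paper.

```python
def quadratic_score(p, y):
    """Quadratic score of predicting probability p when the outcome is y in {0, 1}."""
    return 1.0 - (p - y) ** 2

def collusion_gain(reports, y):
    """Change in the group's total score if everyone reports the group average instead."""
    avg = sum(reports) / len(reports)
    individual_total = sum(quadratic_score(p, y) for p in reports)
    colluding_total = len(reports) * quadratic_score(avg, y)
    return colluding_total - individual_total

reports = [0.2, 0.5, 0.9]  # hypothetical predictions
gains = {y: collusion_gain(reports, y) for y in (0, 1)}
print(gains)  # positive for both outcomes
```

The gain comes from the concavity of the score in the report, so the colluders can split the surplus without taking on any risk.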

Channel Auctions (Management Science, 2020)
Eduardo M. Azevedo, David M. Pennock, Bo Waggoner, E. Glen Weyl
Proposes a single-item auction incorporating both an ascending and a descending price simultaneously. When inspecting the item is costly but not required in order to acquire it, there are cases where this format is necessary to achieve optimal welfare.

Computing Equilibria of Prediction Markets via Persuasion (WINE 2019)
Jerry Anunrojwong, Yiling Chen, Bo Waggoner, and Haifeng Xu.
Building on [Kong+Schoenebeck 2018], considers algorithms for playing in a simple two-player prediction market. Shows a connection to Bayesian persuasion and efficient algorithms for new cases.

Toward a Characterization of Loss Functions for Distribution Learning (NeurIPS 2019)
Nika Haghtalab, Cameron Musco, Bo Waggoner
Considers loss functions (or "scoring rules") that evaluate a prediction in the form of a probability distribution. We define natural conditions for these losses of "strongly proper", "sample proper", and "concentrating". We show that these can be achieved if one also requires the prediction to be calibrated.

An Embedding Framework for Consistent Polyhedral Surrogates (NeurIPS 2019)
Jessie Finocchiaro, Rafael Frongillo, Bo Waggoner
In machine learning, we prefer loss functions defined over R^d, a nice convex continuous space. Given a problem with discrete alternatives, like classification or ranking, a common approach is to "embed" each alternative into R^d and define a "polyhedral" (i.e. piecewise linear, convex) surrogate loss function over this space. This work investigates this general approach.

Equal Opportunity in Online Classification with Partial Feedback (NeurIPS 2019)
Yahav Bechavod, Katrina Ligett, Aaron Roth, Bo Waggoner, Z. Steven Wu
An algorithm must classify a sequence of arrivals as plus or minus, but it will only observe the true label if it chooses plus. Under this partial feedback model, we construct bandit learning algorithms that get low regret while enforcing "fairness" constraints such as equalizing false positive rates across two groups.

Decentralized & Collaborative AI on Blockchain (IEEE Blockchain 2019)
Justin D. Harris, Bo Waggoner
Proposes collaboratively providing data for training machine learning models, which are open and accessible to all for free, via blockchains like Ethereum. Investigates some incentive structures for getting good data. See Justin's code at https://github.com/microsoft/0xDeCA10B

Matching Markets via Descending Price (draft, 2019)
Bo Waggoner, E. Glen Weyl
Proposes an auction-style mechanism for two-sided marketplaces, such as short-term labor or crowdsourcing. Participants place bids on their different possible matches, then adjust the bids as a global price descends, causing high-value pairs to match first.

Multi-Observation Regression (AISTATS 2019)
Rafael Frongillo, Nishant Mehta, Tom Morgan, Bo Waggoner
Regression involves learning some model that predicts a statistic from a set of features. This paper investigates algorithms that use multi-observation loss functions (see paper "Multi-Observation Elicitation") for regression on higher-order statistics such as the variance.

Local Differential Privacy for Evolving Data (NeurIPS 2018; Journal of Privacy and Confidentiality 2020)
Matthew Joseph, Aaron Roth, Jonathan Ullman, Bo Waggoner
We suppose people are reporting statistics or observations while adding randomness in order to preserve their privacy (this is the "local" model). How can we keep up to date if the underlying distribution of data is changing occasionally over time?
*Full version in Journal of Privacy and Confidentiality, Vol. 10 No. 1 (2020).
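
To make the "local" model concrete, here is a minimal sketch of classic randomized response for a single, static bit per person. It does not implement the paper's protocol for evolving data; the population, the privacy parameter, and the function names are hypothetical.

```python
import math
import random

def randomized_response(bit, epsilon, rng):
    """Report the true bit with probability e^eps / (1 + e^eps), else the flipped bit."""
    keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < keep else 1 - bit

def debiased_mean(reports, epsilon):
    """Unbiased estimate of the fraction of true 1s from the noisy reports."""
    keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - keep)) / (2.0 * keep - 1.0)

rng = random.Random(0)
true_bits = [1] * 6000 + [0] * 14000  # hypothetical population, 30% ones
reports = [randomized_response(b, 1.0, rng) for b in true_bits]
estimate = debiased_mean(reports, 1.0)
print(estimate)  # close to 0.3
```

Each person's report is randomized before it leaves their hands, so the aggregator never sees raw data, yet the population average can still be recovered approximately.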

A Smoothed Analysis of the Greedy Algorithm for the Linear Contextual Bandit Problem (NeurIPS 2018)
Sampath Kannan, Jamie Morgenstern, Aaron Roth, Bo Waggoner, Steven Wu
In this problem, each day an algorithm sees data about a set of choices and picks one, obtaining a random reward. It hopes to do well by eventually learning the (linear) relationships between data and average rewards. A short-sighted greedy approach is known to sometimes do very poorly; but we show that it actually performs very well when small amounts of randomness are present in the data.

Bounded-Loss Private Prediction Markets (NeurIPS 2018)
Rafael Frongillo, Bo Waggoner
We construct versions of prediction markets that preserve privacy of participants while also satisfying a fixed budget limit (getting around previous impossibility results).

Strategic Classification from Revealed Preferences (EC 2018)
Jinshuo Dong, Aaron Roth, Zachary Schutzman, Bo Waggoner, Steven Wu
Suppose you sequentially deploy a spam filter, see how the spammers react, improve the spam filter, and continue. How should you make changes, knowing that spammers will react to your "improvements" but not knowing their exact objectives? This paper formalizes a model for this process and investigates settings where efficient algorithms can be used.

Active Information Acquisition for Linear Optimization (UAI 2018)
Shuran Zheng, Bo Waggoner, Yang Liu, Yiling Chen
Imagine solving an optimization problem -- like finding a shortest path through a network -- without knowing some of the parameters (like edge lengths). You can acquire information about the parameters by drawing "samples" or estimates of the parameters. We consider algorithms for parsimoniously drawing samples and solving the optimization problem.

An Axiomatic Study of Scoring Rule Markets (ITCS 2018)
Rafael Frongillo, Bo Waggoner
Prediction markets are well-studied for predicting the probability of an event, or the expected value of a future variable; but little is known about predicting other statistics such as the median or mode. We investigate prediction markets for general statistics and what useful properties these markets can have.

Accuracy First: Selecting a Differential Privacy Level for Accuracy-Constrained ERM (NeurIPS 2017; Journal of Privacy and Confidentiality 2019)
Katrina Ligett, Seth Neel, Aaron Roth, Bo Waggoner, Z. Steven Wu
(CODE) Traditionally in differentially private statistics, one fixes a required privacy level and attempts to achieve good accuracy. We consider a setting where a given accuracy level is required, and we wish to guarantee as much privacy as possible given that requirement.
*Full version in Journal of Privacy and Confidentiality, Vol. 9 No. 2 (2019).

Multi-Observation Elicitation (COLT 2017)
Sebastian Casalaina-Martin, Rafael Frongillo, Tom Morgan, Bo Waggoner
In elicitation, we ask an expert to make a prediction about a random variable, then observe its outcome and score their prediction. This paper considers (from a mostly machine-learning perspective) an extension to where we can observe multiple i.i.d. observations. This can give drastic improvements for eliciting some statistics that have large "elicitation complexity".
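
A standard example of the idea, sketched under assumed details: with two i.i.d. observations y1 and y2, the quantity (y1 - y2)^2 / 2 is an unbiased estimate of the variance, so a squared loss against it is minimized in expectation by reporting the variance itself. The distribution and function names below are hypothetical.

```python
from itertools import product

def two_observation_loss(r, y1, y2):
    """Squared loss between the report r and the pairwise estimate (y1 - y2)^2 / 2."""
    return (r - 0.5 * (y1 - y2) ** 2) ** 2

def expected_loss(r, dist):
    """Exact expected loss over two i.i.d. draws from dist = [(value, prob), ...]."""
    return sum(p1 * p2 * two_observation_loss(r, y1, y2)
               for (y1, p1), (y2, p2) in product(dist, repeat=2))

dist = [(0.0, 0.5), (1.0, 0.3), (4.0, 0.2)]  # hypothetical distribution
mean = sum(p * y for y, p in dist)
variance = sum(p * (y - mean) ** 2 for y, p in dist)

# Search a grid of reports; the expected loss is smallest at the variance.
grid = [i / 1000.0 for i in range(0, 5001)]
best = min(grid, key=lambda r: expected_loss(r, dist))
print(variance, best)
```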

The Complexity of Stable Matchings Under Substitutable Preferences (AAAI 2017)
Yuan Deng, Debmalya Panigrahi, and Bo Waggoner
When computing a many-to-one stable matching, e.g. doctors to hospitals in the NRMP, hospitals must solve this problem: Given a set of doctors, what is my most-preferred subset? We show that this is a computationally hard problem even with substitutable preferences, but there is a simple additional assumption -- that hospitals can "verify" whether a set of doctors is preferred to all of its subsets -- that makes it efficiently solvable, hence making stable matchings efficiently findable.


Acquiring and Aggregating Information from Strategic Sources (PhD Thesis)
Bo Waggoner
Some investigations into how to accomplish the title objective. Some key themes are how we can improve by allowing aggregation (what we know so far) to influence acquisition (what we want to learn); different approaches needed when treating information as e.g. data points versus opinions or beliefs; and challenges posed by strategic information sources. The intro is not too technical and summarizes some ideas; the main content is redundant with some of the papers below.

Informational Substitutes (FOCS 2016)
Yiling Chen and Bo Waggoner
Introduces a definition for when two pieces of information, "signals", are substitutes or complements. The definitions are natural from several perspectives; the main result is that they turn out to capture (respectively) "good" and "bad" equilibria of prediction markets. Also has some algorithmic applications to information acquisition under constraints.

Descending Price Optimally Coordinates Search (EC 2016 and more recent working paper)
Robert Kleinberg, Bo Waggoner, and E. Glen Weyl
Proposes a descending-price auction for settings in which bidders have some cost they must incur to inspect the item and learn their valuation. The challenge for welfare in this setting is to encourage "enough" of the "right" bidders to spend the inspection cost, without wasting many useless inspections. Combines "smoothness" with optimal search theory to prove welfare guarantees.

A Market Framework for Eliciting Private Data (NeurIPS 2015)
Bo Waggoner, Rafael Frongillo, and Jacob Abernethy
Designs a prediction-market-like mechanism for "purchasing" information or data from the crowd. Gives a contest-like structure where participants' contributions collaboratively update a single collective hypothesis. Preserves differential privacy of contributions.
Spin-off note on Differentially Private, Classic Prediction Markets.

Low-Cost Learning via Active Data Procurement (EC 2015)
Jacob Abernethy, Yiling Chen, Chien-Ju Ho, and Bo Waggoner
Designs mechanisms/learning algorithms to purchase data from agents who arrive one by one. The key idea is to actively change the prices we offer based on the current state of the learning algorithm. Proves learning guarantees as a function of the budget constraint.
Supplementary: simulation code.

Fair Information Sharing for Treasure Hunting (AAAI 2015)
Yiling Chen, Kobbi Nissim, and Bo Waggoner
Designs a mechanism for the following problem: A group of agents (e.g. pirates) each hold private information about the solution to a search problem (e.g. location of buried treasure); we want them to truthfully report to the mechanism, which then assigns search tasks fairly.

ℓp Testing and Learning of Discrete Distributions (ITCS 2015)
Bo Waggoner
Examines uniformity testing and learning of discrete distributions, given access to independent samples, under ℓp metrics. One result is that a 6-sided die is easier to test for fairness than a 2-sided coin, and a 52-card shuffler easier than the die, if the measure of fairness is ℓp with p > 4/3. In general, gives upper and lower bounds on the number of samples needed (tight everywhere for learning and in many cases for uniformity testing).

Online Stochastic Matching with Unequal Probabilities (SODA 2015)
Aranyak Mehta, Bo Waggoner, and Morteza Zadimoghaddam.
Designs an algorithm for online bipartite matching when edges are labeled with a probability of success and the goal is to maximize the expected number of matches. Prior work considered the case where all edge probabilities are equal; we consider the general case with unequal (but vanishing) edge probabilities. The approximation ratio is 0.534.

Output Agreement Mechanisms and Common Knowledge (HCOMP 2014)
Bo Waggoner and Yiling Chen.
Considers equilibria of "output agreement" games, where two people get the same input and are asked to answer a question about it, rewarded based on how closely their answers agree. Rather than fully truthful, such games elicit "common knowledge" (though some subtleties arise).
Preceded by the following workshop version, containing essentially the same results with different presentation: Information Elicitation Sans Verification. Bo Waggoner and Yiling Chen. The 3rd Workshop on Social Computing and User-Generated Content (SCUGC-13), at EC-13.

Designing Markets for Daily Deals (WINE 2013)
Yang Cai, Mohammad Mahdian, Aranyak Mehta, and Bo Waggoner.
Designs auctions for maximizing welfare when the bidders, like marketers on a Daily Deals site (e.g. Groupon), have private information about how valuable their deal is to consumers in the form of beliefs or predictions. We elicit this information truthfully, select an outcome, and assign payments to maximize a combination of bidder, auctioneer, and consumer welfare.

Evaluating Resistance to False-Name Manipulations in Elections (AAAI 2012)
Bo Waggoner, Lirong Xia, and Vincent Conitzer.
Considers voting when voters might cast multiple votes ("false-name manipulation"). We consider how best to design a deterrent against such manipulation in order to maximize the probability that the outcome of the election matches the true population preference, and how to statistically test whether this occurred.
Supplementary: simulation code.


2020-10-29. Information Elicitation and Design of Surrogate Losses. Peking U. 60min.
slides (pdf) (used digital whiteboard, so a few details aren't present).

2019-10-30. Prophet Inequalities with Linear Correlations.
U. Colorado-Boulder. 60min.
slides (pdf)

2019-07-16. Market Approaches to Aggregating Predictions and Data.
Makerere University, Kampala, Uganda. 60min.
slides (pdf)

2019-05-10. On Valuing and Procuring Personal Data.
Simons, Berkeley, CA. 30min.
slides (pdf)

2019-04-04. Multi-Observation Losses.
Columbia University. 60min.
slides (pdf)

2018-12-05. A Smoothed Analysis of the Greedy Algorithm for Linear Contextual Bandits.
Neural Information Processing Systems, Montreal, Canada. 5min.
slides (pdf)

2018-12-04. Bounded-Loss Private Prediction Markets.
Neural Information Processing Systems, Montreal, Canada. 5min.
slides (pdf)

2018-11-07. Descending Price Optimally Coordinates Search.
INFORMS, Phoenix, AZ. 20min.
slides (pdf)

2018-08-22. Market-Based Mechanisms for Acquiring and Aggregating Data.
Workshop on Learning + Strategic Behavior, Chicago, IL. 15min.
slides (pdf)

2018-06-22. Differentially Private, Bounded-Loss Prediction Markets.
WADE, EC, Ithaca, NY. 25min.
slides (pdf)

2018-06-19. Strategic Classification from Revealed Preferences.
EC, Ithaca, NY. 20min.
slides (pdf)

2018-05-01. Local Differential Privacy for Evolving Data.
Banff, Canada. 30min.
slides (pdf)

2018-01-11. An Axiomatic Study of Scoring Rule Markets.
ITCS, Cambridge, MA. 20min.
slides (pdf)

2017-12-13. Buying and Learning from User Data, Privately.
Georgetown Computer Science, Washington, D.C. 60min.
slides (pdf)

2017-12-09. Scoring Rule Markets as Collaborative ML Contests.
CiML Workshop at NIPS, Long Beach, CA. 25min.
slides (pdf)

2017-07-09. Multi-Observation Elicitation.
COLT, Amsterdam. 10min.
slides (pdf)

2017-06-16. Bitcoin, blockchain, etc.
UPenn theory group, informal discussion. (Warning: I am not a Bitcoin expert, just wanted to describe some basics.)
slides (pdf).

2017-01-04. On Information Elicitation and Mechanism Design.
Young Researchers - EC. Tel Aviv, Israel. 30min.
slides (pdf).

2016-11-16. Prediction Market Equilibria via Substitutes and Complements.
INFORMS, Nashville, TN. 20min.
slides (pdf).

2016-11-09. Informational Substitutes.
TCS+ video series, The Internet. 60min.
Video (61min).
slides (pdf), slides (Google presentation).

2016-10-19. Some Aspects of Acquiring and Aggregating Information.
Career Networking Conference (CNC), Boston, MA. 15min.
slides (pdf).

2016-10-09. Informational Substitutes.
FOCS, New Brunswick, New Jersey. 20min.
Video (22min).
slides (pdf).

2016-09-27. What Dice Are These?
University of Colorado - Boulder. 60min.
Chalk talk.

2016-07-26. Descending Price Optimally Coordinates Search.
EC, Maastricht, Netherlands. 20min.
slides (Google presentation), slides (pdf).

2016-07-24. Informational Substitutes: Definitions and Design.
GAMES, Maastricht, Netherlands. 20min.
slides (pdf).

2016-05-27. (Thesis Defense) Acquiring and Aggregating Information from Strategic Sources.
Harvard. 60min.
slides (Google presentation), slides (pdf).

2016-03-16. Some Approaches for Information Acquisition and Aggregation.
UPenn, Philadelphia PA. 60min.
slides (Google presentation), slides (pdf).

2015-10-14. Low-Cost Learning via Active Data Procurement.
Microsoft Research NYC.
slides (Google presentation), slides (pdf).
Almost identical to the version below.

2015-09-09. Low-Cost Learning via Active Data Procurement.
Duke CSEcon Group.
slides (Google presentation), slides (pdf).

2015-06-19. Low-Cost Learning via Active Data Procurement.
EC 2015, Portland, Oregon. 18min.
slides (Google presentation), slides (pdf).

2015-02-06. Fair Information Sharing for Treasure Hunting.
Harvard EconCS group. 60min.
slides and notes (Google presentation), slides (pdf).
12min version given at AAAI 2015, 29 January 2015, Austin, Texas.

2015-01-13. ℓp Testing and Learning of Discrete Distributions.
ITCS 2015, Rehovot, Israel. 20min.
slides (pdf), slides with notes (pdf).

2015-01-06. Online Stochastic Matching with Unequal Probabilities.
SODA 2015, San Diego, CA. 20min.
slides (Google presentation), slides (pdf).

2014-01-10. Toward Buying Labels from the Crowd.
Indo-US Lectures Week in Machine Learning, Game Theory, and Optimization, Bangalore. 20min.
slides (pdf), slides and notes (pdf).

2013-12-13. Designing Markets for Daily Deals.
WINE 2013, Cambridge, MA. 20min.
slides (Google presentation, with notes), slides (pdf), slides and notes (pdf).

2013-06-16. Information Elicitation Sans Verification.
Workshop on Social Computing and User-Generated Content (SCUGC), at EC 2013, Philadelphia, PA. 20min.
slides (pdf).
Different version given at HCOMP 2014, 4 November 2014, Pittsburgh, PA.

2012-03-26. Evaluating Resistance to False-Name Manipulations in Elections.
Harvard EconCS Group. 45min.
slides (pptx), slides (pdf), slides and notes (pdf).