**Contents:**Papers, Talks

## Papers

Bibtex citations: Complete bibtex

**An Embedding Framework for the Design and Analysis of Consistent Polyhedral Surrogates**(draft, 2022)

Jessie Finocchiaro, Rafael M. Frongillo, Bo Waggoner (bibtex)

If we have a discrete loss function in machine learning, we can "embed" the predictions in space. This yields a piecewise linear, convex ("polyhedral") loss function that is a good surrogate for the original. This paper analyzes the structure of this phenomenon deeply. Supersedes FFW2019 and to an extent FW2021.

**Balls and Bins -- Simple Concentration Bounds**(draft, 2022)

Ernst Schulte-Geers, Bo Waggoner (bibtex)

When throwing m balls into n bins according to a distribution p, we give tail bounds for the max-loaded bin. The probability that there exists a k-loaded bin is controlled by m||p||

_{k}/k.

**Contracts with Information Acquisition, via Scoring Rules**(EC, 2022)

Maneesha Papireddygari, Bo Waggoner (bibtex)

Suppose an employee gets to first gather some information, then take a hidden action on behalf of an employer. How should their contract be structured? The answer turns out to be proper scoring rules, and we investigate their properties.

**High Welfare Matching Markets via Descending Price**(draft, 2022)

Robin Bowers, Bo Waggoner (bibtex)

The "Marshallian Match", proposed in Waggoner-Weyl 2018, is an auction-like mechanism for matching e.g. employers to workers. We analyze the social welfare of the match in several settings and explore connections to stability.

**Agreement Implies Accuracy for Substitutable Signals**(draft, 2022)

Rafael Frongillo, Eric Neyman, Bo Waggoner (bibtex)

Aumann (1976) famously proved that Bayesians cannot "agree to disagree"; Aaronson (2004) showed that they agree after a small number of rounds of discussion. We show that if Bayesians' knowledge satisfies informational substitutes, and they agree, then they are actually accurate with respect to the entire collection of their information.

**Linear Functions to the Extended Reals**(draft, 2021)

Bo Waggoner (bibtex)

Normally, "linear" functions map to the real numbers. But in convex analysis, we often extend the reals to allow functions taking the value +infinity or -infinity. What happens to "linear" functions that do so? We find out. With applications to the construction of proper scoring rules.

**Surrogate Regret Bounds for Polyhedral Losses**(NeurIPS 2021)

Rafael Frongillo, Bo Waggoner (bibtex)

In machine learning, we sometimes train a model to minimize a nice continuous loss, the "surrogate", in order to obtain good performance on a less-tractable "target" loss. We show that if the surrogate is polyhedral (piecewise linear and convex), then good performance on the surrogate translates optimally to good performance on the target.

**Unifying Lower Bounds on Prediction Dimension of Consistent Convex Surrogates**(NeurIPS 2021)

Jessie Finocchiaro, Rafael Frongillo, Bo Waggoner (bibtex)

If we are trying to minimize an intractable loss function in machine learning, statistics, finance, etc, we can utilize a surrogate loss instead. This work proposes a simple and general criterion that gives lower bounds on the dimensions of these surrogates.

**Towards a Theory of Confidence in Market-Based Predictions**(ISIPTA 2021)

Rupert Freeman, David M. Pennock, Daniel M. Reeves, David Rothschild, Bo Waggoner (bibtex)

If we get an aggregate group prediction, such as a price in a prediction market, can we infer anything about the aggregate "confidence" in that prediction? Discusses a few formalizations and requests further research.

**Efficient Competitions and Online Learning with Strategic Forecasters**(EC 2021)

Rafael Frongillo, Robert Gomez, Anish Thilagar, Bo Waggoner (bibtex)

How should we design winner-take-all prediction competitions like Kaggle? Using learning algorithms like Multiplicative Weights fixes common incentivize problems (approximately) and are optimally efficient. They also work for online learning with strategic forecasters.

**Non-parametric Binary Regression in Metric Spaces with KL Loss**(draft, 2021)

Ariel Avital, Klim Efremenko, Aryeh Kontorovich, David Toplin, Bo Waggoner (bibtex)

Suppose your machine-learning algorithm must predict, given x, the probability that an associated label y is 1. You're optimizing the log scoring rule, or KL-divergence from the true distributions. We give an optimization algorithm for doing so and some notes on sample complexity.

**Embedding Dimension of Polyhedral Losses**(COLT 2020)

Jessie Finocchiaro, Rafael Frongillo, Bo Waggoner (bibtex)

Our prior work proposed the embedding framework for surrogate losses that are piecewise linear and convex. This work considers the dimensionality of such surrogates, or in other words: when our algorithm produces predictions in the nice tractable space R^d, how small can the dimension d be?

**Prophet Inequalities with Linear Correlations and Augmentations**(EC, 2020)

Nicole Immorlica, Sahil Singla, Bo Waggoner (bibtex)

In this optimal-stopping problem, a sequence of values arrive and our algorithm must decide when to stop the process and take the current value. In this paper the values are correlated: positive linear combinations of underlying independent variables.

**Preventing Arbitrage from Collusion When Eliciting Probabilities**(AAAI 2020)

Rupert Freeman, David M. Pennock, Dominik Peters, Bo Waggoner (bibtex)

Suppose a group all make predictions and are scored based on accuracy, e.g. with a scoring rule. Studies when they can collude to make untruthful predictions yet risklessly improve their payments.

**Channel Auctions**(Management Science, 2020)

Eduardo M. Azevedo, David M. Pennock, Bo Waggoner, E. Glen Weyl (bibtex)

Proposes a single-item auction incorporating both an ascending and a descending-price simultaneously. When inspection of the item is costly, but is optional to acquire the item, there are cases where this format is necessary to achieve optimal welfare.

### Postdocs

**Computing Equilibria of Prediction Markets via Persuasion.**(WINE 2019)

Jerry Anunrojwong, Yiling Chen, Bo Waggoner, and Haifeng Xu. (bibtex)

Building on [Kong+Schoenebeck 2018], considers algorithms for playing in a simple two-player prediction market. Shows a connection to Bayesian persuasion and efficient algorithms for new cases.

**Toward a Characterization of Loss Functions for Distribution Learning**(NeurIPS 2019)

Nika Haghtalab, Cameron Musco, Bo Waggoner (bibtex)

Considers loss functions (or "scoring rules") that evalute a prediction in the form of a probability distribution. We define natural conditions for these losses of "strongly proper", "sample proper", and "concentrating". We show that these can be achieved if one also requires the prediction to be calibrated.

**An Embedding Framework for Consistent Polyhedral Surrogates**(NeurIPS 2019)

Jessie Finocchiaro, Rafael Frongillo, Bo Waggoner (bibtex)

In machine learning, we prefer loss functions defined over R^d, a nice convex continuous space. Given a problem with discrete alternatives, like classification or ranking, a common approach is to "embed" each alternative into R^d and define a "polyhedral" (i.e. piecewise linear, convex) surrogate loss function over this space. This work investigates this general approach.

**Equal Opportunity in Online Classification with Partial Feedback**(NeurIPS 2019)

Yahav Bechavod, Katrina Ligett, Aaron Roth, Bo Waggoner, Z. Steven Wu (bibtex)

An algorithm must classify a sequence of arrivals as plus or minus, but it will only observe the true label if it chooses plus. Under this partial feedback model, we construct bandit learning algorithms that get low regret while enforcing "fairness" constraints such as equalizing false positive rates across two groups.

**Decentralized & Collaborative AI on Blockchain**(IEEE Blockchain 2019)

Justin D. Harris, Bo Waggoner (bibtex)

Proposes collaboratively providing data for training machine learning models, which are open and accessible to all for free, via blockchains like Ethereum. Investigates some incentive structures for getting good data. See Justin's code at https://github.com/microsoft/0xDeCA10B

**Matching Markets via Descending Price**(draft, 2019)

Bo Waggoner, E. Glen Weyl (bibtex)

Proposes an auction-style mechanism for two-sided marketplaces, such as short-term labor or crowdsourcing. Participants place bids on their different possible matches, then adjust the bids as a global price descends, causing high-value pairs to match first.

**Multi-Observation Regression**(AISTATS 2019)

Rafael Frongillo, Nishant Mehta, Tom Morgan, Bo Waggoner (bibtex)

Regression involves learning some model that predicts a statistic from a set of features. This paper investigates algorithms that use multi-observation loss functions (see paper "Multi-Observation Elicitation") for regression on higher-order statistics such as the variance.

**Local Differential Privacy for Evolving Data**(NeurIPS 2018; Journal of Privacy and Confidentiality 2020)

Matthew Joseph, Aaron Roth, Jonathan Ullman, Bo Waggoner (bibtex)

We suppose people are reporting statistics or observations while adding randomness in order to preserve their privacy (this is the "local" model). How can we keep up to date if the underlying distribution of data is changing occasionally over time?

***Full version in Journal of Privacy and Confidentiality, Vol. 10 No. 1 (2020).**

**A Smoothed Analysis of the Greedy Algorithm for the Linear Contextual Bandit Problem**(NeurIPS 2018)

Sampath Kannan, Jamie Morgenstern, Aaron Roth, Bo Waggoner, Steven Wu (bibtex)

In this problem, each day an algorithm sees data about a set of choices and picks one, obtaining a random reward. It hopes to do well by eventually learning the (linear) relationships between data and average rewards. A short-sighted greedy approach is known to sometimes do very poorly; but we show that it actually performs very well when small amounts of randomness are present in the data.

**Bounded-Loss Private Prediction Markets**(NeurIPS 2018)

Rafael Frongillo, Bo Waggoner (bibtex)

We construct versions of prediction markets that preserve privacy of participants while also satisfying a fixed budget limit (getting around previous impossibility results).

**Strategic Classification from Revealed Preferences**(EC 2018)

Jinshuo Dong, Aaron Roth, Zachary Schutzman, Bo Waggoner, Steven Wu (bibtex)

Suppose you sequentially deploy a spam filter, see how the spammers react, improve the spam filter, and continue. How should you make changes, knowing that spammers will react to your "improvements" but not knowing their exact objectives? This paper formalizes a model for this process and investigates settings where efficient algorithms can be used.

**Active Information Acquisition for Linear Optimization**(UAI 2018)

Shuran Zheng, Bo Waggoner, Yang Liu, Yiling Chen (bibtex)

Imagine solving an optimization problem -- like finding a shortest path through a network -- without knowing some of the parameters (like edge lengths). You can acquire information about the parameters by drawing "samples" or estimates of the parameters. We consider algorithms for parsimoniously drawing samples and solving the optimization problem.

**An Axiomatic Study of Scoring Rule Markets**(ITCS 2018)

Rafael Frongillo, Bo Waggoner (bibtex)

Prediction markets are well-studied for predicting the probability of an event, or the expected value of a future variable; but little is known about predicting other statistics such as the median or mode. We investigate prediction markets for general statistics and what useful properties these markets can have.

**Accuracy First: Selecting a Differential Privacy Level for Accuracy-Constrained ERM**(NeurIPS 2017; Journal of Privacy and Confidentiality 2019)

Katrina Ligett, Seth Neel, Aaron Roth, Bo Waggoner, Z. Steven Wu (bibtex)

(CODE) Traditionally in differentially private statistics, one fixes a required privacy level and attempts to achieve good accuracy. We consider a setting where a given accuracy level is required, and we wish to guarantee as much privacy as possible given that requirement.

***Full version in Journal of Privacy and Confidentiality, Vol. 9 No. 2 (2019).**

**Multi-Observation Elicitation**(COLT 2017)

Sebastian Casalaina-Martin, Rafael Frongillo, Tom Morgan, Bo Waggoner (bibtex)

In elicitation, we ask an expert to make a prediction about a random variable, then observe its outcome and score their prediction. This paper considers (from a mostly machine-learning perspective) an extension to where we can observe multiple i.i.d. observations. This can give drastic improvements for eliciting some statistics that have large "elicitation complexity".

**The Complexity of Stable Matchings Under Substitutable Preferences**(AAAI 2017)

Yuan Deng, Debmalya Panigrahi, and Bo Waggoner (bibtex)

When computing a many-to-one stable matching, e.g. doctors to hospitals in the NRMP, hospitals must solve this problem: Given a set of doctors, what is my most-preferred subset? We show that this is a computationally hard problem even with substitutable preferences, but there is a simple additional assumption -- that hospitals can "verify" whether a set of doctors is preferred to all of its subsets -- that makes it efficiently-solvable, hence making stable matchings efficiently findable.

### PhD

**Acquiring and Aggregating Information from Strategic Sources**(PhD Thesis)

Bo Waggoner

Some investigations into how to accomplish the title objective. Some key themes are how we can improve by allowing aggregation (what we know so far) to influence acquistion (what we want to learn); different approaches needed when treating information as e.g. data points versus opinions or beliefs; and challenges posed by strategic information sources. The intro is not too technical and summarizes some ideas; the main content is redundant with some of the below papers.

**Informational Substitutes**(FOCS 2016)

Yiling Chen and Bo Waggoner (bibtex)

Introduces a definition for when two pieces of information, "signals", are substitutes or complements. The definitions are natural from several perspectives; the main result is that they turn out to capture (respectively) "good" and "bad" equilibria of prediction markets. Also has some algorithmic applications to information acquisition under constraints.

**Descending Price Optimally Coordinates Search**(EC 2016 and more recent working paper)

Robert Kleinberg, Bo Waggoner, and E. Glen Weyl (bibtex)

Proposes a descending-price auction for settings in which bidders have some cost they must incur to inspect the item and learn their valuation. The challenge for welfare in this setting is to encourage "enough" of the "right" bidders to spend the inspection cost, without wasting many useless inspections. Combines "smoothness" with optimal search theory to prove welfare guarantees.

**A Market Framework for Eliciting Private Data**(NeurIPS 2015)

Bo Waggoner, Rafael Frongillo, and Jacob Abernethy (bibtex)

Designs a prediction-market like mechanism for "purchasing" information or data from the crowd. Gives a contest-like structure where participants' contributions collaboratively update a single collective hypothesis. Preserves differential privacy of contributions.

Spin-off note on Differentially Private, Classic Prediction Markets.

**Low-Cost Learning via Active Data Procurement**(EC 2015)

Jacob Abernethy, Yiling Chen, Chien-Ju Ho, and Bo Waggoner (bibtex)

Designs mechanisms/learning algorithms to purchase data from agents who arrive one by one. The key idea is to actively change the prices we offer based on the current state of the learning algorithm. Proves learning guarantees as a function of the budget constraint.

Supplementary: simulation code.

**Fair Information Sharing for Treasure Hunting**(AAAI 2015)

Yiling Chen, Kobbi Nissim, and Bo Waggoner (bibtex)

Designs a mechanism for the following problem: A group of agents (e.g. pirates) each hold private information about the solution to a search problem (e.g. location of buried treasure); we want them to truthfully report to the mechanism, which then assigns search tasks fairly.

**lp Testing and Learning of Discrete Distributions**(ITCS 2015)

Bo Waggoner (bibtex)

Examines uniformity testing and learning of discrete distributions, given access to independent samples, under ℓp metrics. One result is that a 6-sided die is easier to test for fairness than a 2-sided coin, and a 52-card shuffler easier than the die, if the measure of fairness is ℓp with p > 4/3. In general, gives upper and lower bounds on number of samples needed (tight everywhere for learning and in many cases for uniformity testing).

**Online Stochastic Matching with Unequal Probabilities**(SODA 2015)

Aranyak Mehta, Bo Waggoner, and Morteza Zadimoghaddam (bibtex)

Designs an algorithm for online bipartite matching when edges are labeled with a probability of success and the goal is to maximize the expected number of matches. Prior work considered the case where all edge probabilities are equal; we consider the general case with unequal (but vanishing) edge probabilities. The approximation ratio is 0.534.

**Output Agreement Mechanisms and Common Knowledge**(HCOMP 2014)

Bo Waggoner and Yiling Chen (bibtex)

Considers equilibria of "output agreement" games, where two people get the same input and are asked to answer a question about it, rewarded based on how closely their answers agree. Rather than fully truthful, such games elicit "common knowledge" (though some subtleties arise).

Preceded by the following workshop version, containing essentially the same results with different presentation: Information Elicitation Sans Verification. Bo Waggoner and Yiling Chen.

*The 3rd Workshop on Social Computing and User-Generated Content (SCUGC-13), at EC-13.*

**Designing Markets for Daily Deals**(WINE 2013)

Yang Cai, Mohammad Mahdian, Aranyak Mehta, and Bo Waggoner (bibtex)

Designs auctions for maximizing welfare when the bidders, like marketers on a Daily Deals site (e.g. Groupon), have private information about how valuable their deal is to consumers in the form of beliefs or predictions. We elicit this information truthfully, select an outcome, and assign payments to maximize a combination of bidder, auctioneer, and consumer welfare.

**Evaluating Resistance to False-Name Manipulations in Elections**(AAAI 2012)

Bo Waggoner, Lirong Xia, and Vincent Conitzer (bibtex)

Considers voting when voters might cast multiple votes ("false-name manipulation"). We consider how best to design a deterrent against such manipulation in order to maximize the probability that the outcome of the election matches the true population preference, and how to statistically test whether this occurred.

Supplementary: simulation code.

## Talks

2022-06-21.**Contracts with Information Acquisition.**

TheoryFest at STOC, Rome, Italy. 30min.

slides (pdf)

2022-06-17.

**(Efficient Competitions and) Online Learning with Strategic Forecasters.**

WALE, Naxos, Greece. 20min.

slides (pdf)

2022-04-06.

**Foundations of Forecasting.**

Neumann University (virtual).

slides (pdf)

2020-10-29.

**Information Elicitation and Design of Surrogate Losses.**

Peking U. 60min.

slides (pdf) (used digital whiteboard, so a few details aren't present).

2019-10-30.

**Prophet Inequalities with Linear Correlations.**

U. Colorado-Boulder. 60min.

slides (pdf)

2019-07-16.

**Market Approaches to Aggregating Predictions and Data.**

Makerere University, Kampala, Uganda. 60min.

slides (pdf)

2019-05-10.

**On Valuing and Procuring Personal Data.**

Simons, Berkeley, CA. 30min.

slides (pdf)

2019-04-04.

**Multi-Observation Losses.**

Columbia University. 60min.

slides (pdf)

2018-12-05.

**A Smoothed Analysis of the Greedy Algorithm for Linear Contextual Bandits.**

Neural Information Processing Systems, Montreal, Canada. 5min.

slides (pdf)

2018-12-04.

**Bounded-Loss Private Prediction Markets.**

Neural Information Processing Systems, Montreal, Canada. 5min.

slides (pdf)

2018-11-07.

**Descending Price Optimally Coordinates Search.**

INFORMS, Phoenix, AZ. 20min.

slides (pdf)

2018-08-22.

**Market-Based Mechanisms for Acquiring and Aggregating Data.**

Workshop on Learning + Strategic Behavior, Chicago, IL. 15min.

slides (pdf)

2018-06-22.

**Differentially Private, Bounded-Loss Prediction Markets.**

WADE, EC, Ithaca, NY. 25min.

slides (pdf)

2018-06-19.

**Strategic Classification from Revealed Preferences.**

EC, Ithaca, NY. 20min.

slides (pdf)

2018-05-01.

**Local Differential Privacy for Evolving Data.**

Banff, Canada. 30min.

slides (pdf)

2018-01-11.

**An Axiomatic Study of Scoring Rule Markets.**

ITCS, Cambridge, MA. 20min.

slides (pdf)

2017-12-13.

**Buying and Learning from User Data, Privately.**

Georgetown Computer Science, Washington, D.C. 60min.

slides (pdf)

2017-12-09.

**Scoring Rule Markets as Collaborative ML Contests.**

CiML Workshop at NIPS, Long Beach, CA. 25min.

slides (pdf)

2017-07-09.

**Multi-Observation Elicitation.**

COLT, Amsterdam. 10min.

slides (pdf)

2017-06-16.

**Bitcoin, blockchain, etc.**

UPenn theory group, informal discussion. (Warning: I am not a Bitcoin expert, just wanted to describe some basics.)

slides (pdf).

2017-01-04.

**On Information Elicitation and Mechanism Design.**

Young Researchers - EC. Tel Aviv, Israel. 30min.

slides (pdf).

2016-11-16.

**Prediction Market Equilibria via Substitutes and Complements.**

INFORMS, Nashville, TN. 20min.

slides (pdf).

2016-11-09.

**Informational Substitutes.**

TCS+ video series, The Internet. 60min.

Video (61min).

slides (pdf), slides (Google presentation).

2016-10-19.

**Some Aspects of Acquiring and Aggregating Information.**

Career Networking Conference (CNC), Boston, MA. 15min.

slides (pdf).

2016-10-09.

**Informational Substitutes.**

FOCS, New Brunswick, New Jersey. 20min.

Video (22min).

slides (pdf).

2016-09-27.

**What Dice Are These?**

University of Colorado - Boulder. 60min.

Chalk talk.

2016-07-26.

**Descending Price Optimally Coordinates Search.**

EC, Maastricht, Netherlands. 20min.

slides (Google presentation), slides (pdf).

2016-07-24.

**Informational Substitutes: Definitions and Design.**

GAMES, Maastricht, Netherlands. 20min.

slides (pdf).

2016-05-27.

**(Thesis Defense) Acquiring and Aggregating Information from Strategic Sources.**

Harvard. 60min.

slides (Google presentation), slides (pdf).

2016-03-16.

**Some Approaches for Information Acquisition and Aggregation.**

UPenn, Philadelphia PA. 60min.

slides (Google presentation), slides (pdf).

2015-10-14.

**Low-Cost Learning via Active Data Procurement.**

Microsoft Research NYC.

slides (Google presentation), slides (pdf).

Almost identical to the version below.

2015-09-09.

**Low-Cost Learning via Active Data Procurement.**

Duke CSEcon Group.

slides (Google presentation), slides (pdf).

2015-06-19.

**Low-Cost Learning via Active Data Procurement.**

EC 2015, Portland, Oregon. 18min.

slides (Google presentation), slides (pdf).

2015-02-06.

**Fair Information Sharing for Treasure Hunting.**

Harvard EconCS group. 60min.

slides and notes (Google presentation), slides (pdf).

12min version given at AAAI 2015, 29 January 2015, Austin, Texas.

2015-01-13.

**lp Testing and Learning of Discrete Distributions.**

ITCS 2015, Rehovot, Israel. 20min.

slides (pdf), slides with notes (pdf).

2015-01-06.

**Online Stochastic Matching with Unequal Probabilities.**

SODA 2015, San Diego, CA. 20min.

slides (Google presentation), slides (pdf).

2014-01-10.

**Toward Buying Labels from the Crowd.**

Indo-US Lectures Week in Machine Learning, Game Theory, and Optimization, Bangalore. 20min.

slides (pdf), slides and notes (pdf).

2013-12-13.

**Designing Markets for Daily Deals.**

WINE 2013, Cambridge, MA. 20min.

slides (Google presentation, with notes), slides (pdf), slides and notes (pdf).

2013-06-16.

**Information Elicitation Sans Verification.**

Workshop on Social Computing and User-Generated Content (SCUGC), at EC 2013, Philadelphia, PA. 20min.

slides (pdf).

Different version given at HCOMP 2014, 4 November 2014, Pittsburgh, PA.

2012-03-26.

**Evaluating Resistance to False-Name Manipulations in Elections.**

Harvard EconCS Group. 45min.

slides (pptx), slides (pdf), slides and notes (pdf).