Instructor: Bo Waggoner
Course webpage: https://www.bowaggoner.com/courses/2025/csci6314/
The project in this course is an investigation or deep dive into a topic beyond the level covered in class. It will emphasize reading and synthesis. It will have a smaller original component as well.
Projects will be done in groups of two people except with permission of the instructor. It is ok for multiple groups to pick the same topic.
Each of the following could go in several different directions!
Complexity of playing games. Going beyond PPAD-completeness, what is known about the computational complexity of playing games? In what special cases can we solve games in polynomial time? What is known about the hardness of solving or even approximately solving large games?
Some relevant papers: Aviad Rubinstein. Settling the Complexity of Computing Approximate Two-Player Nash Equilibria. Foundations of Computer Science (FOCS), 2016.
Dynamics of online learning. When e.g. two online learning algorithms repeatedly play against each other, how do their strategies actually evolve? How does this relate to the time-average of their strategies? Does this lead to new suggestions for learning algorithms or computing equilibria?
Some relevant papers: Bailey, Gidel, Piliouras. Finite Regret and Cycles with Fixed Step-Size via Alternating Gradient Descent-Ascent; e.g. Last-iterate convergence rates for min-max optimization by Abernethy et al.
Deep dive into no-regret algorithms. There is a deep and extensive literature on no-regret learning in various settings. For example, one can study the popular Follow-The-Regularized Leader algorithm, Mirror Descent, Follow-The-Perturbed-Leader, and/or other algorithms.
Some resources: Shalev-Shwartz, Online Learning and Online Convex Optimization; there are many other resources available online as well.
Solving large zero-sum games. What are the best algorithms to sotlve huge two-player zero-sum games like chess, go, poker, and more? Several modern algorithms make use of deep learning, for example. Why do they work well and where is there room for improvement?
Some relevant papers: Brown, Lerer, Gross, Sandholm. Deep Counterfactual Regret Minimization. International Conference on Machine Learning (ICML), 2019; Mastering the game of Go without human knowledge by DeepMind group; Simplified AlphaZero Tutorial; and more.
Bandit learning algorithms. In online learning, we focused on cases with "full feedback", meaning that the learner always observes "what would have happened" if a different action had been chosen. The case of "partial feedback" is the domain of what are called multi-armed bandits in machine learning. How do algorithms for this setting work? Include variants like EXP3, Upper Confidence Bound (UCB), $\epsilon$-Greedy, and possibly more.
Some resources: Slivkins, Introduction to Multi-Armed Bandits; there are many other surveys and lecture notes available online as well.
Connections of game-solving to reinforcement learning. This is closely related to the previous two topics. How do reinforcement learning and online learning techniques combine to solve zero-sum games, and possibly also single-player games like Atari games? This can also be related to investigating "multi-armed bandit" algorithms.
Some relevant papers: Combining no-regret and Q-learning by Kash et al.; Playing Atari with deep reinforcement learning; and more.
Calibration. We saw that online learning algorithms can produce calibrated forecasts. But there is a lot more to the story. There are many measures of calibration. How are they related? How are calibrated predictions related to good decisionmaking more broadly? There are many very recent papers in this area, so this project could be really cutting edge.
Some resources: Hartline course on calibration.
Blackwell approachability. We saw the basics, but there are many (a) applications of approachability, (b) connections to other technical topics, including swap regret and approachability. Both of these could be great topics.
Some resources: Perchet survey, for starters.
Variations and refinements of Nash equilibrium. There are many types of games, and often the "right" notion of equilibrium is not clear, especially in extensive-form settings. Describe the motivation behind concepts like perfect Bayesian equilibrium, trembling hand, sequential, and other notions of equilibrium in games.
Some resources: Osborne and Rubinstein text; other textbooks on game theory.
Evolutionary game theory. In this branch of game theory, one considers a population of agents that repeatedly play a game against each other. Agents have an inherent preference for a strategy they always play. The strategies that win against other strategies become more prevalent in the population over time. How do these dynamics play out in the long run, and what strategies are "evolutionarily stable", in e.g. the Prisoner's Dilemma?
Some resources: Book chapter from Kleinberg and Easley; Detailed slides from Ozdaglar
This component consists of one or two paragraphs describing the topic chosen for the project and listing at least two next steps that will be taken for the project.
This component consists of a short summary of what the group has done so far and what they have encountered (about half a page is probably sufficient), along with an annotated bibliography.
The bibliography should be a list of references properly cited in LaTeX, where each reference is described in at least one to two sentences. More important references may need one to two paragraphs. The bibliography should contain a majority of the references that will be used in the final paper, so most updates should have at least 10 or so references.
This component is a short-medium length paper (expect to need at least 8 pages; do not exceed 15). You can think of this paper as a tutorial in which you teach the reader about the topic you've studied. The target audience of the paper is one of your classmates; someone who is familiar with all the material in class, but not your specific topic area.
Expository papers should be organized with Abstract and Introduction sections first. The Abstract gives a brief overview of one to two paragraphs, while the introduction describes important background context and motivation for the topic. The rest of the paper should teach the reader about the topic.
References should be used throughout where appropriate to both give credit for specific research and to refer the reader to further reading on a given topic.
Grading will be based on the above requirements along with the following criteria:
In this component, you read another group's expository paper draft and submit feedback for that group. Your feedback should include:
Your feedback will likely total 1-2 pages.
This component can be taken in any creative direction you like. The originality requirement is that it investigate or produce something new and not contained in the literature of the expository paper. Suggested options include: