Symposium on Mathematical Foundations of Trustworthy Learning

In cooperation with the ELLIS Research Program for Theory, Algorithms and Computations of Modern Learning Machines

12th - 16th October, 2025

Monte Verità Congress Center, Switzerland


In today's data-driven world, machine learning plays an increasingly pivotal role, but with this prominence comes the need for rigorous mathematical underpinnings to ensure trustworthiness. There has been a recent flurry of theoretical research examining fundamental principles such as privacy, adversarial robustness, reproducibility, generalization, and other societal aspects of machine learning algorithms.

Our aim is to create a forum where researchers at all career stages can engage in a vibrant exchange of ideas, foster collaborations, identify pressing challenges, and collectively advance the mathematical foundations of trustworthy machine learning. This symposium will bring together a uniquely diverse mix of leading researchers, spanning (i) different fields (theoretical computer science, statistics, optimization, causality), (ii) different parts of the world (Europe, North America, and Asia), and (iii) career stages, with a balanced group of junior and senior researchers. The planned session topics include:

● Theory of deep learning - statistics / generalization viewpoint
● Theory of deep learning - optimization viewpoint
● Privacy-aware machine learning
● Reproducible optimization and learning
● Robustness under distribution shifts and adversarial attacks, causality
● Societal foundations of machine learning

Invited Speakers


Rediet Abebe
ELLIS / University of Tübingen

Shai Ben-David
University of Waterloo

Cristina Butucea
ENSAE‑CREST

Surbhi Goel
University of Pennsylvania

Gautam Kamath
University of Waterloo

Amin Karbasi
Yale University

Hongseok Namkoong
Columbia University

Sewoong Oh
University of Washington

Jonas Peters
ETH Zurich

Jessica Sorrell
Johns Hopkins University

Programme

Afternoon Arrival
Evening Welcome Reception

Differential Privacy and the Broader Landscape of Robustness in Algorithmic Statistics
Speaker: Gautam Kamath

Abstract:

I will discuss differential privacy, a rigorous notion of data privacy, and its relationship to other constraints, including adversarial contamination and heavy-tailed data. I will explore conceptual and algorithmic connections between these seemingly different notions of robustness.
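As background for this talk, the standard definition of differential privacy (due to Dwork, McSherry, Nissim, and Smith) can be stated as follows:

```latex
% A randomized mechanism $M$ is $(\varepsilon, \delta)$-differentially private
% if, for all datasets $D, D'$ differing in a single record and all
% measurable sets of outputs $S$,
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D') \in S] + \delta .
```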


Private statistical estimation via robustness and stability
Speaker: Sewoong Oh

Abstract:

Privacy-enhancing technologies, such as differentially private stochastic gradient descent (DP-SGD), allow us to access private data without worrying about leaking sensitive information. This is crucial in the modern era of data-centric AI, where all public data has been exhausted and the next frontier models rely on access to high-quality data. A central component of these technologies is private statistical estimation. We present a series of results where robust statistics and stable algorithms have played critical roles in advancing the state of the art in differentially private statistical estimation. This talk is based on the papers https://arxiv.org/abs/2404.15409, https://arxiv.org/abs/2301.13273, and https://arxiv.org/abs/2111.06578.
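As a rough illustration of the kind of algorithm the abstract refers to, here is a minimal sketch of a single DP-SGD step (per-example gradient clipping followed by calibrated Gaussian noise). The function name and parameters are hypothetical, and a real implementation would additionally track the cumulative privacy budget:

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, clip_norm=1.0,
                noise_multiplier=1.0, lr=0.1, rng=None):
    """One illustrative DP-SGD step: clip each example's gradient to
    clip_norm, average, add Gaussian noise scaled to the clip norm."""
    rng = rng or np.random.default_rng(0)
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Rescale so that each per-example gradient has norm <= clip_norm.
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    avg = np.mean(clipped, axis=0)
    # Noise standard deviation is proportional to the per-example sensitivity.
    noise = rng.normal(0.0, noise_multiplier * clip_norm / len(per_example_grads),
                       size=avg.shape)
    return params - lr * (avg + noise)
```

Clipping bounds the influence any single example can have on the update, which is what makes the Gaussian noise sufficient for a differential-privacy guarantee.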

Gentle Measurements of Quantum States
Speaker: Cristina Butucea

Abstract:

Gentle measurements of quantum states produce both a random outcome and a non-collapsed post-measurement state that lies within a prescribed trace distance of the initial state. Unlike a collapsed state, this post-measurement state can be reused in subsequent quantum computation. Connections between gentle measurements and quantum differential privacy have been established. We introduce here locally gentle measurements and prove a quantum data-processing inequality for such measurements. We introduce a physically feasible gentle measurement called the quantum Label Switch and show optimal rates for learning and testing of qubits.
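One common formalization of gentleness (following Aaronson's notion; the talk's exact definition may differ) requires the post-measurement state to stay close to the input in trace distance for every outcome:

```latex
% A measurement is $\alpha$-gentle on a set of states if, for every state
% $\rho$ in the set and every outcome $y$ obtained with positive probability,
% the post-measurement state $\rho_y$ satisfies
\tfrac{1}{2} \left\lVert \rho_y - \rho \right\rVert_{1} \;\le\; \alpha .
```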


Gradient optimization methods: the benefits of a large step-size
Speaker: Peter L. Bartlett

Abstract:

Deep learning has revealed some major surprises from the perspective of theory. Optimization in deep learning relies on simple gradient descent algorithms that are traditionally viewed as a time discretization of gradient flow. However, in practice, large step sizes - large enough to cause oscillation of the loss - exhibit performance advantages. This talk will review recent results on gradient descent with logistic loss with a step size large enough that the optimization trajectory is at the "edge of stability". We show the benefits of this initial oscillatory phase for linear functions and for multi-layer networks, and identify an asymptotic implicit bias that gradient descent imposes for a large family of deep networks. Based on joint work with Yuhang Cai, Michael Lindsey, Song Mei, Matus Telgarsky, Jingfeng Wu, Bin Yu and Kangjie Zhou.
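A toy calculation, not from the talk, shows why the step size interacts with stability: for gradient descent on a one-dimensional quadratic, each step scales the iterate by (1 - step × curvature), so the classical stability threshold is step = 2/curvature. The talk concerns the richer "edge of stability" regime, where the loss oscillates and yet training still succeeds:

```python
def gd_quadratic(x0, curvature, step, iters):
    """Gradient descent on f(x) = curvature * x**2 / 2.

    The update x <- x - step * curvature * x multiplies x by
    (1 - step * curvature) each iteration, so |1 - step * curvature| > 1
    means divergence: the stability threshold is step = 2 / curvature.
    """
    x = x0
    for _ in range(iters):
        x -= step * curvature * x
    return x

# Below the threshold (step < 2 / curvature) the iterates converge to 0...
small = gd_quadratic(1.0, curvature=1.0, step=0.5, iters=50)
# ...above it they oscillate in sign with growing amplitude.
large = gd_quadratic(1.0, curvature=1.0, step=2.5, iters=50)
```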

Evening Poster Session / Networking

Wild refitting for black box prediction
Speaker: Martin Wainwright

Abstract:

Obtaining inferential guarantees on the performance of a prediction method is essential in practice. However, many modern predictive methods are opaque, so a statistician is limited to querying their predicted values (with no further insight into their properties). At the same time, such queries can be computationally expensive, so inferential methods should make only a small number of queries. In this talk, we describe and analyze a method for black-box inference that meets these desiderata, and illustrate its performance on motion reconstruction and image denoising.


Fairness for automated decision making - more challenges than definite answers
Speaker: Shai Ben-David

Abstract:

Algorithmic tools for decision making show up in more and more areas that have significant impact on individuals or society. Along with the obvious benefits that such tools bring, questions about their "fairness" are of great concern. Can we trust ML-based tools for deciding acceptance of students to a program? For deciding to whom a bank should give a loan, and under what conditions? For deciding which criminal suspect should be released on bail? While traditionally such questions were discussed by philosophers and social scientists, the introduction of algorithmic decision support requires consideration from a computer science perspective. Answers are far from obvious and are the subject of heated discussion: Can fairness be formally defined? What causes unfairness or bias in automated decision making? How can algorithm designers mitigate such concerns? In fact, this talk will be more about raising questions (and awareness) than about offering definite answers.

When does allocation require prediction?
Speaker: Rediet Abebe

Abstract:

TBD


On the merits of economic modeling for understanding the long-term impacts of AI regulation
Speaker: Hoda Heidari

Abstract:

The new paradigm in AI is that of General-purpose models: A firm releases a large, pretrained model, designed to be adapted and tweaked by other entities to perform domain-specific functions and create economic value. In this talk, I will explore how we can use tools from economics to understand the resulting market dynamics and the strategic behavior of key players in response to regulatory choices around AI (e.g., on safety and openness). I will end the talk with a discussion of merits and limitations of mathematical modeling as a tool for analyzing the broader, long-term impacts of AI in society.

Evening Panel Discussion

Interactive Decision-Making via Autoregressive Generation
Speaker: Hongseok Namkoong

Abstract:

AI agents interacting with the real world have to grapple with a perpetual lack of data in ever-changing environments. Interactive decision-making requires going beyond knowledge distillation: an intelligent agent must comprehend its own uncertainty and actively gather information to resolve it. Despite exhibiting impressive capabilities in basic knowledge work, state-of-the-art AI systems struggle to articulate their own uncertainty; for example, OpenAI recently noted that its latest agentic system “DeepResearch often fails to convey uncertainty accurately”. This talk covers a series of recent works that tackle the central challenge of uncertainty quantification in natural-language-based interactive decision-making problems. Instead of modeling latent environment parameters, we view uncertainty as arising from missing future outcomes and quantify it through autoregressive sequence generation: iteratively predicting the next outcome given the past. By adapting to new information through in-context learning rather than cumbersome posterior inference, our approach seamlessly scales to problems involving unstructured data, e.g., adaptive student assessment involving text and images. Formally, we establish a reduction from online decision-making to offline next-outcome prediction, enabling us to leverage the enormous datasets and computational resources dedicated to improving sequence prediction for interactive decision-making tasks.
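To make the "uncertainty from missing future outcomes" idea concrete, here is a hypothetical toy sketch (not the authors' method) for a stream of binary outcomes: impute many possible futures autoregressively, using Laplace's rule of succession as the one-step-ahead predictor, and read off uncertainty from the spread of the imputed long-run means:

```python
import random

def imputed_means(observed, horizon, n_samples, seed=0):
    """Quantify uncertainty about a Bernoulli mean by autoregressively
    imputing missing future outcomes and recording, for each imputed
    future, the long-run empirical mean of observed + imputed outcomes."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        ones, total = sum(observed), len(observed)
        for _ in range(horizon):
            # Laplace's rule of succession as the next-outcome predictor.
            p_next = (ones + 1) / (total + 2)
            ones += 1 if rng.random() < p_next else 0
            total += 1
        means.append(ones / total)
    return means
```

The spread of the returned means shrinks as more outcomes are observed, so the agent's uncertainty is represented entirely through next-outcome prediction, with no explicit posterior over latent parameters.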


Collaborative Prediction via Tractable Agreement Protocols
Speaker: Surbhi Goel

Abstract:

Designing effective collaboration between humans and AI systems is crucial for leveraging their complementary abilities in complex decision tasks. But how should agents possessing unique, private knowledge (like a human expert and an AI model) interact to reach decisions better than either could alone? If they were perfect Bayesians with a shared prior, Aumann's classical agreement theorem suggests that conversation leads, via agreement, to a prediction that improves accuracy. However, this relies on implausible assumptions about the parties' knowledge and computational power. We show how to recover and generalize these guarantees using only computationally and statistically tractable assumptions. We develop efficient "collaboration protocols" in which parties iteratively exchange only low-dimensional information (their current predictions or best-response actions) without needing to share underlying features. These protocols are grounded in conditions such as conversation calibration/swap regret, which relax full Bayesian rationality and can be enforced computationally efficiently. First, we prove that this simple interaction leads to fast convergence to agreement, generalizing quantitative bounds even to high-dimensional and action-based settings. Second, we introduce a weak learning condition under which the agreement process inherently aggregates the parties' distinct information: agents following our protocols arrive at final predictions that are provably competitive with an optimal predictor having access to their joint features. Together, these results offer a new, practical foundation for building systems that achieve the power of pooled knowledge through tractable interaction alone.

Afternoon Free Time
Evening Poster Session / Networking

Feature learning and the linear representation hypothesis for monitoring and steering LLMs
Speaker: Mikhail Belkin

Abstract:

A trained Large Language Model (LLM) contains much of human knowledge. Yet it is difficult to gauge the extent or accuracy of that knowledge, as LLMs do not always "know what they know" and may even be unintentionally or actively misleading. In this talk I will discuss feature learning, introducing Recursive Feature Machines, a powerful method originally designed for extracting relevant features from tabular data. I will demonstrate how this technique enables us to detect and precisely guide LLM behaviors toward almost any desired concept by manipulating a single fixed vector in the LLM activation space.


Causality and Robustness
Speaker: Jonas Peters

Abstract:

TBD

When We Talk About Replicability, What Are We Talking About?
Speaker: Amin Karbasi

Abstract:

Lack of replicability in experiments has been a major issue, usually referred to as the reproducibility crisis, in many scientific areas such as biology, chemistry, and artificial intelligence. Indeed, the results of a survey that appeared in Nature are very worrisome: more than 70% of the researchers who participated could not replicate other researchers' experimental findings, while over half could not even replicate their own conclusions. In its simplest form, replicability requires that if two different groups of researchers carry out an experiment using the same methodology but different samples from the same population, the two outcomes of their studies should be statistically indistinguishable. In this talk, we investigate this notion in the context of machine learning and characterize the learning problems for which statistically indistinguishable learning algorithms exist.
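For concreteness, one standard formalization of this notion (the ρ-replicability definition of Impagliazzo, Lei, Pitassi, and Sorrell, STOC 2022) reads:

```latex
% An algorithm $A$ is $\rho$-replicable if, with shared internal randomness
% $r$ and two independent samples $S_1, S_2$ drawn i.i.d.\ from the same
% distribution $D$,
\Pr_{S_1, S_2 \sim D^n,\; r}\bigl[\, A(S_1; r) = A(S_2; r) \,\bigr]
\;\ge\; 1 - \rho .
```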


Replicability: A computational view
Speaker: Jessica Sorrell

Abstract:

Connections between replicability and differential privacy have helped us establish boundaries on the statistical efficiency of replicable PAC learning, but the limits of computationally efficient replicability are somewhat less clear. In this talk, we'll discuss the conditions under which replicable PAC learning is computationally tractable, as well as cryptographic assumptions that separate differential privacy and replicability.

Evening Departure

Organizing Committee

Location

Monte Verità Congress Center

Strada Collina 84
CH-6612 Ascona

Registration

Registration Fee: 350 CHF per person.

Please note that:

  • Spots are limited and registration will be processed on a first-come, first-served basis.
  • Registration is binding and cannot be canceled or refunded.

In addition to the main sessions, we will host dedicated poster sessions and short contributed talks. All registered participants will have the opportunity to submit an abstract of the work they would like to present; we will select a few of these to be presented as contributed talks, and all others will be featured in the poster sessions. The call for abstracts will open after the registration deadline.

Registration will close on September 15th, 2025 (extended from the original deadline of August 31st).

Registration Link


Dear Participant, please follow the two steps below to register for the conference and book your meals and accommodation at Monte Verità Centre.

Please note that payments can only be made via credit card, TWINT, Apple Pay, or Google Pay.

  1. Step 1 - registration and payment of the conference registration fee

    Fill out and submit the registration form:
    Link: https://www.bi.id.ethz.ch/csfweb/faces/anonymous/event/registerparticipant.xhtml?eventtype=generalevent&eventid=3l3o3e3z-fe7tfa-lny6q7mk-1-lpmyjtl6-62r

    Once you have successfully paid the registration fee, you will receive two emails:

    1. The first email will be sent by the Saferpay platform, confirming payment of the registration fee. (Please check your spam folder.)
    2. The second email, from the CSF platform, will contain information for the booking of your meals and the bedroom.

  2. Step 2 – booking and payment of the meals & bedroom fee

    Link: http://shop2.monteverita.org/en/

    Enter the event code 728084.

    Book your meals and bedroom at the latest 15 days prior to arrival; after that date the platform closes.

    Meals:
    • Please note that all participants must book the meal package at Restaurant Monte Verità, even if staying at an external hotel.
    • In case you are not attending the whole conference, it is also possible to book the meals only for the duration of your stay.

    Bedroom:
    • The number of single rooms is limited, and some have a shared bathroom.
    • If you wish to share a twin room with a colleague, please provide both names when booking.
    • If you wish to book a shared double room with an accompanying person (not an active participant), please provide both names when booking. Be aware that accompanying persons pay the standard room price and are not entitled to the CHF 28.- price reduction per person and night granted by CSF/ETHZ to active participants.
    • Rooms can only be booked for the conference duration via this platform. If you wish to book a pre- and/or a post-night, please contact the Hotel Reception Monte Verità (info@monteverita.org).

    Accommodation Options

    In case there is no availability or in case you do not wish to stay at Hotel Monte Verità, you can look for another hotel in the nearby Ascona village (15 to 20 min. walking distance from Monte Verita Centre).


Contact email: mml2025@ethz.ch

Sponsors

This symposium is generously sponsored by ETH Institute of Machine Learning, Swiss National Science Foundation, Congressi Stefano Franscini, and BCAM.

Supported by

This symposium is supported by ELLIS and the ETH AI Center.