Does efficient, private, agnostic learning imply efficient, agnostic online learning?
Users of online services today must trust platforms with their personal data. Platforms can choose to enable privacy by default through methods such as differential privacy but the correctness and efficiency incentives are lacking.
Traditionally, models are trained offline in batches of data and re-deployed, but with the growth of IoT, and the growing need to compute at the edge, the online model which incorporates new data points as soon as they are observed is becoming increasingly appealing.
The long-term goal for this research is to either show there is an efficient equivalence between PAC (probably approximately correct) learning and private online learning or to create a framework for privacy that ensures this equivalence – thus motivating privacy by default.
This project has taken a look at recent work in the field of Computational Learning Theory. In 2019 it was shown that the problem of privately learning a class of hypotheses was equivalent to the problem of online learning it. It has also been shown that the equivalence is efficient if one assumes the pure private learner is highly sample efficient.
This means, at least in the pure learning case, that there is no reason not to privately learn a hypothesis if you can online-learn it. It also means that success in the online learning framework means success in the private learning framework.
However, since there may not exist a hypothesis that perfectly labels the data, the agnostic setting is a much more realistic framework to operate in. It is still an open problem as to whether the same process, by which we can reduce an efficient, private PAC learner into an efficient online learner, works in the agnostic setting. Further study into this qualitative relationship lays the groundwork to allow the knowledge from the field of online learning to design better differentially private learning algorithms.
We made headway into resolving this open question by attempting to derive an efficient black-box reduction from agnostic, private PAC learning to online learning.