Nov 26, 2024 1 min read Cryptography

SoK: cryptographic methods for secure machine learning audits

Together with my co-author Sam Holton, I'm excited to share a Systematization of Knowledge (SoK) reviewing techniques for securely auditing machine learning (ML) models: https://drive.google.com/file/d/1PirVBsvi6URpoOW5xF4dfhYsoOiQH7c3/view?usp=drive_link

We find that while generating privacy-preserving proofs about the training phase is infeasible, proving properties about fine-tuning or inference passes is feasible, even for large language models. We review improvements in the speed of zero-knowledge proofs, which allow an auditor to check arbitrary properties of a model without having access to the model weights themselves. This allows audits to be performed with the rigor of white-box audits while maintaining the privacy of black-box audits.

We also find that publishing a paper to arXiv in certain categories is non-trivial due to its endorsement system, and we look forward to one day fulfilling the necessary requirements.
