SoK: cryptographic methods for secure machine learning audits

Together with my co-author Sam Holton, I'm excited to share a Systematization of Knowledge (SoK) reviewing techniques for securely auditing machine learning (ML) models: https://drive.google.com/file/d/1PirVBsvi6URpoOW5xF4dfhYsoOiQH7c3/view?usp=drive_link

We find that while generating privacy-preserving proofs about the training phase is infeasible, proving properties about fine-tuning or inference passes is feasible, even for large language models. We review recent improvements in the speed of zero-knowledge proofs, which allow an auditor to check arbitrary properties of a model without direct access to its weights. This lets audits combine the rigor of white-box audits with the privacy of black-box audits.
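As a rough illustration of the commit-then-prove pattern behind these audits, here is a minimal Python sketch. Everything in it is a hypothetical stand-in of my own: the hash commitment and the plain transcript replace a real zero-knowledge backend (e.g. a zkSNARK prover), which would let the auditor verify the same claim without ever seeing the weights.

```python
# Minimal sketch of a commit-then-prove audit flow (NOT zero-knowledge:
# the transcript here reveals the weights; a real system would replace
# make_proof/check_proof with a ZK proof system so the verifier learns
# nothing beyond the claimed property).
import hashlib
import json

def commit(weights):
    """Model owner publishes a binding commitment to the weights."""
    return hashlib.sha256(json.dumps(weights).encode()).hexdigest()

def predict(weights, x):
    """Toy linear model standing in for a real inference pass."""
    return sum(w * xi for w, xi in zip(weights, x))

def make_proof(weights, x, commitment):
    """Prover: claim an output for input x under the committed weights."""
    return {
        "input": x,
        "output": predict(weights, x),
        "commitment": commitment,
        "weights": weights,  # a ZK proof would omit this entirely
    }

def check_proof(proof):
    """Auditor: check the claim is consistent with the commitment."""
    ok_commit = commit(proof["weights"]) == proof["commitment"]
    ok_output = predict(proof["weights"], proof["input"]) == proof["output"]
    return ok_commit and ok_output

weights = [0.5, -1.0, 2.0]
c = commit(weights)
proof = make_proof(weights, [1.0, 2.0, 3.0], c)
assert check_proof(proof)      # honest prover passes
proof["weights"][0] = 9.9      # tampered weights
assert not check_proof(proof)  # auditor rejects: commitment mismatch
```

The binding commitment is what gives the audit white-box rigor: every claim is pinned to one fixed set of weights, so the model owner cannot answer different audit queries with different models.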

We also find that submitting a paper to arXiv in certain categories is non-trivial due to its endorsement system, and we look forward to one day fulfilling the necessary requirements.