As machine learning (ML) applications become increasingly prevalent, protecting the confidentiality of ML models becomes paramount. One way to protect model confidentiality is to restrict access to the model to well-defined prediction APIs. Nevertheless, prediction APIs still leak information, which makes model extraction attacks possible. In a model extraction attack, the adversary has access only to the prediction API of a target model and queries it to extract information about the model's internals. The adversary uses this information to gradually train a substitute model that reproduces the predictive behaviour of the target model.
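The attack loop described above can be illustrated with a toy sketch. It is not the method of any of the papers listed on this page: the target here is a hypothetical linear classifier, `prediction_api`, `extract_substitute`, and `agreement` are names invented for illustration, the queries are plain Gaussian samples, and the substitute is trained in a single pass over one batch of queries rather than over several rounds of querying. The only thing it is meant to show is that label-only API access can be enough to train a substitute that agrees closely with the target.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical target model: a linear classifier whose parameters stay hidden.
_secret_w = rng.normal(size=5)
_secret_b = 0.3


def prediction_api(x):
    """Black-box API (assumed): returns only a 0/1 label for each row of x."""
    return (x @ _secret_w + _secret_b > 0).astype(int)


def extract_substitute(api, n_queries=2000, dim=5, epochs=300, lr=0.5):
    """Query the API on synthetic inputs and fit a logistic-regression substitute."""
    queries = rng.normal(size=(n_queries, dim))  # inputs chosen by the adversary
    labels = api(queries)                        # the only information the API leaks
    w, b = np.zeros(dim), 0.0
    for _ in range(epochs):
        # Plain gradient descent on the logistic loss over the labelled queries.
        probs = 1.0 / (1.0 + np.exp(-(queries @ w + b)))
        grad = probs - labels
        w -= lr * (queries.T @ grad) / n_queries
        b -= lr * grad.mean()
    return w, b


def agreement(api, w, b, n_test=5000, dim=5):
    """Fraction of fresh inputs on which the substitute and the target agree."""
    test = rng.normal(size=(n_test, dim))
    substitute_labels = (test @ w + b > 0).astype(int)
    return float(np.mean(api(test) == substitute_labels))


w_sub, b_sub = extract_substitute(prediction_api)
fidelity = agreement(prediction_api, w_sub, b_sub)
print(f"substitute agrees with target on {fidelity:.1%} of fresh inputs")
```

The substitute never reads `_secret_w` or `_secret_b`; everything it learns comes from the labels returned by `prediction_api`. Real attacks against deep networks need far more careful query selection, and the number of queries they issue is one of the signals a defender can monitor.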


Conference paper publications

  • Buse Gul Atli, Sebastian Szyller, Mika Juuti, Samuel Marchal, N. Asokan: Extraction of Complex DNN Models: Real Threat or Boogeyman? AAAI-EDSMLS. arXiv preprint arXiv:1910.05429
  • Mika Juuti, Buse Gul Atli, N. Asokan: Making targeted black-box evasion attacks effective and efficient. AISec 2019. arXiv preprint arXiv:1906.03397
  • Mika Juuti, Sebastian Szyller, Alexey Dmitrenko, Samuel Marchal, N. Asokan: PRADA: Protecting against DNN Model Stealing Attacks. IEEE Euro S&P 2019. arXiv preprint arXiv:1805.02628

Technical reports

  • Sebastian Szyller, Buse Gul Atli, Samuel Marchal, N. Asokan: DAWN: Dynamic Adversarial Watermarking of Neural Networks. arXiv preprint arXiv:1906.00830
  • Buse Gul Atli, Yuxi Xia, Samuel Marchal, N. Asokan: WAFFLE: Watermarking in Federated Learning. arXiv preprint arXiv:2008.07298


  • Extraction of Complex DNN Models: Brief overview [pdf], AAAI-EDSMLS talk [pdf]
  • Black-box evasion attacks: AISec talk [pdf]
  • PRADA: Euro S&P talk [pdf]

Demos & Posters

Source code