New Technique Developed by Australian Researchers to Prevent Unauthorized AI Learning from Images
A groundbreaking technique created by Australian researchers could prevent unauthorized artificial intelligence (AI) systems from learning from photos, artwork, and other image-based content. The method, developed by CSIRO, Australia’s national science agency, in collaboration with the Cyber Security Cooperative Research Centre (CSCRC) and the University of Chicago, subtly alters content so that it becomes unreadable to AI models while remaining visually unchanged to the human eye.
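The article does not describe the alteration itself, but the general idea of an imperceptible protective change can be illustrated with a minimal sketch. The code below assumes a simple additive perturbation clipped to a small per-pixel budget; the `protect_image` helper, the `epsilon` budget, and the random noise pattern are illustrative assumptions, not CSIRO's actual method.

```python
# Minimal illustrative sketch: add a small, bounded perturbation to an image
# so it looks unchanged to a human viewer. This is NOT the CSIRO method; the
# perturbation pattern, epsilon budget, and helper name are assumptions.
import numpy as np
from PIL import Image

def protect_image(path: str, out_path: str, epsilon: int = 4, seed: int = 0) -> None:
    """Apply a hypothetical protective perturbation bounded by +/- epsilon per pixel."""
    rng = np.random.default_rng(seed)
    img = np.asarray(Image.open(path).convert("RGB"), dtype=np.int16)

    # Placeholder perturbation: bounded random noise. A real unlearnability
    # defence would compute this pattern with respect to a learning objective.
    delta = rng.integers(-epsilon, epsilon + 1, size=img.shape, dtype=np.int16)

    protected = np.clip(img + delta, 0, 255).astype(np.uint8)
    Image.fromarray(protected).save(out_path)

protect_image("photo.jpg", "photo_protected.png")
```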
One of the key applications of this technology is protecting sensitive data, such as satellite imagery or cyber threat information, from being absorbed by AI models, particularly within defense organizations. The breakthrough could also help artists, organizations, and social media users safeguard their work and personal data from being used to train AI systems or create deepfakes. For instance, a social media user could automatically apply a protective layer to their images before posting, preventing AI systems from learning facial features for deepfake manipulation.
This technique sets a limit on what an AI system can learn from protected content and provides a mathematical guarantee that the protection holds even against adaptive attacks or retraining attempts. Dr. Derui Wang, a scientist at CSIRO, emphasized that the method offers a higher level of certainty for anyone sharing content online.
“Our approach is distinct in that we can mathematically ensure that unauthorized machine learning models are unable to learn beyond a specified threshold from the content. This offers a robust safeguard for social media users, content creators, and organizations,” Wang explained.
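Wang's statement describes the guarantee only at a high level. One schematic way to express a learnability bound of this kind, using illustrative symbols rather than the paper's actual formulation, is that for any learner $\hat\theta$ trained (or retrained) on the protected dataset $\tilde{D}$,

$$\mathbb{E}_{(x,y)\sim\mathcal{D}}\big[\ell\big(f_{\hat\theta(\tilde{D})}(x),\,y\big)\big] \;\ge\; \tau,$$

that is, no training procedure applied to the protected data can drive the expected loss on the underlying distribution $\mathcal{D}$ below a chosen threshold $\tau$, mirroring the claim that unauthorized models "are unable to learn beyond a specified threshold."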
Moreover, the application of this technique can be automated on a large scale. Wang stated, “A social media platform or website could integrate this protective layer into every uploaded image, potentially mitigating the proliferation of deepfakes, reducing instances of intellectual property theft, and empowering users to maintain control over their content.”
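As a rough illustration of that kind of automation, the sketch below wires the same hypothetical perturbation into an upload handler; the handler signature, the PNG output choice, and the integration point are assumptions for this example, not an existing platform API.

```python
# Illustrative sketch of applying a protective layer at upload time.
# The handler signature and storage format are assumptions for this example.
import io
import numpy as np
from PIL import Image

def handle_upload(raw_bytes: bytes, epsilon: int = 4) -> bytes:
    """Return the uploaded image with a hypothetical protective perturbation applied."""
    img = np.asarray(Image.open(io.BytesIO(raw_bytes)).convert("RGB"), dtype=np.int16)
    delta = np.random.default_rng().integers(-epsilon, epsilon + 1, size=img.shape, dtype=np.int16)
    protected = Image.fromarray(np.clip(img + delta, 0, 255).astype(np.uint8))

    buf = io.BytesIO()
    protected.save(buf, format="PNG")   # store the protected copy, not the original
    return buf.getvalue()
```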
While the current implementation applies to images, the researchers plan to extend it to text, music, and video. Although the technique is still at the research stage, with results validated in a controlled laboratory environment, the code is publicly accessible on GitHub for academic use. The research team is actively seeking partners in sectors including AI safety and ethics, defense, cybersecurity, and academia.
The paper detailing this technique, titled “Provably Unlearnable Data Examples,” was presented at the 2025 Network and Distributed System Security Symposium (NDSS) and was honored with the Distinguished Paper Award.