An emerging field of computer science known as machine unlearning seeks ways to induce selective amnesia in artificial intelligence software, with the goal of removing all traces of a specific person or data point from a machine learning system without affecting its performance.
If it can be put into practice, machine unlearning could give individuals greater control over their data and the value derived from it, allowing someone to take back both their data and a company's ability to profit from it.
Technology companies spend millions of dollars training machine learning algorithms to recognize faces or rank social posts, because algorithms can often solve a problem faster than human programmers. But once trained, a machine learning system is not easy to change. The traditional way to eliminate the influence of a particular data point is to rebuild the system from scratch, a potentially costly endeavor.
Research on machine unlearning is partly motivated by the growing attention paid to the impact of artificial intelligence on privacy. Data regulators around the world have long had the power to force companies to delete information they have obtained illegally. Recently, U.S. and European regulators have said that owners of AI systems must sometimes go further and delete a system that was trained on sensitive data.
The small field of machine-unlearning research deals with various practical and mathematical questions raised by these regulatory shifts. Researchers have shown that they can make machine learning algorithms forget under certain conditions, but the technology is not yet ripe for wider use.
One promising method proposed by researchers at the University of Toronto and the University of Wisconsin-Madison in 2019 is to split the source data for a new machine learning project into several parts, which are then processed separately before the results are incorporated into the final machine learning model. If a data point needs to be deleted later, only a fraction of the original input data needs to be reprocessed.
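The sharding idea above can be illustrated with a minimal sketch. This is a hypothetical simplification, not the researchers' actual implementation: the "model" here is just the mean of each shard's values, and names like `train`, `build`, and `forget` are illustrative assumptions. The point it demonstrates is structural: deleting a point triggers retraining of only the one shard that contained it.

```python
# Sketch of shard-based unlearning (hypothetical, simplified).
# Each shard is trained independently; predictions aggregate the
# per-shard models, so a deletion touches only one shard.

from statistics import mean

NUM_SHARDS = 4

def train(points):
    # Stand-in "model": the mean of the shard's points.
    return mean(points) if points else 0.0

def build(data):
    # Partition the data into disjoint shards and train each separately.
    shards = [data[i::NUM_SHARDS] for i in range(NUM_SHARDS)]
    models = [train(s) for s in shards]
    return shards, models

def predict(models):
    # Aggregate the per-shard models (here: average their outputs).
    return mean(models)

def forget(shards, models, point):
    # Delete `point` and retrain only the shard that contained it,
    # reprocessing roughly 1/NUM_SHARDS of the original data.
    for i, shard in enumerate(shards):
        if point in shard:
            shard.remove(point)
            models[i] = train(shard)
            break
    return models

data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
shards, models = build(data)
models = forget(shards, models, 5.0)
```

In this toy version, forgetting `5.0` retrains one shard of two points instead of all eight; in a real system the same structure limits reprocessing to a fraction of the training data.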
Aaron Roth and collaborators at Penn, Harvard, and Stanford recently demonstrated a flaw in this approach, showing that the unlearning system would malfunction if deletion requests arrived in a particular sequence, whether by accident or through the actions of a malicious actor. They also showed how the problem could be mitigated.
It will take superb engineering for technology companies to implement machine unlearning as a way to give people more control over the algorithmic fate of their personal data, and even then, the technology may not significantly change the privacy risks of the AI age.
For more information, read the original story in Ars Technica.