Catastrophic interference is a phenomenon in which an AI model forgets what it has previously learned. Find out why it's a serious problem in Machine Learning, and how to remedy it!
With the rise of tools like ChatGPT, artificial intelligence is now ubiquitous.
This technology is destined to occupy an increasingly important place in our personal and professional lives.
However, AI is not infallible. Like humans, it can make mistakes and even have memory lapses.
Unfortunately, forgetting can spell disaster for a neural network. In the field of Machine Learning, this phenomenon is known as “catastrophic interference”.
What is a neural network?
Neural networks are a popular form of Machine Learning, widely used for prediction.
As the name suggests, this type of system is inspired by the way the human brain learns new information.
Just as neurons in the brain are interconnected, the network links together many simple mathematical operations.
And just as the brain reacts when the senses perceive a phenomenon, the artificial neural network is activated when it receives data.
Certain pathways are activated and others inhibited depending on the nature of the information received. At the end of the process, an output node produces new information, such as a prediction.
For example, when you see a dog, your brain immediately identifies it. A neural network can also learn to recognize a dog.
However, to do so, it must first be trained to distinguish a dog from a cat. It therefore needs to be fed with data.
During this training phase, the neural network is fed data sets. To continue with the same example, this could be a series of images with a caption indicating whether they show a dog or a cat.
Subsequently, another dataset is used to test the network to see if the training has been successful. This is the prediction phase.
If the neural network achieves a sufficient rate of accurate predictions, it is ready for deployment. However, the test is not always conclusive…
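These two phases can be sketched with a deliberately tiny model. The data below is entirely synthetic (two made-up measurements per animal, standing in for images), and the “network” is a single logistic unit, but the workflow is the same: train on labeled examples, then measure accuracy on held-out data.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical toy features standing in for "dog vs cat" images:
# two measurements per animal, chosen so the classes are well separated.
cats = rng.normal(loc=[1.0, 1.0], scale=0.3, size=(50, 2))   # label 0
dogs = rng.normal(loc=[3.0, 3.0], scale=0.3, size=(50, 2))   # label 1
X = np.vstack([cats, dogs])
y = np.array([0] * 50 + [1] * 50)

# Split into a training set and a held-out test set.
idx = rng.permutation(len(X))
train, test = idx[:80], idx[80:]

# Training phase: fit a tiny logistic unit with gradient descent.
w, b = np.zeros(2), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(X[train] @ w + b)))        # sigmoid activation
    w -= 0.5 * X[train].T @ (p - y[train]) / len(train)
    b -= 0.5 * np.mean(p - y[train])

# Prediction phase: measure accuracy on unseen data.
preds = (1 / (1 + np.exp(-(X[test] @ w + b))) > 0.5).astype(int)
accuracy = np.mean(preds == y[test])
print(f"test accuracy: {accuracy:.2f}")
```

If the accuracy on the held-out set is high enough, the model is considered ready; if not, training continues or the data is revisited.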
What is catastrophic interference?
One important feature distinguishes the human brain from artificial neural networks: plasticity.
Plasticity is the human capacity for continuous learning. Thanks to this, after learning to distinguish dogs from cats, we also learn to recognize other animals, plants, trees and the whole world around us.
Neural Networks, on the other hand, are more limited. When learning a new task, they tend to forget what they’ve learned before.
For example, in a famous experiment conducted by McCloskey and Cohen in 1989, the researchers trained a neural network to solve arithmetic problems involving the number 1.
They then fed the model another series of problems, this time containing the number 2. As a result, the neural network learned to solve problems containing a 2, but forgot how to solve those with the number 1.
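The effect can be reproduced in miniature. The sketch below uses a deliberately tiny linear model rather than a real network: trained first on “add 1” problems and then on “add 2” problems only, its shared parameters are overwritten by the second task, and its error on the first task jumps.

```python
import numpy as np

def train(w, b, X, y, steps=2000, lr=0.05):
    """Fit y ~ w*x + b by gradient descent on the squared error."""
    for _ in range(steps):
        err = (w * X + b) - y
        w -= lr * np.mean(err * X)
        b -= lr * np.mean(err)
    return w, b

def mse(w, b, X, y):
    return float(np.mean(((w * X + b) - y) ** 2))

X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
task_a = X + 1.0   # "ones" problems: add 1 to the input
task_b = X + 2.0   # "twos" problems: add 2 to the input

w, b = train(0.0, 0.0, X, task_a)
err_a_before = mse(w, b, X, task_a)   # near zero: task A is learned

w, b = train(w, b, X, task_b)         # sequential training on task B only
err_a_after = mse(w, b, X, task_a)    # task A has been overwritten

print(f"error on task A: {err_a_before:.4f} before, {err_a_after:.4f} after task B")
```

The same parameters serve both tasks, so optimizing them for the second task destroys the configuration that solved the first.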
How can we explain this phenomenon? During training, the neural network adjusts the connection weights between its nodes based on the data supplied to it.
However, when new information is fed in, those same shared weights are adjusted again. The updates required for the new task can overwrite the configurations that encoded the old one, which is why an algorithm sometimes “forgets” the tasks it was previously trained on.
The extent of this amnesia can vary. It may be as simple as an increase in the margin of error, but it can go as far as completely forgetting a previously learned task.
Note that catastrophic interference can occur even when the data sets successively fed to the neural network are relatively similar.
The layers between a network’s input and output are hidden and operate as a black box. It is therefore hard to predict which new data will disrupt previously learned behavior.
Why is this a big problem?
Most of today’s neural networks are trained using supervised learning. Engineers manually select and clean the data with which they feed the network, to avoid biases and other concerns that may arise from datasets.
With this type of neural network, catastrophic interference is not really a problem. However, as Machine Learning develops, agents are approaching autonomous, continuous learning.
Such neural networks are able to continue learning from new data, without even needing to be supervised by humans.
This development offers tremendous opportunities, but also introduces new risks. It is no longer really possible to know what type of data the network is using to learn.
And if it learns from data far removed from its initial training, this can lead to catastrophic forgetting.
How to avoid catastrophic interference?
There are many ways of minimizing the risk of catastrophic interference.
One approach is to add regularization terms such as L1 or L2 to control the complexity of the model and reduce its sensitivity to minor changes in the input data.
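As a minimal illustration (toy synthetic data, plain gradient descent), an L2 penalty simply adds a term proportional to the weights to the gradient, shrinking them toward zero and making the model less sensitive to any single input:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
true_w = np.array([2.0, -3.0, 1.0, 0.5, 4.0])
y = X @ true_w + rng.normal(scale=0.1, size=100)

def fit(X, y, l2=0.0, steps=1000, lr=0.1):
    """Linear regression by gradient descent; l2 is the L2 penalty strength."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y) + l2 * w   # the l2*w term shrinks weights
        w -= lr * grad
    return w

w_plain = fit(X, y, l2=0.0)
w_reg = fit(X, y, l2=1.0)

print(f"weight norm without L2: {np.linalg.norm(w_plain):.2f}, "
      f"with L2: {np.linalg.norm(w_reg):.2f}")
```

An L1 penalty works the same way but adds a term proportional to the sign of the weights, pushing some of them exactly to zero.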
Dropout involves randomly deactivating a fraction of neurons at each training step. This prevents the model from relying too heavily on any specific neuron.
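A sketch of the mechanism in isolation, using the “inverted dropout” variant common in practice: surviving activations are rescaled so their expected value is unchanged, and nothing is dropped at inference time.

```python
import numpy as np

rng = np.random.default_rng(1)

def dropout(activations, rate, training=True):
    """Inverted dropout: zero out `rate` of the units and rescale the rest."""
    if not training:
        return activations            # at inference time, use all neurons
    mask = rng.random(activations.shape) >= rate
    return activations * mask / (1.0 - rate)

h = np.ones(10_000)                   # a layer of activations, all equal to 1
h_train = dropout(h, rate=0.5)

# Roughly half the units are silenced; the survivors are scaled by 1/(1-rate),
# so the mean activation stays close to 1.
print(f"fraction dropped: {(h_train == 0).mean():.2f}, mean: {h_train.mean():.2f}")
```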
In addition, training data augmentation techniques such as rotation, translation or flipping help the model to generalize better on unseen data.
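For example, a handful of label-preserving variants can be generated from a single image (a randomly generated stand-in here) with basic array operations:

```python
import numpy as np

rng = np.random.default_rng(2)
image = rng.integers(0, 256, size=(28, 28))   # stand-in for a training image

# Each variant keeps the label but changes the pixels, so the model
# sees more varied examples of the same class.
flipped_h = np.flip(image, axis=1)            # horizontal flip
flipped_v = np.flip(image, axis=0)            # vertical flip
rotated = np.rot90(image)                     # 90-degree rotation
shifted = np.roll(image, shift=3, axis=1)     # small horizontal translation

augmented = [image, flipped_h, flipped_v, rotated, shifted]
print(len(augmented), "training examples from one original image")
```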
Transfer learning, based on models pre-trained on similar tasks, can also help initialize the model with weights that have already learned important features to speed up learning and reduce catastrophic interference.
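A minimal sketch of the idea, with a hypothetical “pretrained” layer (random stand-in values here; in practice they would be loaded from a saved model) kept frozen while only a new output head is trained:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical weights from a model pre-trained on a similar task
# (random stand-ins here; in practice you would load a checkpoint).
pretrained = rng.normal(size=(2, 8)) * 0.5    # frozen feature extractor

X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)     # synthetic classification task

feats = np.tanh(X @ pretrained)   # features from the frozen layer
head = np.zeros(8)                # only this new head is trained
for _ in range(500):
    p = 1 / (1 + np.exp(-(feats @ head)))
    head -= 0.5 * feats.T @ (p - y) / len(y)

acc = np.mean(((feats @ head) > 0) == y)
print(f"accuracy with frozen pretrained features: {acc:.2f}")
```

Because the pretrained layer is never updated, what it learned cannot be overwritten by the new task.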
Similarly, progressive training involves training the model on progressively larger subsets of data. The model can then focus on simpler features before moving on to complex tasks.
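A rough sketch of this schedule, with a made-up difficulty score (distance from the decision boundary) used to feed the model growing subsets, easiest examples first:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 2))
y = (X[:, 0] > 0).astype(float)   # synthetic task: sign of the first feature

# Hypothetical difficulty score: points far from the boundary are "easier".
order = np.argsort(-np.abs(X[:, 0]))

w = np.zeros(2)
for frac in (0.25, 0.5, 1.0):     # progressively larger training subsets
    n = int(frac * len(X))
    Xs, ys = X[order[:n]], y[order[:n]]
    for _ in range(300):          # logistic regression by gradient descent
        p = 1 / (1 + np.exp(-(Xs @ w)))
        w -= 0.5 * Xs.T @ (p - ys) / n

acc = np.mean(((X @ w) > 0) == y)
print(f"accuracy after progressive training: {acc:.2f}")
```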
Another common trick is to train a new neural network with all the data simultaneously. This avoids the sequential learning that can lead to the overwriting of previously acquired knowledge.
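The contrast can be shown on toy “ones and twos” addition problems (all data synthetic): trained sequentially, a small linear model drifts away from the first task, while the same model trained on both data sets mixed together stays accurate on both.

```python
import numpy as np

rng = np.random.default_rng(5)

def make_task(first_operand, n=50):
    """Addition problems 'first_operand + b', with b drawn at random."""
    b = rng.uniform(0, 10, size=n)
    X = np.column_stack([np.full(n, float(first_operand)), b, np.ones(n)])
    return X, X[:, 0] + X[:, 1]   # target: the true sum

def train(w, X, y, steps=5000, lr=0.01):
    for _ in range(steps):
        w -= lr * X.T @ (X @ w - y) / len(y)
    return w

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

X1, y1 = make_task(1)   # "ones" problems
X2, y2 = make_task(2)   # "twos" problems

# Sequential training: ones first, then twos only.
w_seq = train(train(np.zeros(3), X1, y1), X2, y2)

# Simultaneous training: both data sets mixed from the start.
w_joint = train(np.zeros(3), np.vstack([X1, X2]), np.concatenate([y1, y2]))

print(f"sequential model, error on ones:   {mse(w_seq, X1, y1):.3f}")
print(f"simultaneous model, error on ones: {mse(w_joint, X1, y1):.3f}")
```

The simultaneous model balances both tasks at every step, so neither set of problems is overwritten.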
Some architectures are more resistant to forgetting, such as residual networks or progressive neural networks.
A judicious strategy is to create a backup of a neural network before re-training it. In the event of a problem, it is then possible to restore the previous version.
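A minimal version of this backup-and-restore strategy, using NumPy's archive format (the file name and weight shapes here are arbitrary examples):

```python
import os
import tempfile
import numpy as np

# Hypothetical trained weights for a small two-layer network.
weights = {"layer1": np.random.default_rng(6).normal(size=(4, 3)),
           "layer2": np.random.default_rng(7).normal(size=(3, 1))}

# Back up the weights before re-training.
backup_path = os.path.join(tempfile.mkdtemp(), "checkpoint.npz")
np.savez(backup_path, **weights)

# ... re-training happens here and, say, degrades the model ...
weights["layer1"] += 100.0   # simulate catastrophic interference

# In the event of a problem, restore the previous version.
restored = dict(np.load(backup_path))
print("layer1 restored:", np.allclose(restored["layer1"],
                                      weights["layer1"] - 100.0))
```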
In a study carried out at the end of 2022, researchers discovered that catastrophic interference can be avoided by letting neural networks rest. This confirms the resemblance to our brain, which needs sleep to consolidate memories!
Conclusion: Catastrophic interference, one of the many challenges of Machine Learning
Catastrophic interference is just one of the many challenges facing Machine Learning. It will be necessary to overcome these obstacles in order to reveal the full potential of artificial intelligence.
To gain expertise as a Machine Learning engineer, you can choose DataScientest training.
Our ML Engineer course enables you to learn how to develop AI systems and use large data sets to research, develop and generate algorithms capable of learning and predicting.
You’ll master the entire Machine Learning process, from algorithm design to deployment and production. This will enable you to deal with problems such as catastrophic interference.
As you progress through the various modules, you’ll acquire solid skills in Python programming, DataViz, Machine Learning, data engineering, DataOps and MLOps, as well as Business Intelligence.
At the end of the course, you’ll receive a diploma from Mines ParisTech PSL Executive Education, a level 7 RNCP36129 “Project Manager in Artificial Intelligence” certification (Master level) issued by the Collège de Paris and an AWS Certified Cloud Practitioner certification!
This state-recognized training program is eligible for funding options and can be taken entirely by distance learning in an intensive 7-month BootCamp, or part-time over 16 months. Discover DataScientest!