Meta-Learning in Neural Networks: A Gentle Introduction

Meta-learning is one of the most active research areas in the field of deep learning. Some schools of thought in the artificial intelligence (AI) community subscribe to the thesis that meta-learning is one step toward the discovery of artificial general intelligence (AGI). In recent years we have seen an explosion in meta-learning research and development. Nevertheless, some of the basic ideas of meta-learning are still widely misunderstood by scientists and engineers. With that in mind, I thought it would be a good idea to review some of the fundamental concepts and history of meta-learning, as well as some popular algorithms in the space.

The ideas of meta-learning can be traced back to 1979 and the work of Donald Maudsley, who described the new cognitive paradigm as "the process by which learners become aware of, and increasingly in control of, the habits of perception, inquiry, learning, and growth that they have internalized." A simpler definition can be found in the work of John Biggs (1985), who defined meta-learning as "being aware of and taking control of one's own learning." These definitions are accurate in terms of cognitive science, but they are a bit difficult to adapt to the workings of artificial intelligence (AI).

In the context of AI systems, meta-learning can simply be defined as the ability to acquire knowledge that generalizes across tasks. As humans, we can learn multiple tasks from minimal information: we can recognize a new type of object after seeing a single image of it, and we can learn complex, multi-step activities such as driving or piloting an airplane. Although AI agents can master very complex tasks, they require enormous amounts of training on each atomic subtask, and they remain remarkably bad at multitasking. Thus, the path to universal knowledge requires AI agents to "learn to learn" or, to use a fancier term, to meta-learn.

Types of meta-learning models

People learn through different methods adapted to specific circumstances. Similarly, not all meta-learning models use the same techniques. Some meta-learning models focus on optimizing neural network architectures, while others, such as OpenAI's Reptile, focus on learning parameter initializations that adapt quickly to new tasks. A recent research paper from the AI lab at the University of California, Berkeley, does a comprehensive job of cataloging the different types of meta-learning. Here are some of my favorites:

Few-shot meta-learning

The idea behind few-shot meta-learning is to build deep neural networks that can learn from minimalistic datasets, mimicking, for example, how children can learn to recognize objects after seeing only one or two pictures. The ideas of few-shot meta-learning have inspired techniques such as memory-augmented neural networks and one-shot generative models.
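
To make the few-shot setting concrete, here is a minimal sketch of how an N-way, K-shot training episode can be sampled. The toy dictionary dataset, the `sample_episode` helper, and the episode sizes are illustrative assumptions on my part, not a specific published method:

```python
# Minimal sketch of N-way K-shot episode sampling, assuming an in-memory
# dataset represented as a dict mapping class names to lists of examples.
import random

def sample_episode(dataset, n_way=5, k_shot=1, n_query=5):
    """Sample a few-shot episode: a small support set to learn from
    and a query set to evaluate on."""
    classes = random.sample(list(dataset.keys()), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):
        examples = random.sample(dataset[cls], k_shot + n_query)
        support += [(x, label) for x in examples[:k_shot]]
        query += [(x, label) for x in examples[k_shot:]]
    return support, query

# Example usage with a toy dataset of 10 classes, 20 examples each.
toy_dataset = {f"class_{c}": [f"img_{c}_{i}" for i in range(20)] for c in range(10)}
support_set, query_set = sample_episode(toy_dataset, n_way=5, k_shot=1, n_query=5)
print(len(support_set), len(query_set))  # 5 support examples, 25 query examples
```

A meta-learning model is then trained over many such episodes, so that it gets good at learning from tiny support sets rather than from one large dataset.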

Optimizer meta-learning

Optimizer meta-learning models focus on learning how to optimize a neural network so that it performs a target task better. These models typically involve one neural network that learns to adjust the hyperparameters or update rules of another neural network. An excellent example of optimizer meta-learning is the line of work on learning improved gradient-descent updates, such as the method published in this study.
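
As a rough illustration of the gradient-based flavor of this idea, the sketch below implements a MAML-style inner/outer loop on toy sine-wave regression tasks. The tiny functional MLP, the task distribution, and the learning rates are assumptions for illustration; the study linked above learns the optimizer itself with a recurrent network rather than a shared initialization:

```python
# Hedged sketch of gradient-based meta-learning in the spirit of MAML,
# applied to toy 1-D sine-wave regression tasks.
import torch

def forward(params, x):
    """Tiny 2-layer MLP written functionally so adapted weights can be plugged in."""
    w1, b1, w2, b2 = params
    h = torch.tanh(x @ w1 + b1)
    return h @ w2 + b2

def sample_task():
    """A random sine task y = a * sin(x + phase); returns a data sampler."""
    a, phase = float(torch.rand(1)) * 4 + 1, float(torch.rand(1)) * 3.14
    def data(batch=10):
        x = torch.rand(batch, 1) * 10 - 5
        return x, a * torch.sin(x + phase)
    return data

params = [(torch.randn(1, 40) * 0.1).requires_grad_(),
          torch.zeros(40, requires_grad=True),
          (torch.randn(40, 1) * 0.1).requires_grad_(),
          torch.zeros(1, requires_grad=True)]
meta_opt = torch.optim.Adam(params, lr=1e-3)
inner_lr = 0.01

for step in range(300):
    meta_opt.zero_grad()
    task = sample_task()
    x_s, y_s = task()   # support set: adapt on it
    x_q, y_q = task()   # query set: evaluate the adaptation
    # Inner loop: one gradient step on the support loss, keeping the graph
    # so the outer update can differentiate through the adaptation.
    loss_s = ((forward(params, x_s) - y_s) ** 2).mean()
    grads = torch.autograd.grad(loss_s, params, create_graph=True)
    adapted = [p - inner_lr * g for p, g in zip(params, grads)]
    # Outer loop: the query loss of the adapted parameters updates the meta-parameters.
    loss_q = ((forward(adapted, x_q) - y_q) ** 2).mean()
    loss_q.backward()
    meta_opt.step()
```

The key design choice is that the thing being learned is not a solution to any single task but an initialization (or, in learned-optimizer work, an update rule) that makes every new task cheap to solve.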

Metric meta-learning

The goal of metric meta-learning is to find a metric space in which learning is particularly efficient. This approach can be thought of as a subset of few-shot meta-learning, in which the learned metric space is used to classify new examples from only a handful of labeled samples. This research paper shows how to apply metric meta-learning to classification tasks.
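
A popular instance of this idea is the prototypical-network style of classifier, sketched below: embed the support examples, average them into per-class prototypes, and classify queries by their distance to the nearest prototype. The embedding network, data shapes, and toy episode here are illustrative assumptions:

```python
# Minimal sketch of metric-based meta-learning in the style of prototypical
# networks: distances in a learned embedding space act as the classifier.
import torch
import torch.nn as nn

embed = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 64))

def prototypical_logits(support_x, support_y, query_x, n_way):
    """Return query logits as negative squared distances to class prototypes."""
    z_support = embed(support_x)                       # (n_way * k_shot, 64)
    z_query = embed(query_x)                           # (n_query, 64)
    prototypes = torch.stack([z_support[support_y == c].mean(0)
                              for c in range(n_way)])  # (n_way, 64)
    dists = torch.cdist(z_query, prototypes) ** 2      # (n_query, n_way)
    return -dists                                      # closer prototype -> higher logit

# Toy 5-way, 1-shot episode with flattened 28x28 "images" of random pixels.
support_x, support_y = torch.randn(5, 784), torch.arange(5)
query_x, query_y = torch.randn(10, 784), torch.randint(0, 5, (10,))
logits = prototypical_logits(support_x, support_y, query_x, n_way=5)
loss = nn.functional.cross_entropy(logits, query_y)
loss.backward()  # training repeats this over many sampled episodes
```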

Recurrent meta-learning models

This type of meta-learning model exploits recurrent neural networks (RNNs) such as long short-term memory (LSTM) networks. In this architecture, the meta-learning algorithm trains the RNN to process a dataset sequentially and then handle new inputs from the task. In an image classification setup, this may involve sequentially feeding in a set of (image, label) pairs from the dataset, followed by new examples to be classified. Meta-reinforcement learning is an example of this approach.
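
The sketch below shows roughly how such an episode can be fed to an LSTM: each input is paired with the previous step's label, so within a single episode the network has to predict a label before it is revealed. The input sizes, the label-offset trick as written here, and the random toy data are assumptions, not a specific published implementation:

```python
# Rough sketch of recurrent meta-learning: an LSTM reads a sequence of
# (input, previous label) pairs and learns, within one episode, to bind
# inputs to labels so it can classify later items in the sequence.
import torch
import torch.nn as nn

n_classes, input_dim, seq_len = 5, 64, 20
lstm = nn.LSTM(input_size=input_dim + n_classes, hidden_size=128, batch_first=True)
head = nn.Linear(128, n_classes)

# Toy episode: random "images" and labels; real setups sample these per task.
x = torch.randn(1, seq_len, input_dim)
y = torch.randint(0, n_classes, (1, seq_len))

# Pair each input with the *previous* step's label (one-hot), so the
# network must output each label before it is revealed.
prev_labels = torch.zeros(1, seq_len, n_classes)
prev_labels[:, 1:] = nn.functional.one_hot(y[:, :-1], n_classes).float()

outputs, _ = lstm(torch.cat([x, prev_labels], dim=-1))
logits = head(outputs)                                  # (1, seq_len, n_classes)
loss = nn.functional.cross_entropy(logits.view(-1, n_classes), y.view(-1))
loss.backward()  # meta-training repeats this over many sampled episodes
```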

The importance of meta-learning is steadily increasing as deep learning moves toward unsupervised models. If we can generalize the ability to learn new tasks, the idea of AGI suddenly becomes more pragmatic. However, just like humans, artificial intelligence models are discovering that learning to learn is much harder than it sounds.
