Types of learning

Different types of learning have emerged as we move closer to artificial general intelligence. Artificial general intelligence (AGI) aims to build computers that are capable of performing the tasks that humans can.

Transfer Learning

Transfer learning refers to reusing the weights learned on one problem to solve a different but related problem. For example, a neural network trained on the ImageNet dataset can serve as a starting point and then be retrained to work on the CUBS dataset. Transfer learning helps speed up the learning process and obtain higher accuracy on the new dataset by starting from the pre-trained model instead of training from scratch.
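
The idea can be sketched with a tiny logistic-regression "network" in NumPy; the source and target tasks, sizes, and learning rates below are all invented for illustration and merely stand in for something like ImageNet pre-training followed by fine-tuning:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(X, y, w=None, epochs=200, lr=0.5):
    """Gradient-descent training of a tiny logistic-regression 'network'.
    If w is given, training starts from those weights (transfer)."""
    if w is None:
        w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = sigmoid(X @ w)
        w = w - lr * X.T @ (p - y) / len(y)
    return w

# Source task (plenty of data), a stand-in for pre-training.
X_src = rng.normal(size=(500, 5))
y_src = (X_src[:, 0] + X_src[:, 1] > 0).astype(float)
w_src = train_logreg(X_src, y_src)

# Related target task with very little data: reuse the source weights
# as the starting point instead of training from scratch.
X_tgt = rng.normal(size=(10, 5))
y_tgt = (X_tgt[:, 0] + X_tgt[:, 1] + 0.5 * X_tgt[:, 2] > 0).astype(float)
w_transfer = train_logreg(X_tgt, y_tgt, w=w_src.copy(), epochs=20)

# Evaluate on held-out target-task data.
X_test = rng.normal(size=(1000, 5))
y_test = (X_test[:, 0] + X_test[:, 1] + 0.5 * X_test[:, 2] > 0).astype(float)
acc_transfer = ((sigmoid(X_test @ w_transfer) > 0.5) == y_test).mean()
print(acc_transfer)
```

Because the source weights already encode a nearby decision boundary, a handful of fine-tuning steps on the tiny target set is enough to reach good accuracy.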

Few-Shot Learning

Few-shot learning refers to learning from a limited number of examples per class. In particular, learning from only one example per class is called one-shot learning.

Meta-learning

Meta-learning is learning how to learn. Most neural networks learn parameters to perform a specific task; meta-learning instead aims to learn the learning process itself, so that when an unseen task is presented the model can perform it properly with little training. Few-shot learning is a good application of the meta-learning concept.
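
A minimal sketch of the idea in NumPy, loosely following the Reptile style of meta-learning: the meta-parameter is repeatedly nudged toward task-adapted parameters across a family of toy 1-D regression tasks. The task family and every constant here are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def inner_sgd(w, a, steps=5, lr=0.05):
    """A few gradient steps on one task: fit y = a*x with a scalar weight w."""
    for _ in range(steps):
        x = rng.normal()
        grad = 2 * (w * x - a * x) * x  # d/dw of (w*x - a*x)^2
        w = w - lr * grad
    return w

# Reptile-style meta-training: after briefly adapting to each sampled
# task, move the meta-parameter a fraction of the way toward the adapted
# value, so that future tasks can be learned from very few steps.
meta_w, meta_lr = 0.0, 0.5
for _ in range(2000):
    a = rng.uniform(0.5, 1.5)      # each task has its own slope a
    adapted = inner_sgd(meta_w, a)
    meta_w += meta_lr * (adapted - meta_w)

print(meta_w)  # typically close to 1.0, the mean slope of the task family
```

The learned initialization sits near the centre of the task family, so a few inner-loop steps suffice to adapt to any new task drawn from it, which is exactly the few-shot setting.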

Incremental Learning

Incremental learning refers to starting with a pre-trained model and adding new tasks while retaining the ability to perform previous tasks. This is in contrast with transfer learning, in which the network loses its ability to perform previous tasks because of catastrophic forgetting.
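
One simple scheme that realizes this, sketched below in NumPy, keeps a frozen shared trunk and gives each new task its own output head, so adding a task cannot disturb earlier ones. The random-projection "feature extractor" and the toy tasks are invented for this example:

```python
import numpy as np

rng = np.random.default_rng(4)

W_frozen = rng.normal(size=(4, 8))

def features(X):
    """Frozen pre-trained trunk shared by all tasks (a fixed random projection)."""
    return np.tanh(X @ W_frozen)

heads = {}  # one small linear head per task, added incrementally

def add_task(name, X, y, epochs=300, lr=0.5):
    """Learn a new task by training only a fresh head; the frozen trunk
    and previously trained heads are untouched, so old tasks are kept."""
    H = features(X)
    w = np.zeros(H.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-H @ w))
        w -= lr * H.T @ (p - y) / len(y)
    heads[name] = w

X1 = rng.normal(size=(300, 4)); y1 = (X1[:, 0] > 0).astype(float)
X2 = rng.normal(size=(300, 4)); y2 = (X2[:, 1] > 0).astype(float)

add_task("task1", X1, y1)
w1_snapshot = heads["task1"].copy()
add_task("task2", X2, y2)  # adding a task leaves task1's head intact

print(np.allclose(w1_snapshot, heads["task1"]))  # True
```

Freezing the trunk is only one possible design; methods such as elastic weight consolidation instead allow the shared weights to move while penalizing changes important to old tasks.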

Multitask Learning

Multitask learning refers to a type of learning in which learning one task can help improve learning on another task. It is similar to transfer learning in this sense; however, the tasks are learned in parallel rather than consecutively. It can also be confused with multilabel classification, in which multiple labels are assigned per image. In multitask learning a single image can likewise carry multiple labels, e.g. a picture with a car, a stop sign and a pedestrian, but each label belongs to a different task.
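
A rough NumPy sketch of the parallel setup: two toy regression tasks trained jointly through a shared linear representation, each with its own head. All data, shapes, and constants are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Two related regression tasks that share the same inputs.
X = rng.normal(size=(200, 5))
y1 = X @ np.array([1.0, 1.0, 0.0, 0.0, 0.0])  # task-1 targets
y2 = X @ np.array([1.0, 0.0, 1.0, 0.0, 0.0])  # task-2 targets

W = rng.normal(scale=0.5, size=(5, 3))  # shared representation
v1 = rng.normal(scale=0.5, size=3)      # task-1 head
v2 = rng.normal(scale=0.5, size=3)      # task-2 head

def losses():
    h = X @ W
    return ((h @ v1 - y1) ** 2).mean(), ((h @ v2 - y2) ** 2).mean()

loss1_init, loss2_init = losses()
lr, n = 0.01, len(X)
for _ in range(500):
    h = X @ W
    e1, e2 = h @ v1 - y1, h @ v2 - y2
    # Both tasks' gradients flow into the shared weights W in parallel.
    W -= lr * 2 * X.T @ (e1[:, None] * v1 + e2[:, None] * v2) / n
    v1 -= lr * 2 * h.T @ e1 / n
    v2 -= lr * 2 * h.T @ e2 / n

loss1_final, loss2_final = losses()
print(round(loss1_final, 3), round(loss2_final, 3))
```

The shared weights W receive gradient signal from both tasks at every step, which is what lets structure useful to one task benefit the other.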

Continual learning

Continual learning (CL) is a type of sequential learning that aims to extend a model's abilities to new tasks as they arrive while retaining its ability to perform previous tasks. This ability is important for continuously changing environments and is an important aspect of AGI. One of the core issues of continual learning is addressing catastrophic forgetting. The term has been used interchangeably with lifelong learning.

CL is a type of online learning in which learning occurs in the presence of a task. There are no set boundaries between tasks, and the number of tasks can continuously grow as time progresses. The system must be able to use its previous experience to generalize and improve its capabilities when new information arrives, but must be resilient to catastrophic forgetting, in which new information is detrimental to the system's performance on previous tasks. CL is also challenged by limited resources such as memory; the system must therefore use its resources wisely, learning what information to retain and what to forget in the process.

One-Shot Learning

Deep neural networks are known to be data-intensive algorithms. Thousands of examples are usually needed to achieve good classification. Few-shot learning is the opposite of that: it aims to learn using only a few examples of each class.

One/few-shot learning refers to rapid learning from one or a few examples. Experiments on few-shot learning are usually cast as N-way K-shot learning, where N is the number of classes and K is the number of examples per class.
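
Episode construction for N-way K-shot evaluation can be sketched as follows; the class names and random "images" are placeholders:

```python
import numpy as np

rng = np.random.default_rng(3)

# A toy labelled pool: class name -> list of "images" (here, feature vectors).
pool = {name: [rng.normal(size=4) for _ in range(20)]
        for name in ["bird", "dog", "cat", "dolphin", "rabbit"]}

def sample_episode(pool, n_way, k_shot, n_query=1):
    """Sample one N-way K-shot episode: a support set with K examples
    for each of N classes, plus query examples to classify."""
    classes = rng.choice(sorted(pool), size=n_way, replace=False)
    support, query = [], []
    for c in classes:
        idx = rng.choice(len(pool[c]), size=k_shot + n_query, replace=False)
        support += [(pool[c][i], c) for i in idx[:k_shot]]
        query += [(pool[c][i], c) for i in idx[k_shot:]]
    return support, query

support, query = sample_episode(pool, n_way=5, k_shot=1)
print(len(support), len(query))  # -> 5 5
```

Each episode is a self-contained miniature classification problem, and accuracy is reported as the average over many such episodes.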

 

[Figure: An example of 5-way 1-shot classification. 5-way stands for 5 classes (birds, dogs, cats, dolphins and rabbits), and there is one example (1-shot) for each class.]

 

 

[Figure: A 3-way 3-shot learning example. There are 3 classes (3-way) in the support set, birds, dogs and cats, with 3 examples (3-shot) each.]

 

The dataset is split such that the classes in the training set are disjoint from the classes in the test set. For example, the training set can include cats, dogs and birds while the test set contains rabbits and dolphins. The number of classes is large while the number of images per class is typically small. Given a set of support images with one/few image(s) per class and a query image, the goal of one/few-shot learning is to identify which support image the query image is most similar to.
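
A minimal sketch of that matching step, using cosine similarity between raw vectors as a stand-in for a learned embedding comparison; the vectors and labels are invented for the example:

```python
import numpy as np

def one_shot_classify(support, query):
    """Assign the query to the label of the most similar support example.
    Cosine similarity stands in for a learned embedding comparison."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    best = max(support, key=lambda pair: cos(pair[0], query))
    return best[1]

# One support "embedding" per class (1-shot); the query is a noisy
# version of the dog embedding, so it should match the dog.
support = [(np.array([1.0, 0.0, 0.0]), "bird"),
           (np.array([0.0, 1.0, 0.0]), "dog"),
           (np.array([0.0, 0.0, 1.0]), "cat")]
query = np.array([0.1, 0.9, 0.05])

print(one_shot_classify(support, query))  # -> dog
```

In real few-shot systems the embeddings come from a network trained across many episodes (as in matching or prototypical networks), and the nearest support example then decides the label exactly as above.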

A dataset commonly used for one/few-shot learning is the Omniglot dataset. Omniglot has a large number of classes and only a few examples for each class (as opposed to the MNIST database of handwritten digits, which has few classes and a large number of examples for each class). Omniglot contains 1623 characters from 50 different alphabets; each character is handwritten by 20 different people. 1200 characters are used for training while the remaining characters are used for testing.

 

Relevant Papers:

1. Human-level concept learning through probabilistic program induction.

Catastrophic forgetting

 

To be able to achieve artificial general intelligence, or human-like intelligence, machines should be able to remember previously learned tasks. In a given scenario, certain tasks may not appear as frequently or as recently as others. Neural networks in particular are known to catastrophically forget information when it is not seen frequently or recently. This happens because a neural network must continually adapt its weights to newer tasks.

For example, suppose we have a neural network trained to recognize birds, cats and dogs (Set 1). We then train it on other classes such as dolphins, rabbits and lions (Set 2). After some time, the neural network can start performing poorly on the classes in Set 1 because the weights have become biased toward recognizing the classes in Set 2. This is known as catastrophic forgetting.
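
The effect can be reproduced with a toy NumPy model, using two deliberately conflicting tasks as stand-ins for Set 1 and Set 2; all data and constants are invented for the demo:

```python
import numpy as np

rng = np.random.default_rng(5)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, w, epochs=300, lr=0.5):
    """Plain gradient descent on a logistic-regression model."""
    for _ in range(epochs):
        p = sigmoid(X @ w)
        w = w - lr * X.T @ (p - y) / len(y)
    return w

def accuracy(w, X, y):
    return ((sigmoid(X @ w) > 0.5) == y).mean()

# Task A and Task B use conflicting rules, standing in for Set 1 / Set 2.
X_a = rng.normal(size=(400, 4)); y_a = (X_a[:, 0] > 0).astype(float)
X_b = rng.normal(size=(400, 4)); y_b = (X_b[:, 0] < 0).astype(float)

w = train(X_a, y_a, np.zeros(4))
acc_a_before = accuracy(w, X_a, y_a)

w = train(X_b, y_b, w)  # sequential training on Task B only
acc_a_after = accuracy(w, X_a, y_a)

print(acc_a_before, acc_a_after)  # Task A accuracy collapses after Task B
```

Because training on Task B sees no Task A data, nothing stops the weights from drifting to values that serve B and ruin A; continual-learning methods add exactly such a constraint.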

 


 

Related Papers:

  1. Overcoming catastrophic forgetting in neural networks
  2. Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem
  3. Catastrophic Forgetting in Connectionist Networks