
No Bad Questions About ML

Definition of Zero-shot learning (ZSL)

What is zero-shot learning?

Zero-shot learning (ZSL) is a machine learning paradigm in which a model classifies objects or handles concepts that it never encountered during training. ZSL is not limited to classification: more generally, it involves generalizing knowledge from seen to unseen instances.

Training AI models to classify objects quickly and accurately usually requires a large amount of time and labeled data. ZSL reduces the need to collect data about every possible object a machine learning model may encounter and instead allows the model to use pre-existing knowledge to determine an unknown object’s class.

How does zero-shot learning work?

Zero-shot learning involves three ingredients: classes the model saw during training (seen classes), classes it must generalize to (unseen classes), and auxiliary information, such as attributes or text descriptions, that helps the model classify unknown objects. This mirrors how humans observe a new object, such as an unfamiliar animal, and use pre-existing knowledge to classify it.

Suppose someone who has never seen a lion visits the zoo and sees one. They will begin to infer what it could be based on what they already know about other animals: cats and dogs, as well as other common animals such as various birds and fish. The process looks like this:

  • Environment. It lives out of water and moves on land, so it is not a fish.
  • Wings. It has none, so it is not a bird.
  • Fur. It has fur, so it could be a kind of cat or dog.

Based on observations of their cat at home, this person begins to notice how similar the lion and cat are. In conclusion, they decide that a lion is similar to a cat, just bigger and more dangerous. 

ZSL works in a similar way. Suppose a machine learning model has been trained to distinguish between cats and dogs and then encounters a bird. ZSL enables the model to compare this new object with what it already knows about cats and dogs. Does it have fur? No. Does it walk on four legs? No. While the model may not be able to classify the object specifically as a bird, it will not identify it as a dog or a cat.
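The attribute questions above can be sketched in a few lines of Python. This is only an illustrative toy, not how any particular ZSL system is implemented: the attribute names, the class signatures, the 0.75 threshold, and the helper names `attribute_match` and `classify_known` are all assumptions made for the example.

```python
# Hypothetical attribute signatures for the two classes the model was trained on.
KNOWN_CLASSES = {
    "cat": {"has_fur": 1, "four_legs": 1, "has_wings": 0, "barks": 0},
    "dog": {"has_fur": 1, "four_legs": 1, "has_wings": 0, "barks": 1},
}


def attribute_match(observed, signature):
    """Return the fraction of attributes on which the two descriptions agree."""
    agree = sum(1 for name, value in signature.items() if observed.get(name) == value)
    return agree / len(signature)


def classify_known(observed, threshold=0.75):
    """Pick the best-matching known class, or "unknown" if nothing matches well.

    The threshold is an arbitrary value chosen for this sketch.
    """
    scores = {label: attribute_match(observed, sig) for label, sig in KNOWN_CLASSES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else "unknown"


# Attributes we assume some upstream detector has extracted from the input.
bird_like = {"has_fur": 0, "four_legs": 0, "has_wings": 1, "barks": 0}
cat_like = {"has_fur": 1, "four_legs": 1, "has_wings": 0, "barks": 0}

print(classify_known(bird_like))  # unknown
print(classify_known(cat_like))   # cat
```

As in the text, the bird-like input is not named as a bird here, because no description of birds is available yet; it is only rejected as neither a cat nor a dog.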

To do this, a zero-shot learning component interacts with two modules:

  • Semantic embedding module uses information from documents, knowledge graphs, and/or image descriptions to represent what is known about the object's possible classes (models such as CLIP and DALL-E are examples).
  • Visual embedding module captures the visual properties of the object.

The advent of the CLIP and DALL-E architectures has allowed models to work with embeddings of free-form text produced by language models. As a result, knowledge is generalized across two modalities: text and images.

The zero-shot component then receives the outputs of these modules and determines the relationship between them. Returning to the cat, dog, and bird example, the zero-shot component will notice that the semantic and visual modules both indicate that the bird is not a dog or a cat, and it will assign the unknown object to whichever class description best matches its observed properties.
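One common way to relate the two modules is to place their outputs in a shared vector space and compare them by similarity. The sketch below assumes that setup. The vectors are hand-written stand-ins for what learned encoders would produce, and the names `SEMANTIC_EMBEDDINGS`, `cosine_similarity`, and `zero_shot_classify` are hypothetical, introduced only for illustration.

```python
import math

# Stand-in semantic embeddings, one per class description.
# "bird" has no training images; it is known only through its description.
SEMANTIC_EMBEDDINGS = {
    "cat": [0.9, 0.8, 0.0, 0.1],
    "dog": [0.9, 0.8, 0.0, 0.9],
    "bird": [0.0, 0.0, 1.0, 0.0],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_classify(visual_embedding, semantic_embeddings=SEMANTIC_EMBEDDINGS):
    """Return the class whose semantic embedding is closest to the visual one."""
    scores = {
        label: cosine_similarity(visual_embedding, vector)
        for label, vector in semantic_embeddings.items()
    }
    return max(scores, key=scores.get), scores


# Stand-in for the visual module's output on a photo of a bird.
bird_image = [0.1, 0.0, 0.9, 0.0]
label, scores = zero_shot_classify(bird_image)
print(label)  # bird
```

Because the class list is just a dictionary of descriptions, a new class can be added at inference time without retraining, which is the property the two-module design is meant to provide.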

How does a zero-shot classifier work in a real-world application?

A zero-shot classifier is useful in many scenarios, for example, in natural language processing (NLP). Zero-shot learning facilitates text generation tasks, such as machine translation or summarization, for languages or topics where training data is scarce. For instance, a zero-shot translation system might translate between a pair of languages it was never trained on together by using what it learned from other language pairs.

Likewise, zero-shot capabilities empower question-answering systems to respond accurately to queries about topics for which they were not specifically trained. This involves leveraging contextual and semantic relationships between what the system knows and the new queries.
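The same idea can be illustrated for text with a toy topic classifier that has no labeled examples and relies only on a short description of each label. This is a deliberately simple sketch: production systems typically compare learned embeddings rather than counting shared words, and the label descriptions and the names `tokenize` and `zero_shot_topic` are assumptions made for the example.

```python
import re

# Hypothetical label descriptions: the only information available about each topic.
LABEL_DESCRIPTIONS = {
    "sports": "game team player score match win league",
    "finance": "bank money market stock price invest loan",
    "weather": "rain sun storm temperature forecast wind cloud",
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def zero_shot_topic(text, label_descriptions=LABEL_DESCRIPTIONS):
    """Return the label whose description shares the most words with the text.

    Returns None when no label description overlaps with the text at all.
    """
    words = tokenize(text)
    scores = {
        label: len(words & tokenize(description))
        for label, description in label_descriptions.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(zero_shot_topic("The storm brought heavy rain and strong wind"))  # weather

# A new label can be introduced at query time just by describing it.
extended = {**LABEL_DESCRIPTIONS, "cooking": "recipe flour oven bake"}
print(zero_shot_topic("The recipe needs flour", extended))  # cooking
```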

Key Takeaways

  • Zero-shot learning (ZSL) is a machine learning paradigm that enables pre-trained models to classify objects and understand concepts they did not see during training.
  • ZSL reduces the amount of time and other resources required to train machine learning models.
  • The principles of ZSL are based on how humans identify previously unknown objects.
  • ZSL can be applied to machine learning models in various scenarios, such as natural language processing (NLP), to enhance text generation, translation, and question answering.
