Virtual presentation / top 25% paper

Not All Tasks Are Born Equal: Understanding Zero-Shot Generalization

Jing Zhou · Zongyu Lin · Yanan Zheng · Jian Li · Zhilin Yang

Keywords: [ Deep Learning and representational learning ] [ zero-shot learning ] [ multi-task learning ] [ transfer learning ]


Abstract:

Recent work has achieved remarkable zero-shot performance with multi-task prompted pretraining, but little is understood about why it works. For the first time, we show that training on a small number of key tasks beats using all the training tasks, while removing these key tasks substantially hurts performance. We also find that these key tasks are mostly question answering (QA) tasks. Together, these novel findings deepen our understanding of zero-shot generalization: training on certain tasks such as QA encodes general knowledge transferable to a wide range of tasks. In addition, to automate this procedure, we devise a method that (1) identifies key training tasks without observing the test tasks by examining the pairwise generalization results and (2) resamples training tasks for a better data distribution. Empirically, our approach achieves improved results across various model scales and tasks.
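The abstract only sketches the two-step procedure, so the following is a minimal illustrative sketch, not the paper's actual algorithm. It assumes a precomputed pairwise generalization matrix `G` (where `G[i, j]` is the zero-shot score on task `j` after training only on task `i`), and the helper names, the average-transfer ranking, and the temperature/boost resampling scheme are all hypothetical choices for illustration.

```python
import numpy as np

def select_key_tasks(G: np.ndarray, task_names: list[str], top_k: int = 5) -> list[str]:
    """Rank training tasks by their average transfer to all other tasks
    (ignoring self-transfer) and return the top_k as 'key' tasks."""
    n = G.shape[0]
    mask = ~np.eye(n, dtype=bool)                    # exclude the diagonal (self-transfer)
    avg_transfer = (G * mask).sum(axis=1) / (n - 1)  # mean score each task induces on the others
    ranked = np.argsort(avg_transfer)[::-1]
    return [task_names[i] for i in ranked[:top_k]]

def resample_weights(sizes: dict[str, int], key_tasks: list[str],
                     key_boost: float = 2.0, temperature: float = 0.5) -> dict[str, float]:
    """Temperature-scaled sampling weights that up-weight the key tasks,
    so that very large datasets do not dominate the multi-task mixture."""
    raw = {t: (n ** temperature) * (key_boost if t in key_tasks else 1.0)
           for t, n in sizes.items()}
    total = sum(raw.values())
    return {t: w / total for t, w in raw.items()}

# Toy usage with made-up task names, scores, and dataset sizes.
tasks = ["qa_squad", "summarization", "sentiment", "nli"]
G = np.array([[1.00, 0.62, 0.58, 0.66],
              [0.41, 1.00, 0.39, 0.44],
              [0.35, 0.33, 1.00, 0.40],
              [0.52, 0.45, 0.48, 1.00]])
key = select_key_tasks(G, tasks, top_k=2)
weights = resample_weights({"qa_squad": 90_000, "summarization": 300_000,
                            "sentiment": 60_000, "nli": 400_000}, key)
print(key, weights)
```

In this toy setup the QA task ranks highest because it transfers best on average, mirroring the paper's finding that key tasks are mostly QA; the exact selection and resampling criteria used by the authors may differ.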
