Citation: Bai GR, Liu QB, He SZ et al. Unsupervised domain adaptation on sentence matching through self-supervision. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 38(6): 1237−1249, Nov. 2023. DOI: 10.1007/s11390-022-1479-0.
Background
With the development of deep learning, neural network models have achieved strong results on the sentence matching task. Deep learning models are data-driven: although they can be trained on the labeled data available in one domain and perform well there, their performance drops sharply when they face a new domain, owing to the gap between the source and target domains. Moreover, a new domain usually lacks large amounts of ready-made labeled data, so how to achieve domain adaptation using labeled source-domain data and unlabeled target-domain data is an unavoidable problem. In previous research, adversarial domain adaptation has been a classic solution to unsupervised domain adaptation, but such adversarial training is often hard to converge in practice and does not specifically account for the characteristics of the sentence matching task. Unsupervised domain adaptation for sentence matching is therefore a valuable challenge.
Objective
Our goal is to find a method that exploits the characteristics of the sentence matching task to achieve domain transfer on sentence matching, and that is easier to optimize and train than previous methods.
Method
We propose self-supervision-based domain adaptation. We design four auxiliary tasks, including tasks tailored to the characteristics of sentence matching, to align the two domains without supervision and to alleviate the performance drop that deep learning models suffer in a new domain.
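As a rough illustration of this training scheme (a minimal sketch, not the authors' exact implementation), the code below jointly optimizes a supervised matching loss on labeled source pairs and one self-supervised auxiliary loss computed on unlabeled pairs from both domains, so the shared encoder is pushed toward domain-aligned representations. All names (`SharedEncoder`, `aux_head`, the `aux_weight` value) and the toy bag-of-words encoder are assumptions; the paper combines four auxiliary tasks, while the sketch shows only one.

```python
# Minimal sketch of self-supervision-based domain adaptation (assumed details).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedEncoder(nn.Module):
    """Toy sentence encoder; a real system would use a pre-trained LM."""
    def __init__(self, vocab_size=10000, dim=128):
        super().__init__()
        self.emb = nn.EmbeddingBag(vocab_size, dim)  # mean-pooled bag of words

    def forward(self, ids):          # ids: (batch, seq_len) token indices
        return self.emb(ids)         # (batch, dim)

encoder = SharedEncoder()
match_head = nn.Linear(2 * 128, 2)   # main task: match / no-match
aux_head = nn.Linear(2 * 128, 2)     # one auxiliary task, e.g. sentence adjacency
opt = torch.optim.Adam([*encoder.parameters(),
                        *match_head.parameters(),
                        *aux_head.parameters()], lr=1e-3)

def pair_repr(a_ids, b_ids):
    """Concatenate the encodings of the two sentences in a pair."""
    return torch.cat([encoder(a_ids), encoder(b_ids)], dim=-1)

def train_step(src_a, src_b, src_y,          # labeled source pairs
               src_ua, src_ub, src_aux_y,    # self-supervised pairs, source
               tgt_ua, tgt_ub, tgt_aux_y,    # self-supervised pairs, target
               aux_weight=0.1):              # assumed loss weight
    # 1. Supervised matching loss: source domain only.
    sup_loss = F.cross_entropy(match_head(pair_repr(src_a, src_b)), src_y)
    # 2. The SAME auxiliary task on both domains creates alignment pressure,
    #    since one shared aux_head must work on source and target alike.
    aux_loss = (F.cross_entropy(aux_head(pair_repr(src_ua, src_ub)), src_aux_y) +
                F.cross_entropy(aux_head(pair_repr(tgt_ua, tgt_ub)), tgt_aux_y))
    loss = sup_loss + aux_weight * aux_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

In a real run, the auxiliary labels (such as whether two sentences are adjacent in a document) would be constructed automatically from raw text in both domains, which is what makes the alignment signal unsupervised.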
Results
We conduct experiments on six datasets, where our method outperforms previous methods by 6.3% on average, demonstrating its effectiveness. In addition, we experimentally explore how self-supervised tasks should be used to improve performance. We find that domain-relevant self-supervised tasks are the most useful, that self-supervised tasks which drive the two domains apart are harmful, and that using more self-supervised tasks is better.
Conclusions
Deep learning based sentence matching models inevitably suffer a performance drop when facing a new domain. For unsupervised domain adaptation, we propose a self-supervision-based method that is easier to optimize and train and that is also empirically effective. Furthermore, we find that when using self-supervised tasks, domain-relevant tasks are the most useful, tasks that drive the two domains apart are harmful, and using more self-supervised tasks is better.
Abstract:
Although neural approaches have yielded state-of-the-art results on the sentence matching task, their performance inevitably drops dramatically when they are applied to unseen domains. To tackle this cross-domain challenge, we address unsupervised domain adaptation on sentence matching, where the goal is to achieve good performance on a target domain given only unlabeled target-domain data and labeled source-domain data. Specifically, we propose to perform self-supervised tasks to achieve this. Unlike previous unsupervised domain adaptation methods, self-supervision can not only be specially designed to suit the characteristics of sentence matching, but is also much easier to optimize. During training, each self-supervised task is performed on both domains simultaneously in an easy-to-hard curriculum, which gradually brings the two domains closer together along the direction relevant to the task. As a result, the classifier trained on the source domain is able to generalize to the unlabeled target domain. In total, we present three types of self-supervised tasks, and the results demonstrate their superiority. In addition, we further study the performance of different usages of self-supervised tasks, which may inspire how to effectively utilize self-supervision in cross-domain scenarios.
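The easy-to-hard curriculum mentioned in the abstract can be sketched as follows. The schedule below is an assumption for illustration, not the paper's exact procedure: self-supervised examples from both domains are ranked by a difficulty score, and the training pool grows in stages so that alignment starts from easy examples and gradually includes harder ones.

```python
# Hedged sketch of an easy-to-hard curriculum (assumed schedule).
def curriculum_batches(examples, difficulty, num_stages=4, batch_size=32):
    """examples: self-supervised instances pooled from both domains.
    difficulty: callable mapping an example to a float score
    (e.g., the model's current loss on it); lower means easier."""
    ranked = sorted(examples, key=difficulty)               # easiest first
    for stage in range(1, num_stages + 1):
        pool = ranked[: len(ranked) * stage // num_stages]  # grow the pool
        for i in range(0, len(pool), batch_size):
            yield pool[i : i + batch_size]
```

Under this scheme, early batches come only from the easiest fraction of the data, and by the final stage every example participates, which is one common way to realize an easy-to-hard curriculum.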