Hallucinations in Neural Automatic Speech Recognition: Identifying Errors and Hallucinatory Models

2024-01-03 06:56:56 Rita Frieske, Bertram E. Shi
Abstract

Hallucinations are a type of output error produced by deep neural networks. While hallucinations have been studied in natural language processing, they have not previously been researched in automatic speech recognition (ASR). Here, we define hallucinations in ASR as transcriptions generated by a model that are semantically unrelated to the source utterance, yet still fluent and coherent. Because hallucinations resemble probable natural language outputs of the model, they create a danger of deception and undermine the credibility of the system. We show that commonly used metrics, such as word error rate, cannot differentiate between hallucinatory and non-hallucinatory models. To address this, we propose a perturbation-based method for assessing the susceptibility of an ASR model to hallucination at test time, which does not require access to the training dataset. We demonstrate that this method helps to distinguish between hallucinatory and non-hallucinatory models that have similar baseline word error rates. We further explore the relationship between the types of ASR errors and the types of dataset noise to determine which types of noise are most likely to create hallucinatory outputs. We devise a framework for identifying hallucinations by analysing their semantic connection with the ground truth and their fluency. Finally, we show how to induce hallucinations by injecting random noise into the utterance.
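
As a rough illustration of the ideas in the abstract, the Python sketch below computes a word error rate and measures how it shifts when noise is added to an utterance. The abstract does not specify the perturbation or the scoring, so this is only one possible reading and not the authors' method. The names `word_error_rate`, `inject_noise`, and `wer_shift_under_noise`, the choice of additive Gaussian noise, and the `transcribe` callable (a stand-in for any ASR model) are all assumptions made for this example.

```python
import random
from typing import Callable, List, Sequence


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def inject_noise(samples: Sequence[float], noise_std: float,
                 seed: int = 0) -> List[float]:
    """Add zero-mean Gaussian noise to each sample (assumed perturbation)."""
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, noise_std) for s in samples]


def wer_shift_under_noise(transcribe: Callable[[Sequence[float]], str],
                          samples: Sequence[float], reference: str,
                          noise_std: float, seed: int = 0) -> float:
    """WER on the perturbed utterance minus WER on the clean utterance.

    `transcribe` is a hypothetical callable that maps audio samples to text.
    """
    clean = word_error_rate(reference, transcribe(samples))
    noisy = word_error_rate(
        reference, transcribe(inject_noise(samples, noise_std, seed)))
    return noisy - clean


# Toy comparison: a garbled phonetic output and a fluent, unrelated output.
reference = "turn off the kitchen lights"
phonetic_errors = "tern of thee kitchin lites"
fluent_unrelated = "please call my mother tomorrow"
wer_phonetic = word_error_rate(reference, phonetic_errors)
wer_fluent = word_error_rate(reference, fluent_unrelated)
```

In this toy comparison both hypotheses substitute all five reference words, so `wer_phonetic` and `wer_fluent` are both 1.0, even though only the second would count as a hallucination under the definition above. Separating the two needs a signal beyond word error rate, such as the semantic relatedness and fluency that the abstract mentions.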

https://arxiv.org/abs/2401.01572

https://arxiv.org/pdf/2401.01572.pdf