Can Contextual Biasing Remain Effective with Whisper and GPT-2?

Sun, Guangzhi; Zheng, Xianrui; Zhang, Chao; Woodland, Philip C.

arXiv:2306.01942 (cs)

[Submitted on 2 Jun 2023]

Abstract: End-to-end automatic speech recognition (ASR) and large language models, such as Whisper and GPT-2, have recently been scaled to use vast amounts of training data. Despite the large amount of training data, infrequent content words that occur in a particular task may still exhibit poor ASR performance, with contextual biasing a possible remedy. This paper investigates the effectiveness of neural contextual biasing for Whisper combined with GPT-2. Specifically, this paper proposes integrating an adapted tree-constrained pointer generator (TCPGen) component for Whisper and a dedicated training scheme to dynamically adjust the final output without modifying any Whisper model parameters. Experiments across three datasets show a considerable reduction in errors on biasing words with a biasing list of 1000 words. Contextual biasing was more effective when applied to domain-specific data and can boost the performance of Whisper and GPT-2 without losing their generality.
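
The abstract describes biasing as an adjustment to the final output distribution that leaves the Whisper parameters untouched. The following is a minimal illustrative sketch of that general idea, not the paper's implementation: a prefix tree built from a biasing list restricts which subword tokens may be boosted at each step, and a pointer distribution over those tokens is interpolated with the frozen model's distribution. The function names, the `END` marker, the toy tokens, the pointer scores, and the fixed `p_gen` value are all assumptions made for illustration; in a trained system the scores and the interpolation weight would be predicted by the biasing component.

```python
import math
from typing import Dict, Iterable, List, Set

END = "<end>"  # hypothetical marker for the end of a complete biasing word


def build_prefix_tree(words: Iterable[List[str]]) -> dict:
    """Build a prefix tree (trie) from biasing words given as subword token lists."""
    root: dict = {}
    for tokens in words:
        node = root
        for tok in tokens:
            node = node.setdefault(tok, {})
        node[END] = {}
    return root


def valid_next_tokens(tree: dict, prefix: List[str]) -> Set[str]:
    """Return the tokens that can extend `prefix` along some path in the tree."""
    node = tree
    for tok in prefix:
        if tok not in node:
            return set()  # the prefix has left the biasing list
        node = node[tok]
    return {tok for tok in node if tok != END}


def biased_distribution(
    model_probs: Dict[str, float],
    pointer_scores: Dict[str, float],
    valid: Set[str],
    p_gen: float,
) -> Dict[str, float]:
    """Interpolate the frozen model distribution with a tree-constrained pointer distribution."""
    if not valid:
        return dict(model_probs)  # nothing to bias at this step
    # Softmax over the valid tokens only (assumed scores; missing scores default to 0).
    top = max(pointer_scores.get(t, 0.0) for t in valid)
    exps = {t: math.exp(pointer_scores.get(t, 0.0) - top) for t in valid}
    total = sum(exps.values())
    # Scale the model distribution down, then add the pointer mass on valid tokens.
    out = {t: (1.0 - p_gen) * p for t, p in model_probs.items()}
    for t in valid:
        out[t] = out.get(t, 0.0) + p_gen * exps[t] / total
    return out


# Toy usage with made-up tokens and probabilities.
tree = build_prefix_tree([["wood", "land"], ["wood", "s"], ["zh", "ang"]])
valid = valid_next_tokens(tree, ["wood"])
model_probs = {"land": 0.1, "s": 0.2, "en": 0.7}
probs = biased_distribution(model_probs, {"land": 2.0, "s": 0.0}, valid, p_gen=0.5)
```

In this toy setup only the tokens reachable from the current tree node receive extra probability mass, and the result still sums to one whenever `model_probs` does, so the base model's output is rescaled rather than replaced.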

Comments:	To appear in Interspeech 2023
Subjects:	Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2306.01942 [cs.CL]
	(or arXiv:2306.01942v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2306.01942

Submission history

From: Guangzhi Sun
[v1] Fri, 2 Jun 2023 22:56:01 UTC (617 KB)
