Published July 20, 2015 | Version v1
Journal article Open

Speech Analysis and Synthesis with a Computationally Efficient Adaptive Harmonic Model

  • 1. Institute of Computer Science, Foundation for Research and Technology-Hellas (FORTH-ICS)

Description

Harmonic models must be both precise and fast in order to represent the speech signal adequately and to process large amounts of data in a reasonable amount of time. For these purposes, the full-band adaptive Harmonic Model (aHM) used by the Adaptive Iterative Refinement (AIR) algorithm has been proposed in order to accurately model the perceived characteristics of a speech signal. Even though aHM-AIR is precise, it lacks the computational efficiency that would make its use convenient for large databases. The Least Squares (LS) solution used in the original aHM-AIR accounts for most of the computational load. In a previous paper, we suggested a Peak Picking (PP) approach as a substitute for the LS solution. In order to integrate the adaptivity scheme of aHM into the PP approach, an adaptive Discrete Fourier Transform (aDFT), whose frequency basis can fully follow the variations of the f0 curve, was also proposed. In this article, we complete the previous publication by evaluating the above methods for the whole analysis process of a speech signal. Evaluations have shown that using Peak Picking and aDFT reduces computation time by a factor of four on average compared to the LS solution. Additionally, based on formal listening tests, the quality of the re-synthesis is preserved when using Peak Picking and aDFT compared to the original LS-based approach.

Files

bare_jrnl.pdf

Files (545.3 kB)

bare_jrnl.pdf (545.3 kB)
md5:102b8fcfc5e31676eaf09c938384024d

Additional details

Funding

LISTEN – Hands-free Voice-enabled Interface to Web Applications for Smart Home Environments 644283
European Commission