Content-based encrypted speech retrieval scheme with deep hashing

Zhang, Qiu-yu; Zhao, Xue-jiao; Zhang, Qi-wen; Li, Yu-zhou

doi:10.1007/s11042-022-12123-8

Published: 14 February 2022

Volume 81, pages 10221–10242, (2022)

Multimedia Tools and Applications

Qiu-yu Zhang¹ (ORCID: orcid.org/0000-0003-1488-388X), Xue-jiao Zhao¹, Qi-wen Zhang¹ & Yu-zhou Li¹

Abstract

To overcome the reliance on manual features and the weak feature semantics in the feature extraction process of existing content-based encrypted speech retrieval methods, and to improve retrieval accuracy and efficiency, a content-based encrypted speech retrieval scheme with deep hashing is proposed. First, the original speech files are encrypted with Henon mapping chaotic encryption to construct the encrypted speech library. Second, a secondary feature extraction method is used to extract the spectrogram feature, and the spectrogram is used as the input of the designed convolutional neural network (CNN) for model training and deep hashing feature learning; the resulting deep hash binary code of the original speech is uploaded to the deep hash index table in the cloud. In addition, batch normalization (BN) is introduced to improve the robustness and generalization ability of the model. Finally, a one-to-one mapping is established between the encrypted speech in the encrypted speech library and the hash sequences in the deep hash index table. When a user submits a speech query, the normalized Hamming distance is used for retrieval matching. The experimental results show that the deep hash binary code constructed by the proposed method is strongly discriminative and robust, and that the scheme retains a high recall rate, precision rate and retrieval efficiency under various common content-preserving operations.
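The encryption and matching steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the cipher construction, the hash length, or the matching threshold, so the XOR keystream design, the function names (`henon_keystream`, `henon_xor`, `normalized_hamming`, `retrieve`), the Henon parameters a = 1.4 and b = 0.3, the initial state (0, 0), the burn-in length, and the threshold 0.25 are all assumptions chosen for illustration. The CNN that produces the deep hash codes is omitted; the codes are taken as given bit vectors.

```python
import numpy as np


def henon_keystream(n, x0=0.0, y0=0.0, a=1.4, b=0.3, burn_in=1000):
    """Return n pseudo-random bytes derived from the Henon map.

    Assumed construction for illustration only: iterate
    x' = 1 - a*x^2 + y, y' = b*x and quantize x to one byte per step.
    The initial state (x0, y0) plays the role of the secret key.
    """
    x, y = x0, y0
    out = np.empty(n, dtype=np.uint8)
    for i in range(burn_in + n):
        x, y = 1.0 - a * x * x + y, b * x
        if i >= burn_in:  # discard the transient before using values
            out[i - burn_in] = int(abs(x) * 1e6) % 256
    return out


def henon_xor(data: bytes, x0=0.0, y0=0.0) -> bytes:
    """XOR the data with the keystream; applying it twice restores the input."""
    buf = np.frombuffer(data, dtype=np.uint8)
    return (buf ^ henon_keystream(buf.size, x0, y0)).tobytes()


def normalized_hamming(h1, h2) -> float:
    """Fraction of bit positions in which two equal-length binary codes differ."""
    h1 = np.asarray(h1, dtype=np.uint8)
    h2 = np.asarray(h2, dtype=np.uint8)
    if h1.shape != h2.shape:
        raise ValueError("hash codes must have the same length")
    return np.count_nonzero(h1 != h2) / h1.size


def retrieve(query_hash, index_table, threshold=0.25):
    """Return (speech_id, distance) of the closest code in the index table.

    index_table maps an encrypted-speech identifier to its hash code,
    mirroring the one-to-one mapping described in the abstract.
    speech_id is None when no code lies within the (assumed) threshold.
    """
    best_id, best_d = None, 1.0
    for speech_id, code in index_table.items():
        d = normalized_hamming(query_hash, code)
        if d < best_d:
            best_id, best_d = speech_id, d
    return (best_id, best_d) if best_d <= threshold else (None, best_d)
```

In this sketch the cloud side would hold only the ciphertext produced by `henon_xor` and the hash codes in `index_table`; a query is answered by comparing hash codes alone, and the matching ciphertext is decrypted by the user with the same initial state.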

Acknowledgments

This work is supported by the National Natural Science Foundation of China (No. 61862041, 61363078). The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.

Author information

Authors and Affiliations

School of Computer and Communication, Lanzhou University of Technology, Lanzhou, 730050, China
Qiu-yu Zhang, Xue-jiao Zhao, Qi-wen Zhang & Yu-zhou Li

Corresponding author

Correspondence to Qiu-yu Zhang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, Qy., Zhao, Xj., Zhang, Qw. et al. Content-based encrypted speech retrieval scheme with deep hashing. Multimed Tools Appl 81, 10221–10242 (2022). https://doi.org/10.1007/s11042-022-12123-8

Received: 27 April 2021
Revised: 12 August 2021
Accepted: 03 January 2022
Published: 14 February 2022
Issue Date: March 2022
DOI: https://doi.org/10.1007/s11042-022-12123-8
