Abstract
Tampering localization in document images plays an important role in forensics and security. Although the task has seen great progress in recent years, it is far from solved. In this work, we aim to improve tampering localization performance by refining both the input and output sides of the localization model. On the input side, we propose a multi-view enhancement (MVE) module that combines the RGB image, noise residual, and texture information to capture more forensic traces for tampering localization. On the output side, we propose progressive supervision (PS) and detection assistance (DA) modules to provide more detailed supervision. Under progressive supervision, we compute a binary cross-entropy (BCE) loss at each scale to fully exploit multi-scale features, which are vital for tampering localization. In the DA module, we make use of a tampering detection model by adopting a Kullback-Leibler (KL) loss to align the tampering localization and detection scores, which benefits the estimation of the global tampered probability. We evaluate the proposed method on the benchmark dataset DocTamper, and the results demonstrate its effectiveness.
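The two output-side losses named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract gives no formulas, so the max-pooling of the ground-truth mask, the use of the global maximum of the localization map as its image-level score, the direction of the KL term, and all function names below are assumptions made only for illustration.

```python
import math

EPS = 1e-7  # clamp probabilities away from 0 and 1 before taking logs


def _clamp(p):
    return min(max(p, EPS), 1.0 - EPS)


def bce(pred, target):
    """Mean binary cross-entropy between two equally sized 2D maps."""
    total, count = 0.0, 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p = _clamp(p)
            total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
            count += 1
    return total / count


def downsample_mask(mask, factor):
    """Max-pool a binary mask (assumption: a cell is tampered if any pixel in it is)."""
    h, w = len(mask), len(mask[0])
    return [
        [
            max(
                mask[y][x]
                for y in range(i, min(i + factor, h))
                for x in range(j, min(j + factor, w))
            )
            for j in range(0, w, factor)
        ]
        for i in range(0, h, factor)
    ]


def progressive_bce(preds_by_factor, mask):
    """Sum the BCE over scales; keys are downsampling factors, values are predicted maps."""
    return sum(
        bce(pred, downsample_mask(mask, factor))
        for factor, pred in preds_by_factor.items()
    )


def bernoulli_kl(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    p, q = _clamp(p), _clamp(q)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def detection_assist_loss(loc_map, det_score):
    """Align an image-level score from the localization map with a detection score.

    Assumptions: the localization score is the global maximum of the map, and
    the detection score is treated as the reference distribution.
    """
    loc_score = max(max(row) for row in loc_map)
    return bernoulli_kl(det_score, loc_score)
```

In a training loop, the two terms would typically be added with some weighting; the abstract does not state the weights, so none are suggested here.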
Acknowledgements
This research was funded by the National Natural Science Foundation of China (NSFC) under grant no. 62276258, the Jiangsu Science and Technology Programme under grant no. BE2020006-4, the European Union’s Horizon 2020 research and innovation programme under grant no. 956123, and the UK EPSRC under project EP/T026995/1.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Shao, H., Huang, K., Wang, W., Huang, X., Wang, Q. (2024). Progressive Supervision for Tampering Localization in Document Images. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Communications in Computer and Information Science, vol 1969. Springer, Singapore. https://doi.org/10.1007/978-981-99-8184-7_11
DOI: https://doi.org/10.1007/978-981-99-8184-7_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8183-0
Online ISBN: 978-981-99-8184-7
eBook Packages: Computer Science, Computer Science (R0)