Incomplete Cigarette Code Recognition via Unified SPA Features and Graph Space Constraints

Ding, Huiming; Xie, Zhifeng; Lai, Jundong; Xu, Yanmin; Ma, Lizhuang

doi:10.1007/978-3-031-20500-2_5

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13605)

Included in the following conference series: CAAI International Conference on Artificial Intelligence

Abstract

A cigarette code is a 32-character string printed on a cigarette package that tobacco administrations use to determine whether distribution is legal. Recognition of incomplete cigarette codes often suffers from reduced accuracy and loss of semantic context because of complex backgrounds and damaged characters. This paper proposes an end-to-end recognition network for incomplete cigarette codes that improves recognition accuracy and estimates character landmarks. The proposed network first extracts multi-scale features using a feature pyramid network (FPN), then applies a spatial attention (SPA) mechanism to yield unified SPA features and integrates them into instance segmentation, which strengthens spatial representation and improves recognition accuracy. A graph convolutional network (GCN) is introduced to construct graph space constraints, compute the spatial correlations between characters, and accurately estimate the landmarks of missing characters. Finally, we employ the Hungarian algorithm to align the recognized characters with the estimated landmarks and fill missing characters with ‘*’, which preserves the complete semantic context and produces the final regularized cigarette code. The experimental results demonstrate that the proposed network reduces time consumption and improves recognition accuracy, surpassing state-of-the-art methods.
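
The final alignment step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the matching cost, so Euclidean distance between each recognized character's center and each estimated landmark is an assumption here, and the names `hungarian` and `regularize_code` are hypothetical. The sketch also assumes there are no more recognized characters than landmarks.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def hungarian(cost: Sequence[Sequence[float]]) -> List[int]:
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Returns col_of_row, where col_of_row[i] is the column matched to row i.
    """
    n, m = len(cost), len(cost[0])
    if n > m:
        raise ValueError("expected at most as many rows as columns")
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-indexed)
    v = [0.0] * (m + 1)   # column potentials (1-indexed)
    p = [0] * (m + 1)     # p[j] = row currently matched to column j, 0 = free
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if used[j]:
                    continue
                cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                if cur < minv[j]:
                    minv[j], way[j] = cur, j0
                if minv[j] < delta:
                    delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:      # reached a free column: augment
                break
        while j0 != 0:          # flip matches back along the path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    col_of_row = [-1] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            col_of_row[p[j] - 1] = j - 1
    return col_of_row


def regularize_code(
    detections: Sequence[Tuple[str, Point]],
    landmarks: Sequence[Point],
    fill: str = "*",
) -> str:
    """Place each recognized character at its matched landmark slot.

    Slots that receive no character keep the fill symbol.
    """
    if not detections:
        return fill * len(landmarks)
    # Assumed cost: Euclidean distance from character center to landmark.
    cost = [
        [math.hypot(cx - lx, cy - ly) for (lx, ly) in landmarks]
        for (_, (cx, cy)) in detections
    ]
    slots = hungarian(cost)
    out = [fill] * len(landmarks)
    for (char, _), slot in zip(detections, slots):
        out[slot] = char
    return "".join(out)
```

For example, with six landmarks spaced 10 units apart on a line and four recognized characters A, C, D and F lying close to the first, third, fourth and sixth landmarks, `regularize_code` returns `A*CD*F`: the two unmatched slots keep the fill symbol, so the string length and character positions are preserved.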

Acknowledgments

This work was supported by the Shanghai Natural Science Foundation of China under Grant No. 19ZR1419100.

Author information

Authors and Affiliations

Shanghai University, Shanghai, China
Huiming Ding & Zhifeng Xie
Shanghai Tobacco Group Co., Ltd., Shanghai, China
Jundong Lai
Shanghai Tobacco Monopoly Administration, Shanghai, China
Yanmin Xu
Shanghai Jiao Tong University, Shanghai, China
Lizhuang Ma

Corresponding author

Correspondence to Zhifeng Xie.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Lu Fang
Xiaomi Inc., Beijing, China
Daniel Povey
Shanghai Jiao Tong University, Shanghai, China
Guangtao Zhai
JD Explore Academy, Beijing, China
Tao Mei
Chinese Academy of Sciences, Beijing, China
Ruiping Wang

About this paper

Cite this paper

Ding, H., Xie, Z., Lai, J., Xu, Y., Ma, L. (2022). Incomplete Cigarette Code Recognition via Unified SPA Features and Graph Space Constraints. In: Fang, L., Povey, D., Zhai, G., Mei, T., Wang, R. (eds) Artificial Intelligence. CICAI 2022. Lecture Notes in Computer Science, vol 13605. Springer, Cham. https://doi.org/10.1007/978-3-031-20500-2_5

DOI: https://doi.org/10.1007/978-3-031-20500-2_5
Published: 01 January 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20499-9
Online ISBN: 978-3-031-20500-2
eBook Packages: Computer Science, Computer Science (R0)
